July 30, 2019

2808 words 14 mins read

Paper Group AWR 58


Neural Networks for Beginners. A fast implementation in Matlab, Torch, TensorFlow

Title Neural Networks for Beginners. A fast implementation in Matlab, Torch, TensorFlow
Authors Francesco Giannini, Vincenzo Laveglia, Alessandro Rossi, Dario Zanca, Andrea Zugarini
Abstract This report provides an introduction to some Machine Learning tools within the most common development environments. It mainly focuses on practical problems, skipping any theoretical introduction. It is oriented to both students trying to approach Machine Learning and experts looking for new frameworks.
Tasks
Published 2017-03-10
URL http://arxiv.org/abs/1703.05298v2
PDF http://arxiv.org/pdf/1703.05298v2.pdf
PWC https://paperswithcode.com/paper/neural-networks-for-beginners-a-fast
Repo https://github.com/AILabUSiena/NeuralNetworksForBeginners
Framework tf
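
The report is a practical quick-start, so here is a comparably minimal feed-forward network in modern TensorFlow/Keras. This is a sketch in the report's spirit, not its code (the original targets older Matlab, Torch, and TensorFlow APIs), and the parity toy task, layer sizes, and epoch count are illustrative choices.

```python
# A minimal "first neural network": a small MLP learning 4-bit parity.
import itertools
import numpy as np
import tensorflow as tf

# Toy data: all 4-bit inputs, labeled by their parity.
X = np.array(list(itertools.product([0, 1], repeat=4)), dtype=np.float32)
y = X.sum(axis=1) % 2

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(16, activation="tanh"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=2000, verbose=0)
print(model.evaluate(X, y, verbose=0))  # accuracy should approach 1.0
```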

Machine Learning by Unitary Tensor Network of Hierarchical Tree Structure

Title Machine Learning by Unitary Tensor Network of Hierarchical Tree Structure
Authors Ding Liu, Shi-Ju Ran, Peter Wittek, Cheng Peng, Raul Blázquez García, Gang Su, Maciej Lewenstein
Abstract The resemblance between the methods used in quantum many-body physics and in machine learning has drawn considerable attention. In particular, tensor networks (TNs) and deep learning architectures bear striking similarities, to the extent that TNs can be used for machine learning. Previous results used one-dimensional TNs in image recognition, showing limited scalability and flexibility. In this work, we train two-dimensional hierarchical TNs to solve image recognition problems, using a training algorithm derived from the multi-scale entanglement renormalization ansatz. This approach introduces mathematical connections among quantum many-body physics, quantum information theory, and machine learning. While keeping the TN unitary in the training phase, TN states are defined, which encode classes of images into quantum many-body states. We study the quantum features of the TN states, including quantum entanglement and fidelity. We find that these quantities could serve as properties that characterize the image classes, as well as the machine learning tasks.
Tasks Tensor Networks
Published 2017-10-13
URL http://arxiv.org/abs/1710.04833v4
PDF http://arxiv.org/pdf/1710.04833v4.pdf
PWC https://paperswithcode.com/paper/machine-learning-by-two-dimensional
Repo https://github.com/RaulBz/Master_Thesis
Framework none
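
To make the data flow concrete, below is a toy contraction of a two-layer binary-tree tensor network acting as a classifier. The feature map, bond dimensions, and random (untrained) tensors are illustrative assumptions; the paper's actual contribution, the MERA-derived training that keeps the tensors unitary, is not reproduced here.

```python
import numpy as np

def feature_map(x):
    # Map each pixel value in [0, 1] to a 2-dim local "quantum" feature vector.
    return np.array([np.cos(np.pi * x / 2), np.sin(np.pi * x / 2)])

pixels = np.array([0.1, 0.8, 0.4, 0.6])       # a toy "4-pixel image"
leaves = [feature_map(p) for p in pixels]     # four 2-dim vectors

rng = np.random.default_rng(0)
T1 = rng.standard_normal((2, 2, 4))           # first-layer tensors:
T2 = rng.standard_normal((2, 2, 4))           # (left, right) -> bond dim 4
top = rng.standard_normal((4, 4, 3))          # top tensor -> 3 class scores

# Contract the tree bottom-up: pairs of leaves feed each first-layer tensor,
# whose outputs feed the top tensor.
v1 = np.einsum("i,j,ijk->k", leaves[0], leaves[1], T1)
v2 = np.einsum("i,j,ijk->k", leaves[2], leaves[3], T2)
scores = np.einsum("a,b,abc->c", v1, v2, top)
print(scores)  # unnormalized scores for the 3 classes
```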

Revisiting Unsupervised Learning for Defect Prediction

Title Revisiting Unsupervised Learning for Defect Prediction
Authors Wei Fu, Tim Menzies
Abstract Collecting quality data from software projects can be time-consuming and expensive. Hence, some researchers explore “unsupervised” approaches to quality prediction that do not require labelled data. An alternative technique is to use “supervised” approaches that learn models from project data labelled with, say, “defective” or “not-defective”. Most researchers use these supervised models since, it is argued, they can exploit more knowledge of the projects. At FSE’16, Yang et al. reported startling results in which unsupervised defect predictors outperformed supervised predictors for effort-aware just-in-time defect prediction. If confirmed, these results would lead to a dramatic simplification of a seemingly complex task (data mining) that is widely explored in the software engineering literature. This paper repeats and refutes those results as follows. (1) There is much variability in the efficacy of the Yang et al. predictors, so even with their approach, some supervised data is required to prune weaker predictors away. (2) Their findings were grouped across $N$ projects. When we repeat their analysis on a project-by-project basis, supervised predictors are seen to work better. Even though this paper rejects the specific conclusions of Yang et al., we still endorse their general goal. In our experiments, supervised predictors did not perform outstandingly better than unsupervised ones for effort-aware just-in-time defect prediction. Hence, there may indeed be some combination of unsupervised learners that achieves performance comparable to supervised ones. We therefore encourage others to work in this promising area.
Tasks
Published 2017-03-01
URL http://arxiv.org/abs/1703.00132v2
PDF http://arxiv.org/pdf/1703.00132v2.pdf
PWC https://paperswithcode.com/paper/revisiting-unsupervised-learning-for-defect
Repo https://github.com/WeiFoo/RevisitUnsupervised
Framework none
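
As a sketch of the experimental contrast at issue, the snippet below compares, on a single project, a Yang-et-al.-style unsupervised predictor (rank modules by one metric, here ascending LOC, using no labels) against a supervised logistic-regression ranker under an effort-aware evaluation. The column names, 20% effort cutoff, and train/test split are illustrative assumptions rather than the paper's exact protocol.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def recall_at_effort(order, loc, defective, budget=0.2):
    # Fraction of defects found when inspecting modules in `order`
    # until 20% of their total LOC has been reviewed (effort-aware).
    cum_loc = np.cumsum(loc[order])
    inspected = order[cum_loc <= budget * loc[order].sum()]
    return defective[inspected].sum() / max(defective.sum(), 1)

def compare_on_project(df):
    feats = df.drop(columns=["defective"]).values
    y = df["defective"].values
    loc = df["loc"].values
    idx_tr, idx_te = train_test_split(np.arange(len(df)),
                                      test_size=0.5, random_state=0)

    # Unsupervised: smallest modules first (no labels used at all).
    unsup_order = idx_te[np.argsort(loc[idx_te])]

    # Supervised: rank test modules by predicted defect probability.
    clf = LogisticRegression(max_iter=1000).fit(feats[idx_tr], y[idx_tr])
    sup_order = idx_te[np.argsort(-clf.predict_proba(feats[idx_te])[:, 1])]

    return (recall_at_effort(unsup_order, loc, y),
            recall_at_effort(sup_order, loc, y))

rng = np.random.default_rng(0)
df = pd.DataFrame({"loc": rng.integers(10, 1000, 200),
                   "churn": rng.integers(0, 100, 200)})
df["defective"] = (rng.random(200) < 0.2).astype(int)
print(compare_on_project(df))  # (unsupervised recall, supervised recall)
```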

Learning Chinese Word Representations From Glyphs Of Characters

Title Learning Chinese Word Representations From Glyphs Of Characters
Authors Tzu-Ray Su, Hung-Yi Lee
Abstract In this paper, we propose new methods to learn Chinese word representations. Chinese characters are composed of graphical components, which carry rich semantics. It is common for a Chinese learner to comprehend the meaning of a word from these graphical components. As a result, we propose models that enhance word representations with character glyphs. The character glyph features are learned directly from the bitmaps of characters by a convolutional auto-encoder (convAE), and the glyph features improve Chinese word representations that are already enhanced by character embeddings. Another contribution of this paper is that we created several evaluation datasets in traditional Chinese and made them public.
Tasks
Published 2017-08-16
URL http://arxiv.org/abs/1708.04755v1
PDF http://arxiv.org/pdf/1708.04755v1.pdf
PWC https://paperswithcode.com/paper/learning-chinese-word-representations-from
Repo https://github.com/ray1007/GWE
Framework tf
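
A minimal convolutional auto-encoder over character bitmaps, sketching the glyph-feature extractor described above; the 60x60 bitmap size, filter counts, and 128-dim bottleneck are illustrative assumptions, not the paper's settings.

```python
import tensorflow as tf

encoder = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(60, 60, 1)),   # grayscale glyph bitmap
    tf.keras.layers.Conv2D(16, 3, strides=2, padding="same", activation="relu"),
    tf.keras.layers.Conv2D(32, 3, strides=2, padding="same", activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128),                 # the glyph feature vector
])
decoder = tf.keras.Sequential([
    tf.keras.layers.Dense(15 * 15 * 32, activation="relu"),
    tf.keras.layers.Reshape((15, 15, 32)),
    tf.keras.layers.Conv2DTranspose(16, 3, strides=2, padding="same",
                                    activation="relu"),
    tf.keras.layers.Conv2DTranspose(1, 3, strides=2, padding="same",
                                    activation="sigmoid"),
])
autoencoder = tf.keras.Sequential([encoder, decoder])
autoencoder.compile(optimizer="adam", loss="mse")
# After training on rendered glyph bitmaps, `encoder` yields the per-character
# glyph features that are combined with character embeddings downstream.
```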

Query-Based Abstractive Summarization Using Neural Networks

Title Query-Based Abstractive Summarization Using Neural Networks
Authors Johan Hasselqvist, Niklas Helmertz, Mikael Kågebäck
Abstract In this paper, we present a model for generating summaries of text documents with respect to a query. This is known as query-based summarization. We adapt an existing dataset of news article summaries for the task and train a pointer-generator model using this dataset. The generated summaries are evaluated by measuring similarity to reference summaries. Our results show that a neural network summarization model, similar to existing neural network models for abstractive summarization, can be constructed to make use of queries to produce targeted summaries.
Tasks Abstractive Text Summarization
Published 2017-12-17
URL http://arxiv.org/abs/1712.06100v1
PDF http://arxiv.org/pdf/1712.06100v1.pdf
PWC https://paperswithcode.com/paper/query-based-abstractive-summarization-using
Repo https://github.com/helmertz/querysum
Framework tf
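
The copy mechanism at the heart of a pointer-generator decoder can be written out in a few lines: the output distribution blends generation from a fixed vocabulary with copying source tokens via the attention weights. The sketch below shows only that mixture; in the actual model, p_gen, the vocabulary distribution, and the attention are all produced by the (query-conditioned) decoder at each step.

```python
import numpy as np

def output_distribution(p_gen, p_vocab, attention, src_token_ids):
    """p_gen: scalar in (0, 1); p_vocab: (vocab_size,) softmax over vocabulary;
    attention: (src_len,) weights over source positions;
    src_token_ids: (src_len,) vocabulary id of each source token."""
    dist = p_gen * p_vocab
    # Scatter-add copy probability mass onto the ids of the source tokens
    # (np.add.at accumulates correctly over repeated ids).
    np.add.at(dist, src_token_ids, (1.0 - p_gen) * attention)
    return dist

p_vocab = np.full(10, 0.1)            # uniform toy vocabulary distribution
attn = np.array([0.7, 0.2, 0.1])      # attends mostly to source position 0
dist = output_distribution(0.6, p_vocab, attn, np.array([4, 4, 2]))
print(dist.sum())                     # 1.0: still a valid distribution
```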

Active Learning for Visual Question Answering: An Empirical Study

Title Active Learning for Visual Question Answering: An Empirical Study
Authors Xiao Lin, Devi Parikh
Abstract We present an empirical study of active learning for Visual Question Answering, where a deep VQA model selects informative question-image pairs from a pool and queries an oracle for answers to maximally improve its performance under a limited query budget. Drawing analogies from human learning, we explore cramming (entropy), curiosity-driven (expected model change), and goal-driven (expected error reduction) active learning approaches, and propose a fast and effective goal-driven active learning scoring function to pick question-image pairs for deep VQA models under the Bayesian Neural Network framework. We find that deep VQA models need large amounts of training data before they can start asking informative questions. But once they do, all three approaches outperform the random selection baseline and achieve significant query savings. For the scenario where the model is allowed to ask generic questions about images but is evaluated only on specific questions (e.g., questions whose answer is either yes or no), our proposed goal-driven scoring function performs the best.
Tasks Active Learning, Visual Question Answering
Published 2017-11-06
URL http://arxiv.org/abs/1711.01732v1
PDF http://arxiv.org/pdf/1711.01732v1.pdf
PWC https://paperswithcode.com/paper/active-learning-for-visual-question-answering
Repo https://github.com/frkl/active-learning
Framework torch
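
The simplest of the three acquisition scores discussed above, the "cramming" (entropy) criterion, reduces to a few lines: score each pool item by the entropy of the model's predicted answer distribution and query the most uncertain ones. The sketch below uses random stand-in predictions; the paper's goal-driven score additionally involves expected error reduction under a Bayesian treatment of the VQA model.

```python
import numpy as np

def entropy_scores(answer_probs):
    # answer_probs: (pool_size, num_answers) softmax outputs of the VQA model.
    p = np.clip(answer_probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def select_batch(answer_probs, k):
    # Query the oracle for the k most uncertain question-image pairs.
    return np.argsort(-entropy_scores(answer_probs))[:k]

pool = np.random.dirichlet(np.ones(1000), size=5000)  # toy predictions
print(select_batch(pool, k=32))
```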

Learning to Bid Without Knowing your Value

Title Learning to Bid Without Knowing your Value
Authors Zhe Feng, Chara Podimata, Vasilis Syrgkanis
Abstract We address online learning in complex auction settings, such as sponsored search auctions, where the value of the bidder is unknown to her, evolves in an arbitrary manner, and is observed only if the bidder wins an allocation. We leverage the structure of the bidder’s utility and the partial feedback that bidders typically receive in auctions in order to provide algorithms whose regret against the best fixed bid in hindsight converges exponentially faster, in terms of the dependence on the action space, than what would be derived by applying a generic bandit algorithm, and is almost equivalent to what would be achieved in the full-information setting. Our results are enabled by analyzing a new online learning setting with outcome-based feedback, which generalizes learning with feedback graphs. We provide an online learning algorithm for this setting, of independent interest, with regret that grows only logarithmically with the number of actions and linearly only in the number of potential outcomes (the latter being very small in most auction settings). Last but not least, we show experimentally that our algorithm outperforms the bandit approach and that this performance is robust to dropping some of our theoretical assumptions or introducing noise in the feedback that the bidder receives.
Tasks
Published 2017-11-03
URL http://arxiv.org/abs/1711.01333v5
PDF http://arxiv.org/pdf/1711.01333v5.pdf
PWC https://paperswithcode.com/paper/learning-to-bid-without-knowing-your-value
Repo https://github.com/zfengharvard/bandit-sponsored-search
Framework none
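
To illustrate the outcome-based feedback structure in the simplest case (a second-price auction), the sketch below shows how a single win reveals the counterfactual utility of every bid on a grid: all bids above the observed price would also have won at that price, and all bids at or below it would have lost. The epsilon-greedy learner wrapped around it is a stand-in, not the paper's algorithm, and the value and price distributions are invented for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
grid = np.linspace(0, 1, 21)              # candidate bids
totals = np.zeros_like(grid)              # cumulative estimated utility per bid

for t in range(10_000):
    b = rng.choice(grid) if rng.random() < 0.1 else grid[np.argmax(totals)]
    d = rng.uniform(0, 1)                 # highest competing bid (unobserved)
    if b > d:                             # we win: observe value and price
        v = 0.6 + 0.1 * rng.standard_normal()  # value revealed only on a win
        # One observation updates every grid point: bids above d would also
        # have won and paid d; bids at or below d would have lost (utility 0).
        totals += np.where(grid > d, v - d, 0.0)

print(grid[np.argmax(totals)])            # concentrates near profitable bids
```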

Machine Translation of Low-Resource Spoken Dialects: Strategies for Normalizing Swiss German

Title Machine Translation of Low-Resource Spoken Dialects: Strategies for Normalizing Swiss German
Authors Pierre-Edouard Honnet, Andrei Popescu-Belis, Claudiu Musat, Michael Baeriswyl
Abstract The goal of this work is to design a machine translation (MT) system for a low-resource family of dialects, collectively known as Swiss German, which are widely spoken in Switzerland but seldom written. We collected a significant number of parallel written resources to start with, up to a total of about 60k words. Moreover, we identified several other promising data sources for Swiss German. Then, we designed and compared three strategies for normalizing Swiss German input in order to address the regional diversity. We found that character-based neural MT was the best solution for text normalization. In combination with phrase-based statistical MT, our solution reached a 36% BLEU score when translating from the Bernese dialect. This value, however, decreases as the testing data becomes more remote from the training data, geographically and topically. These resources and normalization techniques are a first step towards full MT of Swiss German dialects.
Tasks Machine Translation
Published 2017-10-30
URL http://arxiv.org/abs/1710.11035v2
PDF http://arxiv.org/pdf/1710.11035v2.pdf
PWC https://paperswithcode.com/paper/machine-translation-of-low-resource-spoken
Repo https://github.com/Kyubyong/quasi-rnn
Framework tf

Fast Genetic Algorithms

Title Fast Genetic Algorithms
Authors Benjamin Doerr, Huu Phuoc Le, Régis Makhmara, Ta Duy Nguyen
Abstract For genetic algorithms using a bit-string representation of length $n$, the general recommendation is to take $1/n$ as mutation rate. In this work, we discuss whether this is really justified for multimodal functions. Taking jump functions and the $(1+1)$ evolutionary algorithm as the simplest example, we observe that larger mutation rates give significantly better runtimes. For the $\mathrm{Jump}_{m,n}$ function, any mutation rate between $2/n$ and $m/n$ leads to a speed-up at least exponential in $m$ compared to the standard choice. The asymptotically best runtime, obtained from using the mutation rate $m/n$ and leading to a speed-up super-exponential in $m$, is very sensitive to small changes of the mutation rate. Any deviation by a small $(1 \pm \epsilon)$ factor leads to a slow-down exponential in $m$. Consequently, any fixed mutation rate gives strongly sub-optimal results for most jump functions. Building on this observation, we propose to use a random mutation rate $\alpha/n$, where $\alpha$ is chosen from a power-law distribution. We prove that the $(1+1)$ EA with this heavy-tailed mutation rate optimizes any $\mathrm{Jump}_{m,n}$ function in a time that is only a small polynomial (in $m$) factor above the one stemming from the optimal rate for this $m$. Our heavy-tailed mutation operator yields similar speed-ups (over the best known performance guarantees) for the vertex cover problem in bipartite graphs and the matching problem in general graphs. Following the example of fast simulated annealing, fast evolution strategies, and fast evolutionary programming, we propose to call genetic algorithms using a heavy-tailed mutation operator “fast genetic algorithms”.
Tasks
Published 2017-03-09
URL http://arxiv.org/abs/1703.03334v2
PDF http://arxiv.org/pdf/1703.03334v2.pdf
PWC https://paperswithcode.com/paper/fast-genetic-algorithms
Repo https://github.com/rafalpronko/tsp-kaggle
Framework none
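
A runnable sketch of the proposed heavy-tailed (1+1) EA follows: each generation draws $\alpha$ from a power-law distribution and flips each bit independently with probability $\alpha/n$. The power-law exponent $\beta = 1.5$ and the instance size $n = 30$, $m = 3$ are illustrative choices within the regime the paper analyzes.

```python
import numpy as np

rng = np.random.default_rng(1)

def jump(x, m):
    # Jump_{m,n}: all-ones is optimal, but a fitness valley of width m
    # must be crossed by flipping roughly m bits at once.
    n, ones = len(x), x.sum()
    return m + ones if (ones <= n - m or ones == n) else n - ones

def power_law_alpha(n, beta=1.5):
    ks = np.arange(1, n // 2 + 1)
    probs = ks ** (-beta)
    return rng.choice(ks, p=probs / probs.sum())

n, m = 30, 3
x = rng.integers(0, 2, n)
for step in range(1, 2_000_000):
    alpha = power_law_alpha(n)
    mask = rng.random(n) < alpha / n      # flip each bit w.p. alpha/n
    y = np.where(mask, 1 - x, x)
    if jump(y, m) >= jump(x, m):          # elitist (1+1) selection
        x = y
    if x.sum() == n:                      # global optimum reached
        print("optimum found at step", step)
        break
```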

Fruit recognition from images using deep learning

Title Fruit recognition from images using deep learning
Authors Horea Mureşan, Mihai Oltean
Abstract In this paper we introduce a new, high-quality dataset of images containing fruits. We also present the results of some numerical experiments on training a neural network to detect fruits. We discuss why we chose to use fruits in this project by proposing a few applications that could use this kind of neural network.
Tasks
Published 2017-12-02
URL http://arxiv.org/abs/1712.00580v9
PDF http://arxiv.org/pdf/1712.00580v9.pdf
PWC https://paperswithcode.com/paper/fruit-recognition-from-images-using-deep
Repo https://github.com/applecrazy/FruitClassifier
Framework tf
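
A small CNN of the kind one might train on such a dataset is sketched below. The 100x100 RGB input matches the dataset's image size; the layer sizes and the class count are illustrative assumptions (the number of classes depends on the dataset version).

```python
import tensorflow as tf

num_classes = 60  # hypothetical; depends on the dataset version used

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(100, 100, 3)),
    tf.keras.layers.Conv2D(16, 5, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 5, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```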

Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection

Title Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection
Authors Debidatta Dwibedi, Ishan Misra, Martial Hebert
Abstract A major impediment to rapidly deploying object detection models for instance detection is the lack of large annotated datasets. For example, finding a large labeled dataset containing instances in a particular kitchen is unlikely. Each new environment with new instances requires expensive data collection and annotation. In this paper, we propose a simple approach to generate large annotated instance datasets with minimal effort. Our key insight is that ensuring only patch-level realism provides enough training signal for current object detector models. We automatically “cut” object instances and “paste” them on random backgrounds. A naive way of doing this produces pixel artifacts which lead to poor performance for trained models. We show how to make detectors ignore these artifacts during training and generate data that gives competitive performance on real data. Our method outperforms existing synthesis approaches and, when combined with real images, improves relative performance by more than 21% on benchmark datasets. In a cross-domain setting, our synthetic data combined with just 10% real data outperforms models trained on all real data.
Tasks Object Detection
Published 2017-08-04
URL http://arxiv.org/abs/1708.01642v1
PDF http://arxiv.org/pdf/1708.01642v1.pdf
PWC https://paperswithcode.com/paper/cut-paste-and-learn-surprisingly-easy
Repo https://github.com/debidatta/syndata-generation
Framework none
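
A bare-bones version of the cut-and-paste step is sketched below: composite a masked object crop onto a random background location and record its bounding box as a free annotation. The paper's key addition, not shown here, is blending (e.g., Gaussian or Poisson blending) at the paste boundary so that detectors do not latch onto pixel artifacts.

```python
import numpy as np

def paste(background, obj, mask, rng):
    """background: (H, W, 3); obj/mask: (h, w, 3)/(h, w) cut-out instance."""
    H, W, _ = background.shape
    h, w, _ = obj.shape
    y = rng.integers(0, H - h + 1)
    x = rng.integers(0, W - w + 1)
    out = background.copy()
    region = out[y:y + h, x:x + w]
    m = mask[..., None].astype(np.float32)
    out[y:y + h, x:x + w] = (m * obj + (1 - m) * region).astype(out.dtype)
    return out, (x, y, x + w, y + h)      # image + bounding-box annotation

rng = np.random.default_rng(0)
bg = rng.integers(0, 255, (480, 640, 3), dtype=np.uint8)
obj = rng.integers(0, 255, (64, 64, 3), dtype=np.uint8)
mask = np.ones((64, 64), dtype=np.uint8)  # binary instance mask
img, box = paste(bg, obj, mask, rng)
print(box)
```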

Recurrent Additive Networks

Title Recurrent Additive Networks
Authors Kenton Lee, Omer Levy, Luke Zettlemoyer
Abstract We introduce recurrent additive networks (RANs), a new gated RNN which is distinguished by the use of purely additive latent state updates. At every time step, the new state is computed as a gated component-wise sum of the input and the previous state, without any of the non-linearities commonly used in RNN transition dynamics. We formally show that RAN states are weighted sums of the input vectors, and that the gates only contribute to computing the weights of these sums. Despite this relatively simple functional form, experiments demonstrate that RANs perform on par with LSTMs on benchmark language modeling problems. This result shows that many of the non-linear computations in LSTMs and related networks are not essential, at least for the problems we consider, and suggests that the gates are doing more of the computational work than previously understood.
Tasks Language Modelling
Published 2017-05-21
URL http://arxiv.org/abs/1705.07393v2
PDF http://arxiv.org/pdf/1705.07393v2.pdf
PWC https://paperswithcode.com/paper/recurrent-additive-networks
Repo https://github.com/kentonl/ran
Framework tf
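
The update rule in the abstract is short enough to write out directly. The sketch below implements the RAN recurrence with an identity output function (the paper also considers a tanh output): a linear content layer, two sigmoid gates, and a purely additive state update with no non-linearity inside the recurrence. Dimensions and initialization are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class RANCell:
    def __init__(self, d_in, d_hid, rng):
        s = 0.1
        self.W_cx = s * rng.standard_normal((d_hid, d_in))   # content layer
        self.W_ix = s * rng.standard_normal((d_hid, d_in))
        self.W_fx = s * rng.standard_normal((d_hid, d_in))
        self.W_ih = s * rng.standard_normal((d_hid, d_hid))
        self.W_fh = s * rng.standard_normal((d_hid, d_hid))

    def step(self, x, h_prev):
        c = self.W_cx @ x                                    # no non-linearity
        i = sigmoid(self.W_ix @ x + self.W_ih @ h_prev)      # input gate
        f = sigmoid(self.W_fx @ x + self.W_fh @ h_prev)      # forget gate
        return i * c + f * h_prev                            # additive update

rng = np.random.default_rng(0)
cell = RANCell(8, 16, rng)
h = np.zeros(16)
for x in rng.standard_normal((5, 8)):
    h = cell.step(x, h)   # h is a weighted sum of past content vectors
```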

Quasi-Oracle Estimation of Heterogeneous Treatment Effects

Title Quasi-Oracle Estimation of Heterogeneous Treatment Effects
Authors Xinkun Nie, Stefan Wager
Abstract Flexible estimation of heterogeneous treatment effects lies at the heart of many statistical challenges, such as personalized medicine and optimal resource allocation. In this paper, we develop a general class of two-step algorithms for heterogeneous treatment effect estimation in observational studies. We first estimate marginal effects and treatment propensities in order to form an objective function that isolates the causal component of the signal. Then, we optimize this data-adaptive objective function. Our approach has several advantages over existing methods. From a practical perspective, our method is flexible and easy to use: In both steps, we can use any loss-minimization method, e.g., penalized regression, deep neural networks, or boosting; moreover, these methods can be fine-tuned by cross-validation. Meanwhile, in the case of penalized kernel regression, we show that our method has a quasi-oracle property: Even if the pilot estimates for marginal effects and treatment propensities are not particularly accurate, we achieve the same error bounds as an oracle who has a priori knowledge of these two nuisance components. We implement variants of our approach based on both penalized regression and boosting in a variety of simulation setups, and find promising performance relative to existing baselines.
Tasks
Published 2017-12-13
URL http://arxiv.org/abs/1712.04912v3
PDF http://arxiv.org/pdf/1712.04912v3.pdf
PWC https://paperswithcode.com/paper/quasi-oracle-estimation-of-heterogeneous
Repo https://github.com/xnie/rlearner
Framework none
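
A compact sketch of the two-step procedure (often referred to as the R-learner) follows: cross-fitted nuisance estimates for the outcome and the propensity, then a weighted regression on the induced pseudo-outcomes, which is algebraically equivalent to minimizing the paper's objective. The gradient-boosting estimators and the simulation are illustrative stand-ins.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n = 4000
X = rng.standard_normal((n, 3))
e = 1 / (1 + np.exp(-X[:, 0]))                # true propensity
W = rng.random(n) < e                         # treatment assignment
tau = 1 + X[:, 1]                             # true heterogeneous effect
Y = X[:, 2] + W * tau + rng.standard_normal(n)

# Step 1: cross-fitted nuisances m(x) = E[Y | X] and e(x) = P(W = 1 | X).
m_hat = cross_val_predict(GradientBoostingRegressor(), X, Y, cv=5)
e_hat = cross_val_predict(GradientBoostingClassifier(), X, W, cv=5,
                          method="predict_proba")[:, 1]
e_hat = np.clip(e_hat, 0.01, 0.99)

# Step 2: minimizing sum_i ((Y - m) - (W - e) * tau(X))^2 is equivalent to
# regressing the pseudo-outcome (Y - m)/(W - e) with weights (W - e)^2.
resid_w = W - e_hat
pseudo = (Y - m_hat) / resid_w
tau_model = GradientBoostingRegressor().fit(X, pseudo, sample_weight=resid_w ** 2)
print(np.corrcoef(tau_model.predict(X), tau)[0, 1])  # should be high
```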

Question Dependent Recurrent Entity Network for Question Answering

Title Question Dependent Recurrent Entity Network for Question Answering
Authors Andrea Madotto, Giuseppe Attardi
Abstract Question answering is a task which requires building models capable of providing answers to questions expressed in human language. Full question answering involves some form of reasoning ability. We introduce a neural network architecture for this task, a form of Memory Network, that recognizes entities and their relations to answers through a focus attention mechanism. Our model, named Question Dependent Recurrent Entity Network, extends the Recurrent Entity Network by exploiting aspects of the question during the memorization process. We validate the model on both synthetic and real datasets: the bAbI question answering dataset and the CNN & Daily News reading comprehension dataset. In our experiments, the model achieved state-of-the-art results on the former and competitive results on the latter.
Tasks Question Answering, Reading Comprehension
Published 2017-07-25
URL http://arxiv.org/abs/1707.07922v2
PDF http://arxiv.org/pdf/1707.07922v2.pdf
PWC https://paperswithcode.com/paper/question-dependent-recurrent-entity-network
Repo https://github.com/andreamad8/QDREN
Framework tf
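
The question-dependent gate that distinguishes this model from the original Recurrent Entity Network can be sketched in a few lines: each memory slot's gate scores the current sentence encoding against the slot's content, its key, and the question encoding. The numpy sketch below is a best-effort reading of that mechanism; the dimensions, the plain ReLU, and the initialization are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(z, 0.0)

d, n_slots = 16, 5
rng = np.random.default_rng(0)
U, V, Wm = (0.1 * rng.standard_normal((d, d)) for _ in range(3))
keys = rng.standard_normal((n_slots, d))      # w_j: one key per entity slot
mem = rng.standard_normal((n_slots, d))       # h_j: slot contents

def update(s_t, q, keys, mem):
    """s_t: sentence encoding; q: question encoding, both (d,)."""
    for j in range(len(mem)):
        # Question-dependent gate: content match + key match + question match.
        g = sigmoid(s_t @ mem[j] + s_t @ keys[j] + s_t @ q)
        cand = relu(U @ mem[j] + V @ keys[j] + Wm @ s_t)  # candidate memory
        mem[j] = mem[j] + g * cand
        mem[j] /= np.linalg.norm(mem[j])      # keep slots on the unit sphere
    return mem

mem = update(rng.standard_normal(d), rng.standard_normal(d), keys, mem)
```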

Indirect Image Registration with Large Diffeomorphic Deformations

Title Indirect Image Registration with Large Diffeomorphic Deformations
Authors Chong Chen, Ozan Öktem
Abstract The paper adapts the large deformation diffeomorphic metric mapping framework for image registration to the indirect setting, where a template is registered against a target that is given only through indirect noisy observations. The registration uses diffeomorphisms that transform the template through a (group) action. These diffeomorphisms are generated by solving a flow equation that is defined by a velocity field with certain regularity. The theoretical analysis includes a proof that indirect image registration has solutions (existence) that are stable and converge as the data error tends to zero, so it becomes a well-defined regularization method. The paper concludes with examples of indirect image registration in 2D tomography with very sparse and/or highly noisy data.
Tasks Image Registration
Published 2017-06-13
URL http://arxiv.org/abs/1706.04048v3
PDF http://arxiv.org/pdf/1706.04048v3.pdf
PWC https://paperswithcode.com/paper/indirect-image-registration-with-large
Repo https://github.com/bgris/odl
Framework none