January 30, 2020

Paper Group ANR 413

Meta-Reinforced Synthetic Data for One-Shot Fine-Grained Visual Recognition


Title	Meta-Reinforced Synthetic Data for One-Shot Fine-Grained Visual Recognition
Authors	Satoshi Tsutsui, Yanwei Fu, David Crandall
Abstract	One-shot fine-grained visual recognition often suffers from the problem of training data scarcity for new fine-grained classes. To alleviate this problem, an off-the-shelf image generator can be applied to synthesize additional training images, but these synthesized images are often not helpful for actually improving the accuracy of one-shot fine-grained recognition. This paper proposes a meta-learning framework to combine generated images with original images, so that the resulting “hybrid” training images can improve one-shot learning. Specifically, the generic image generator is updated by a few training instances of novel classes, and a Meta Image Reinforcing Network (MetaIRNet) is proposed to conduct one-shot fine-grained recognition as well as image reinforcement. The model is trained in an end-to-end manner, and our experiments demonstrate consistent improvement over baselines on one-shot fine-grained image classification benchmarks.
Tasks	Fine-Grained Image Classification, Fine-Grained Visual Recognition, Image Classification, Meta-Learning, One-Shot Learning
Published	2019-11-17
URL	https://arxiv.org/abs/1911.07164v1
PDF	https://arxiv.org/pdf/1911.07164v1.pdf
PWC	https://paperswithcode.com/paper/meta-reinforced-synthetic-data-for-one-shot-1
Repo
Framework

Black-Box Complexity of the Binary Value Function


Title	Black-Box Complexity of the Binary Value Function
Authors	Nina Bulanova, Maxim Buzdalov
Abstract	The binary value function, or BinVal, has appeared in several studies in theory of evolutionary computation as one of the extreme examples of linear pseudo-Boolean functions. Its unbiased black-box complexity was previously shown to be at most $\lceil \log_2 n \rceil + 2$, where $n$ is the problem size. We augment it with an upper bound of $\log_2 n + 2.42141558 - o(1)$, which is more precise for many values of $n$. We also present a lower bound of $\log_2 n + 1.1186406 - o(1)$. Additionally, we prove that BinVal is an easiest function among all unimodal pseudo-Boolean functions at least for unbiased algorithms.
Tasks
Published	2019-04-09
URL	http://arxiv.org/abs/1904.04867v1
PDF	http://arxiv.org/pdf/1904.04867v1.pdf
PWC	https://paperswithcode.com/paper/black-box-complexity-of-the-binary-value
Repo
Framework
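
To make the quantities in this abstract concrete, here is a minimal sketch of BinVal and of the leading terms of the stated bounds. The function names and the bit-order convention (index 0 as the least significant position) are assumptions for illustration, and the $o(1)$ terms are simply dropped, so the numbers are approximations rather than the exact bounds.

```python
import math


def binval(bits):
    """BinVal of a 0/1 sequence: the sum of 2**i * bits[i].

    Assumption: bits[0] is the least significant position.
    """
    return sum(bit << i for i, bit in enumerate(bits))


def old_upper_bound(n):
    """Previously known upper bound from the abstract: ceil(log2 n) + 2."""
    return math.ceil(math.log2(n)) + 2


def new_upper_bound_leading_terms(n):
    """New upper bound with the o(1) term dropped (an approximation)."""
    return math.log2(n) + 2.42141558


def lower_bound_leading_terms(n):
    """Lower bound with the o(1) term dropped (an approximation)."""
    return math.log2(n) + 1.1186406


if __name__ == "__main__":
    print(binval([1, 0, 1]))  # 1*1 + 0*2 + 1*4 = 5
    for n in (16, 17, 1000):
        print(n, old_upper_bound(n),
              new_upper_bound_leading_terms(n),
              lower_bound_leading_terms(n))
```

Under these simplifications the new bound is the smaller of the two whenever $\lceil \log_2 n \rceil - \log_2 n$ exceeds about 0.42 (for example $n = 17$), and the older bound is smaller when $n$ is a power of two, which matches the abstract's wording "more precise for many values of $n$".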

Bridging the Gap between Training and Inference for Neural Machine Translation


Title	Bridging the Gap between Training and Inference for Neural Machine Translation
Authors	Wen Zhang, Yang Feng, Fandong Meng, Di You, Qun Liu
Abstract	Neural Machine Translation (NMT) generates target words sequentially by predicting the next word conditioned on the context words. At training time, it predicts with the ground truth words as context while at inference it has to generate the entire sequence from scratch. This discrepancy of the fed context leads to error accumulation along the way. Furthermore, word-level training requires strict matching between the generated sequence and the ground truth sequence which leads to overcorrection over different but reasonable translations. In this paper, we address these issues by sampling context words not only from the ground truth sequence but also from the predicted sequence by the model during training, where the predicted sequence is selected with a sentence-level optimum. Experiment results on Chinese->English and WMT’14 English->German translation tasks demonstrate that our approach can achieve significant improvements on multiple datasets.
Tasks	Machine Translation
Published	2019-06-06
URL	https://arxiv.org/abs/1906.02448v2
PDF	https://arxiv.org/pdf/1906.02448v2.pdf
PWC	https://paperswithcode.com/paper/bridging-the-gap-between-training-and
Repo
Framework
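
The core idea in this abstract, feeding the decoder a mix of ground-truth and model-predicted context words during training, can be sketched as below. This is only an illustration: `mix_context` and `p_gold` are hypothetical names, the predicted sequence is passed in as a ready-made list, and how the paper selects that sequence (the sentence-level optimum) or schedules the sampling probability is not shown.

```python
import random


def mix_context(gold_words, predicted_words, p_gold, rng):
    """Pick each context word from the gold or the predicted sequence.

    p_gold is a hypothetical per-position probability of keeping the
    ground-truth word; with probability 1 - p_gold the model's own
    prediction is fed as context instead.
    """
    if len(gold_words) != len(predicted_words):
        raise ValueError("sequences must have the same length")
    if not 0.0 <= p_gold <= 1.0:
        raise ValueError("p_gold must be in [0, 1]")
    return [
        gold if rng.random() < p_gold else predicted
        for gold, predicted in zip(gold_words, predicted_words)
    ]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    gold = ["the", "cat", "sat", "down"]
    predicted = ["a", "cat", "sits", "down"]
    print(mix_context(gold, predicted, p_gold=0.5, rng=rng))
```

With `p_gold=1.0` this reduces to ordinary teacher forcing, and with `p_gold=0.0` the decoder sees only its own predictions, which is the inference-time condition the abstract contrasts it with.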

Characterizing the impact of geometric properties of word embeddings on task performance


Title	Characterizing the impact of geometric properties of word embeddings on task performance
Authors	Brendan Whitaker, Denis Newman-Griffis, Aparajita Haldar, Hakan Ferhatosmanoglu, Eric Fosler-Lussier
Abstract	Analysis of word embedding properties to inform their use in downstream NLP tasks has largely been studied by assessing nearest neighbors. However, geometric properties of the continuous feature space contribute directly to the use of embedding features in downstream models, and are largely unexplored. We consider four properties of word embedding geometry, namely: position relative to the origin, distribution of features in the vector space, global pairwise distances, and local pairwise distances. We define a sequence of transformations to generate new embeddings that expose subsets of these properties to downstream models and evaluate change in task performance to understand the contribution of each property to NLP models. We transform publicly available pretrained embeddings from three popular toolkits (word2vec, GloVe, and FastText) and evaluate on a variety of intrinsic tasks, which model linguistic information in the vector space, and extrinsic tasks, which use vectors as input to machine learning models. We find that intrinsic evaluations are highly sensitive to absolute position, while extrinsic tasks rely primarily on local similarity. Our findings suggest that future embedding models and post-processing techniques should focus primarily on similarity to nearby points in vector space.
Tasks	Word Embeddings
Published	2019-04-09
URL	http://arxiv.org/abs/1904.04866v1
PDF	http://arxiv.org/pdf/1904.04866v1.pdf
PWC	https://paperswithcode.com/paper/characterizing-the-impact-of-geometric
Repo
Framework
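
As a small illustration of how one geometric property can be changed while another is preserved, the sketch below translates a toy set of vectors: pairwise Euclidean distances stay the same, while position relative to the origin (and therefore cosine similarity) changes. The vectors and function names are made up for the example, and this is not claimed to be the transformation sequence the paper defines.

```python
import math


def translate(vectors, offset):
    """Shift every vector by the same offset."""
    return [[x + o for x, o in zip(v, offset)] for v in vectors]


def pairwise_distances(vectors):
    """Euclidean distance for every unordered pair, in a fixed order."""
    return [
        math.dist(a, b)
        for i, a in enumerate(vectors)
        for b in vectors[i + 1:]
    ]


def cosine(a, b):
    """Cosine similarity; assumes neither vector is the zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


if __name__ == "__main__":
    toy = [[1.0, 2.0], [3.0, 4.0], [6.0, 8.0]]  # toy "embeddings"
    moved = translate(toy, [10.0, -2.0])
    print(pairwise_distances(toy))
    print(pairwise_distances(moved))   # same distances as above
    print(cosine(toy[1], toy[2]))      # parallel vectors
    print(cosine(moved[1], moved[2]))  # no longer parallel after the shift
```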

The Role of Interactivity in Local Differential Privacy


Title	The Role of Interactivity in Local Differential Privacy
Authors	Matthew Joseph, Jieming Mao, Seth Neel, Aaron Roth
Abstract	We study the power of interactivity in local differential privacy. First, we focus on the difference between fully interactive and sequentially interactive protocols. Sequentially interactive protocols may query users adaptively in sequence, but they cannot return to previously queried users. The vast majority of existing lower bounds for local differential privacy apply only to sequentially interactive protocols, and before this paper it was not known whether fully interactive protocols were more powerful. We resolve this question. First, we classify locally private protocols by their compositionality, the multiplicative factor $k \geq 1$ by which the sum of a protocol’s single-round privacy parameters exceeds its overall privacy guarantee. We then show how to efficiently transform any fully interactive $k$-compositional protocol into an equivalent sequentially interactive protocol with an $O(k)$ blowup in sample complexity. Next, we show that our reduction is tight by exhibiting a family of problems such that for any $k$, there is a fully interactive $k$-compositional protocol which solves the problem, while no sequentially interactive protocol can solve the problem without at least an $\tilde \Omega(k)$ factor more examples. We then turn our attention to hypothesis testing problems. We show that for a large class of compound hypothesis testing problems — which include all simple hypothesis testing problems as a special case — a simple noninteractive test is optimal among the class of all (possibly fully interactive) tests.
Tasks
Published	2019-04-07
URL	https://arxiv.org/abs/1904.03564v2
PDF	https://arxiv.org/pdf/1904.03564v2.pdf
PWC	https://paperswithcode.com/paper/the-role-of-interactivity-in-local
Repo
Framework
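
The compositionality factor $k$ described in this abstract is a ratio, which the following sketch computes directly from that one-sentence definition. The function name and the example privacy parameters are hypothetical, and the paper's formal definition may carry conditions that are not modeled here.

```python
def compositionality(round_epsilons, overall_epsilon):
    """Ratio of summed single-round privacy parameters to the overall one.

    Follows the abstract's description of k: the factor by which the sum
    of a protocol's per-round privacy parameters exceeds its overall
    privacy guarantee.
    """
    if overall_epsilon <= 0:
        raise ValueError("overall_epsilon must be positive")
    if any(eps < 0 for eps in round_epsilons):
        raise ValueError("round epsilons must be non-negative")
    return sum(round_epsilons) / overall_epsilon


if __name__ == "__main__":
    # Hypothetical protocol: four rounds at epsilon = 0.5 each,
    # with an overall guarantee of epsilon = 1.0.
    print(compositionality([0.5, 0.5, 0.5, 0.5], 1.0))  # 2.0 / 1.0 = 2.0
```

In the abstract's terms, a fully interactive protocol with this $k$ can be transformed into a sequentially interactive one at an $O(k)$ cost in sample complexity.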

A Survey on Distributed Machine Learning


Title	A Survey on Distributed Machine Learning
Authors	Joost Verbraeken, Matthijs Wolting, Jonathan Katzy, Jeroen Kloppenburg, Tim Verbelen, Jan S. Rellermeyer
Abstract	The demand for artificial intelligence has grown significantly over the last decade and this growth has been fueled by advances in machine learning techniques and the ability to leverage hardware acceleration. However, in order to increase the quality of predictions and render machine learning solutions feasible for more complex applications, a substantial amount of training data is required. Although small machine learning models can be trained with modest amounts of data, the input for training larger models such as neural networks grows exponentially with the number of parameters. Since the demand for processing training data has outpaced the increase in computation power of computing machinery, there is a need for distributing the machine learning workload across multiple machines, and turning the centralized into a distributed system. These distributed systems present new challenges, first and foremost the efficient parallelization of the training process and the creation of a coherent model. This article provides an extensive overview of the current state-of-the-art in the field by outlining the challenges and opportunities of distributed machine learning over conventional (centralized) machine learning, discussing the techniques used for distributed machine learning, and providing an overview of the systems that are available.
Tasks
Published	2019-12-20
URL	https://arxiv.org/abs/1912.09789v1
PDF	https://arxiv.org/pdf/1912.09789v1.pdf
PWC	https://paperswithcode.com/paper/a-survey-on-distributed-machine-learning
Repo
Framework

An Artificial Intelligence approach to Shadow Rating


Title	An Artificial Intelligence approach to Shadow Rating
Authors	Angela Rita Provenzano, Daniele Trifirò, Nicola Jean, Giacomo Le Pera, Maurizio Spadaccino, Luca Massaron, Claudio Nordio
Abstract	We analyse the effectiveness of modern deep learning techniques in predicting credit ratings over a universe of thousands of global corporate entities’ obligations, compared to the most popular traditional machine-learning approaches such as linear models and tree-based classifiers. Our results show adequate accuracy over different rating classes when applying categorical embeddings to artificial neural network (ANN) architectures.
Tasks
Published	2019-12-20
URL	https://arxiv.org/abs/1912.09764v1
PDF	https://arxiv.org/pdf/1912.09764v1.pdf
PWC	https://paperswithcode.com/paper/an-artificial-intelligence-approach-to-shadow
Repo
Framework

Lessons from reinforcement learning for biological representations of space


Title	Lessons from reinforcement learning for biological representations of space
Authors	Alex Muryy, N. Siddharth, Nantas Nardelli, Andrew Glennerster, Philip H. S. Torr
Abstract	Neuroscientists postulate 3D representations in the brain in a variety of different coordinate frames (e.g. ‘head-centred’, ‘hand-centred’ and ‘world-based’). Recent advances in reinforcement learning demonstrate a quite different approach that may provide a more promising model for biological representations underlying spatial perception and navigation. In this paper, we focus on reinforcement learning methods that reward an agent for arriving at a target image without any attempt to build up a 3D ‘map’. We test the ability of this type of representation to support geometrically consistent spatial tasks, such as interpolating between learned locations, and compare its performance to that of a hand-crafted representation which has, by design, a high degree of geometric consistency. Our comparison of these two models demonstrates that it is advantageous to include information about the persistence of features as the camera translates (e.g. distant features persist). It is likely that non-Cartesian representations of this sort will be increasingly important in the search for robust models of human spatial perception and navigation.
Tasks
Published	2019-12-13
URL	https://arxiv.org/abs/1912.06615v1
PDF	https://arxiv.org/pdf/1912.06615v1.pdf
PWC	https://paperswithcode.com/paper/lessons-from-reinforcement-learning-for
Repo
Framework

Exploring Information Centrality for Intrusion Detection in Large Networks


Title	Exploring Information Centrality for Intrusion Detection in Large Networks
Authors	Nidhi Rastogi
Abstract	Modern networked systems are constantly under threat from systemic attacks. There has been a massive upsurge in the number of devices connected to a network as well as the associated traffic volume. This has intensified the need to better understand all possible attack vectors during system design and implementation. Further, it has increased the need to mine large data sets, the analysis of which has become a daunting task. It is critical to scale monitoring infrastructures to match this need, but this is a difficult goal for small and medium organizations. Hence, there is a need to propose novel approaches that address the big data problem in security. Information Centrality (IC) labels network nodes with better vantage points for detecting network-based anomalies as central nodes and uses them for detecting a category of attacks called systemic attacks. The main idea is that since these central nodes already see a lot of information flowing through the network, they are in a good position to detect anomalies before other nodes. This research first dives into the importance of using graphs in understanding the topology and information flow. We then introduce the usage of information centrality, a centrality-based index, to reduce data collection in existing communication networks. Using IC-identified central nodes can accelerate outlier detection when armed with a suitable anomaly detection technique. We also come up with a more efficient way to compute information centrality for large networks. Finally, we demonstrate that central nodes detect anomalous behavior much faster than other non-central nodes, given that the anomalous behavior is systemic in nature.
Tasks	Anomaly Detection, Intrusion Detection, Outlier Detection
Published	2019-04-27
URL	http://arxiv.org/abs/1904.12138v1
PDF	http://arxiv.org/pdf/1904.12138v1.pdf
PWC	https://paperswithcode.com/paper/exploring-information-centrality-for
Repo
Framework

Quantifying the effects of data augmentation and stain color normalization in convolutional neural networks for computational pathology


Title	Quantifying the effects of data augmentation and stain color normalization in convolutional neural networks for computational pathology
Authors	David Tellez, Geert Litjens, Peter Bandi, Wouter Bulten, John-Melle Bokhorst, Francesco Ciompi, Jeroen van der Laak
Abstract	Stain variation is a phenomenon observed when distinct pathology laboratories stain tissue slides that exhibit similar but not identical color appearance. Due to this color shift between laboratories, convolutional neural networks (CNNs) trained with images from one lab often underperform on unseen images from the other lab. Several techniques have been proposed to reduce the generalization error, mainly grouped into two categories: stain color augmentation and stain color normalization. The former simulates a wide variety of realistic stain variations during training, producing stain-invariant CNNs. The latter aims to match training and test color distributions in order to reduce stain variation. For the first time, we compared some of these techniques and quantified their effect on CNN classification performance using a heterogeneous dataset of hematoxylin and eosin histopathology images from 4 organs and 9 pathology laboratories. Additionally, we propose a novel unsupervised method to perform stain color normalization using a neural network. Based on our experimental results, we provide practical guidelines on how to use stain color augmentation and stain color normalization in future computational pathology applications.
Tasks	Data Augmentation
Published	2019-02-18
URL	http://arxiv.org/abs/1902.06543v1
PDF	http://arxiv.org/pdf/1902.06543v1.pdf
PWC	https://paperswithcode.com/paper/quantifying-the-effects-of-data-augmentation
Repo
Framework

Linearly Converging Quasi Branch and Bound Algorithms for Global Rigid Registration


Title	Linearly Converging Quasi Branch and Bound Algorithms for Global Rigid Registration
Authors	Nadav Dym, Shahar Ziv Kovalsky
Abstract	In recent years, several branch-and-bound (BnB) algorithms have been proposed to globally optimize rigid registration problems. In this paper, we suggest a general framework to improve upon the BnB approach, which we name Quasi BnB. Quasi BnB replaces the linear lower bounds used in BnB algorithms with quadratic quasi-lower bounds which are based on the quadratic behavior of the energy in the vicinity of the global minimum. While quasi-lower bounds are not truly lower bounds, the Quasi-BnB algorithm is globally optimal. In fact we prove that it exhibits linear convergence – it achieves $\epsilon$-accuracy in $O(\log(1/\epsilon))$ time while the time complexity of other rigid registration BnB algorithms is polynomial in $1/\epsilon$. Our experiments verify that Quasi-BnB is significantly more efficient than state-of-the-art BnB algorithms, especially for problems where high accuracy is desired.
Tasks
Published	2019-04-03
URL	http://arxiv.org/abs/1904.02204v2
PDF	http://arxiv.org/pdf/1904.02204v2.pdf
PWC	https://paperswithcode.com/paper/linearly-converging-quasi-branch-and-bound
Repo
Framework
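
To see why the gap between $O(\log(1/\epsilon))$ and a polynomial in $1/\epsilon$ matters at high accuracy, the sketch below compares the two growth rates with all constants set to 1. The polynomial degree is a made-up parameter, since the abstract does not state it, so the numbers only illustrate the trend and are not running times of any actual algorithm.

```python
import math


def log_cost(eps):
    """Growth term log(1/eps), constants ignored."""
    return math.log(1.0 / eps)


def poly_cost(eps, degree):
    """Growth term (1/eps)**degree; the degree is a hypothetical choice."""
    return (1.0 / eps) ** degree


if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-4, 1e-8):
        print(eps, log_cost(eps), poly_cost(eps, degree=2))
```

Squaring the target accuracy (going from $\epsilon$ to $\epsilon^2$) only doubles the logarithmic term, whereas it squares the polynomial one.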

SEMEDA: Enhancing Segmentation Precision with Semantic Edge Aware Loss


Title	SEMEDA: Enhancing Segmentation Precision with Semantic Edge Aware Loss
Authors	Yifu Chen, Arnaud Dapogny, Matthieu Cord
Abstract	While nowadays deep neural networks achieve impressive performances on semantic segmentation tasks, they are usually trained by optimizing pixel-wise losses such as cross-entropy. As a result, the predictions output by such networks usually struggle to accurately capture the object boundaries and exhibit holes inside the objects. In this paper, we propose a novel approach to improve the structure of the predicted segmentation masks. We introduce a novel semantic edge detection network, which allows matching the predicted and ground truth segmentation masks. This Semantic Edge-Aware strategy (SEMEDA) can be combined with any backbone deep network in an end-to-end training framework. Through thorough experimental validation on Pascal VOC 2012 and Cityscapes datasets, we show that the proposed SEMEDA approach enhances the structure of the predicted segmentation masks by enforcing sharp boundaries and avoiding discontinuities inside objects, improving the segmentation performance. In addition, our semantic edge-aware loss can be integrated into any popular segmentation network without requiring any additional annotation and with negligible computational load, as compared to standard pixel-wise cross-entropy loss.
Tasks	Edge Detection, Semantic Segmentation
Published	2019-05-06
URL	https://arxiv.org/abs/1905.01892v1
PDF	https://arxiv.org/pdf/1905.01892v1.pdf
PWC	https://paperswithcode.com/paper/190501892
Repo
Framework

Learning abstract perceptual notions: the example of space


Title	Learning abstract perceptual notions: the example of space
Authors	Alexander V. Terekhov, J. Kevin O’Regan
Abstract	Humans are extremely swift learners. We are able to grasp highly abstract notions, whether they come from art perception or pure mathematics. Current machine learning techniques demonstrate astonishing results in extracting patterns in information. Yet the abstract notions we possess are more than just statistical patterns in the incoming information. Sensorimotor theory suggests that they represent functions, laws, describing how the information can be transformed, or, in other words, they represent the statistics of sensorimotor changes rather than sensory inputs themselves. The aim of our work is to suggest a way for machine learning and sensorimotor theory to benefit from each other so as to pave the way toward new horizons in learning. We show in this study that a highly abstract notion, that of space, can be seen as a collection of laws of transformations of sensory information and that these laws could in theory be learned by a naive agent. As an illustration we do a one-dimensional simulation in which an agent extracts spatial knowledge in the form of internalized (“sensible”) rigid displacements. The agent uses them to encode its own displacements in a way which is isometrically related to external space. Though the algorithm allowing acquisition of rigid displacements is designed ad hoc, we believe it can stimulate the development of unsupervised learning techniques leading to similar results.
Tasks
Published	2019-07-24
URL	https://arxiv.org/abs/1907.12430v1
PDF	https://arxiv.org/pdf/1907.12430v1.pdf
PWC	https://paperswithcode.com/paper/learning-abstract-perceptual-notions-the
Repo
Framework

Machine Learning to Predict Developmental Neurotoxicity with High-throughput Data from 2D Bio-engineered Tissues


Title	Machine Learning to Predict Developmental Neurotoxicity with High-throughput Data from 2D Bio-engineered Tissues
Authors	Finn Kuusisto, Vitor Santos Costa, Zhonggang Hou, James Thomson, David Page, Ron Stewart
Abstract	There is a growing need for fast and accurate methods for testing developmental neurotoxicity across several chemical exposure sources. Current approaches, such as in vivo animal studies, and assays of animal and human primary cell cultures, suffer from challenges related to time, cost, and applicability to human physiology. We previously demonstrated success employing machine learning to predict developmental neurotoxicity using gene expression data collected from human 3D tissue models exposed to various compounds. The 3D model is biologically similar to developing neural structures, but its complexity necessitates extensive expertise and effort to employ. By instead focusing solely on constructing an assay of developmental neurotoxicity, we propose that a simpler 2D tissue model may prove sufficient. We thus compare the accuracy of predictive models trained on data from a 2D tissue model with those trained on data from a 3D tissue model, and find the 2D model to be substantially more accurate. Furthermore, we find the 2D model to be more robust under stringent gene set selection, whereas the 3D model suffers substantial accuracy degradation. While both approaches have advantages and disadvantages, we propose that our described 2D approach could be a valuable tool for decision makers when prioritizing neurotoxicity screening.
Tasks
Published	2019-05-06
URL	https://arxiv.org/abs/1905.02121v1
PDF	https://arxiv.org/pdf/1905.02121v1.pdf
PWC	https://paperswithcode.com/paper/machine-learning-to-predict-developmental
Repo
Framework

Real-time Background-aware 3D Textureless Object Pose Estimation


Title	Real-time Background-aware 3D Textureless Object Pose Estimation
Authors	Mang Shao, Danhang Tang, Tae-Kyun Kim
Abstract	In this work, we present a modified fuzzy decision forest for real-time 3D object pose estimation based on typical template representation. We employ an extra preemptive background rejector node in the decision forest framework to terminate the examination of background locations as early as possible, resulting in a significant improvement in efficiency. Our approach is also scalable to large datasets since the tree structure naturally provides a logarithmic time complexity in the number of objects. Finally, we further reduce the validation stage with a fast breadth-first scheme. The results show that our approach outperforms the state of the art in efficiency while maintaining a comparable accuracy.
Tasks	Pose Estimation
Published	2019-07-22
URL	https://arxiv.org/abs/1907.09128v1
PDF	https://arxiv.org/pdf/1907.09128v1.pdf
PWC	https://paperswithcode.com/paper/real-time-background-aware-3d-textureless
Repo
Framework