October 19, 2019

Paper Group ANR 372

Discovering General-Purpose Active Learning Strategies. Assessing fish abundance from underwater video using deep neural networks. Biologically-plausible learning algorithms can scale to large datasets. Automatic Chord Recognition with Higher-Order Harmonic Language Modelling. Cross-domain Human Parsing via Adversarial Feature and Label Adaptation. …

Discovering General-Purpose Active Learning Strategies


Title	Discovering General-Purpose Active Learning Strategies
Authors	Ksenia Konyushkova, Raphael Sznitman, Pascal Fua
Abstract	We propose a general-purpose approach to discovering active learning (AL) strategies from data. These strategies are transferable from one domain to another and can be used in conjunction with many machine learning models. To this end, we formalize the annotation process as a Markov decision process, design universal state and action spaces and introduce a new reward function that precisely models the AL objective of minimizing the annotation cost. We seek to find an optimal (non-myopic) AL strategy using reinforcement learning. We evaluate the learned strategies on multiple unrelated domains and show that they consistently outperform state-of-the-art baselines.
Tasks	Active Learning
Published	2018-10-09
URL	http://arxiv.org/abs/1810.04114v2
PDF	http://arxiv.org/pdf/1810.04114v2.pdf
PWC	https://paperswithcode.com/paper/discovering-general-purpose-active-learning
Repo
Framework

Assessing fish abundance from underwater video using deep neural networks


Title	Assessing fish abundance from underwater video using deep neural networks
Authors	Ranju Mandal, Rod M. Connolly, Thomas A. Schlacherz, Bela Stantic
Abstract	Uses of underwater videos to assess diversity and abundance of fish are being rapidly adopted by marine biologists. Manual processing of videos for quantification by human analysts is time- and labour-intensive. Automatic processing of videos can be employed to achieve the objectives in a cost- and time-efficient way. The aim is to build an accurate and reliable fish detection and recognition system, which is important for an autonomous robotic platform. However, there are many challenges involved in this task (e.g. complex background, deformation, low resolution and light propagation). Recent advances in deep neural networks have led to the development of object detection and recognition in real-time scenarios. An end-to-end deep learning-based architecture is introduced which outperforms state-of-the-art methods and is the first of its kind on the fish assessment task. A Region Proposal Network (RPN) introduced by an object detector termed Faster R-CNN was combined with three classification networks for detection and recognition of fish species obtained from Remote Underwater Video Stations (RUVS). The accuracy of 82.4% (mAP) obtained from the experiments is much higher than that of previously proposed methods.
Tasks	Fish Detection, Object Detection
Published	2018-07-16
URL	http://arxiv.org/abs/1807.05838v1
PDF	http://arxiv.org/pdf/1807.05838v1.pdf
PWC	https://paperswithcode.com/paper/assessing-fish-abundance-from-underwater
Repo
Framework

Biologically-plausible learning algorithms can scale to large datasets


Title	Biologically-plausible learning algorithms can scale to large datasets
Authors	Will Xiao, Honglin Chen, Qianli Liao, Tomaso Poggio
Abstract	The backpropagation (BP) algorithm is often thought to be biologically implausible in the brain. One of the main reasons is that BP requires symmetric weight matrices in the feedforward and feedback pathways. To address this “weight transport problem” (Grossberg, 1987), two more biologically plausible algorithms, proposed by Liao et al. (2016) and Lillicrap et al. (2016), relax BP’s weight symmetry requirements and demonstrate learning capabilities comparable to those of BP on small datasets. However, a recent study by Bartunov et al. (2018) evaluates variants of target-propagation (TP) and feedback alignment (FA) on the MNIST, CIFAR, and ImageNet datasets, and finds that although many of the proposed algorithms perform well on MNIST and CIFAR, they perform significantly worse than BP on ImageNet. Here, we additionally evaluate the sign-symmetry algorithm (Liao et al., 2016), which differs from both BP and FA in that the feedback and feedforward weights share signs but not magnitudes. We examine the performance of sign-symmetry and feedback alignment on ImageNet and MS COCO datasets using different network architectures (ResNet-18 and AlexNet for ImageNet, RetinaNet for MS COCO). Surprisingly, networks trained with sign-symmetry can attain classification performance approaching that of BP-trained networks. These results complement the study by Bartunov et al. (2018), and establish a new benchmark for future biologically plausible learning algorithms on more difficult datasets and more complex architectures.
Tasks
Published	2018-11-08
URL	http://arxiv.org/abs/1811.03567v3
PDF	http://arxiv.org/pdf/1811.03567v3.pdf
PWC	https://paperswithcode.com/paper/biologically-plausible-learning-algorithms
Repo
Framework
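
To make the three feedback rules named in this abstract concrete, here is a minimal sketch for a single linear layer y = W x. It is an illustration only and not the authors' code: the function name `backward_input_grad`, the mode strings, and the toy weights are assumptions made for this example. Backpropagation sends the error back through the transpose of W, sign-symmetry uses only the signs of W, and feedback alignment uses a fixed matrix unrelated to W.

```python
def transpose(matrix):
    """Return the transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*matrix)]


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def sign(x):
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def backward_input_grad(weights, grad_out, mode, fixed_feedback=None):
    """Propagate grad_out (dL/dy) back to the input of the layer y = W x.

    mode is a hypothetical switch for this sketch:
      "bp"                 feedback weights equal W (exact gradient)
      "sign_symmetry"      feedback weights equal sign(W)
      "feedback_alignment" feedback weights are a fixed matrix given by the caller
    """
    if mode == "bp":
        feedback = weights
    elif mode == "sign_symmetry":
        feedback = [[sign(w) for w in row] for row in weights]
    elif mode == "feedback_alignment":
        if fixed_feedback is None:
            raise ValueError("feedback_alignment needs fixed_feedback")
        feedback = fixed_feedback
    else:
        raise ValueError(f"unknown mode: {mode}")
    return matvec(transpose(feedback), grad_out)


# Toy layer with 3 inputs and 2 outputs (values chosen arbitrarily).
W = [[0.5, -2.0, 0.1],
     [-0.3, 0.7, -1.5]]
grad_y = [1.0, -1.0]

print(backward_input_grad(W, grad_y, "bp"))
print(backward_input_grad(W, grad_y, "sign_symmetry"))  # [2.0, -2.0, 2.0]
```

In this toy case the sign-symmetry signal has the same sign as the backpropagation gradient in every coordinate while the magnitudes differ, which is the property the abstract describes.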

Automatic Chord Recognition with Higher-Order Harmonic Language Modelling


Title	Automatic Chord Recognition with Higher-Order Harmonic Language Modelling
Authors	Filip Korzeniowski, Gerhard Widmer
Abstract	Common temporal models for automatic chord recognition model chord changes on a frame-wise basis. Due to this fact, they are unable to capture musical knowledge about chord progressions. In this paper, we propose a temporal model that enables explicit modelling of chord changes and durations. We then apply N-gram models and a neural-network-based acoustic model within this framework, and evaluate the effect of model overconfidence. Our results show that model overconfidence plays only a minor role (but target smoothing still improves the acoustic model), and that stronger chord language models do improve recognition results; however, their effects are small compared to other domains.
Tasks	Chord Recognition, Language Modelling
Published	2018-08-16
URL	http://arxiv.org/abs/1808.05341v1
PDF	http://arxiv.org/pdf/1808.05341v1.pdf
PWC	https://paperswithcode.com/paper/automatic-chord-recognition-with-higher-order
Repo
Framework
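
The abstract applies N-gram models as chord language models. As a rough illustration of the simplest case, the sketch below estimates a bigram model over chord symbols with additive smoothing. It does not reproduce the paper's higher-order models or its treatment of chord durations; the function names, the smoothing constant `alpha`, and the toy progressions are assumptions for this example.

```python
from collections import Counter, defaultdict


def train_bigram(sequences):
    """Count how often each chord is followed by each other chord."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts


def bigram_prob(counts, prev, nxt, vocab_size, alpha=1.0):
    """Estimate P(nxt | prev) with additive (add-alpha) smoothing."""
    following = counts.get(prev, Counter())
    total = sum(following.values())
    return (following[nxt] + alpha) / (total + alpha * vocab_size)


# Toy chord-change sequences (one symbol per chord change, not per frame).
progressions = [["C", "G", "C", "F"], ["C", "G"]]
model = train_bigram(progressions)

# "C" is followed by "G" twice and by "F" once; vocabulary is {C, F, G}.
print(bigram_prob(model, "C", "G", vocab_size=3))  # 0.5
```

Working on chord-change sequences rather than audio frames follows the abstract's point that frame-wise models mostly see self-transitions and therefore learn little about progressions.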

Cross-domain Human Parsing via Adversarial Feature and Label Adaptation


Title	Cross-domain Human Parsing via Adversarial Feature and Label Adaptation
Authors	Si Liu, Yao Sun, Defa Zhu, Guanghui Ren, Yu Chen, Jiashi Feng, Jizhong Han
Abstract	Human parsing has been extensively studied recently due to its wide applications in many important scenarios. Mainstream fashion parsing models focus on parsing high-resolution and clean images. However, directly applying the parsers trained on benchmarks to a particular application scenario in the wild, e.g., a canteen, airport or workplace, often gives unsatisfactory performance due to domain shift. In this paper, we explore a new and challenging cross-domain human parsing problem: taking the benchmark dataset with extensive pixel-wise labeling as the source domain, how to obtain a satisfactory parser on a new target domain without requiring any additional manual labeling? To this end, we propose a novel and efficient cross-domain human parsing model to bridge the cross-domain differences in terms of visual appearance and environment conditions and fully exploit commonalities across domains. Our proposed model explicitly learns a feature compensation network, which is specialized for mitigating the cross-domain differences. A discriminative feature adversarial network is introduced to supervise the feature compensation to effectively reduce the discrepancy between feature distributions of two domains. Besides, our model also introduces a structured label adversarial network to guide the parsing results of the target domain to follow the high-order relationships of the structured labels shared across domains. The proposed framework is end-to-end trainable, practical and scalable in real applications. Extensive experiments are conducted where the LIP dataset is the source domain and 4 different datasets including surveillance videos, movies and runway shows are evaluated as target domains. The results consistently confirm data efficiency and performance advantages of the proposed method for the cross-domain human parsing problem.
Tasks	Human Parsing
Published	2018-01-04
URL	http://arxiv.org/abs/1801.01260v2
PDF	http://arxiv.org/pdf/1801.01260v2.pdf
PWC	https://paperswithcode.com/paper/cross-domain-human-parsing-via-adversarial
Repo
Framework

Nonnegative PARAFAC2: a flexible coupling approach


Title	Nonnegative PARAFAC2: a flexible coupling approach
Authors	Jeremy E. Cohen, Rasmus Bro
Abstract	Modeling variability in tensor decomposition methods is one of the challenges of source separation. One possible solution to account for variations from one data set to another, jointly analysed, is to resort to the PARAFAC2 model. However, so far imposing constraints on the mode with variability has not been possible. In the following manuscript, a relaxation of the PARAFAC2 model is introduced, that allows for imposing nonnegativity constraints on the varying mode. An algorithm to compute the proposed flexible PARAFAC2 model is derived, and its performance is studied on both synthetic and chemometrics data.
Tasks
Published	2018-02-14
URL	http://arxiv.org/abs/1802.05035v1
PDF	http://arxiv.org/pdf/1802.05035v1.pdf
PWC	https://paperswithcode.com/paper/nonnegative-parafac2-a-flexible-coupling
Repo
Framework

Handwritten Digit Recognition by Elastic Matching


Title	Handwritten Digit Recognition by Elastic Matching
Authors	Sagnik Majumder, C. von der Malsburg, Aashish Richhariya, Surekha Bhanot
Abstract	A simple model of MNIST handwritten digit recognition is presented here. The model is an adaptation of a previous theory of face recognition. It realizes translation and rotation invariance in a principled way instead of being based on extensive learning from large masses of sample data. The presented recognition rates fall short of other publications, but due to its inspectability and conceptual and numerical simplicity, our system commends itself as a basis for further development.
Tasks	Face Recognition, Handwritten Digit Recognition
Published	2018-07-24
URL	http://arxiv.org/abs/1807.09324v1
PDF	http://arxiv.org/pdf/1807.09324v1.pdf
PWC	https://paperswithcode.com/paper/handwritten-digit-recognition-by-elastic
Repo
Framework

2^B3^C: 2 Box 3 Crop of Facial Image for Gender Classification with Convolutional Networks


Title	2^B3^C: 2 Box 3 Crop of Facial Image for Gender Classification with Convolutional Networks
Authors	Vandit Gajjar
Abstract	In this paper, we tackle the classification of gender in facial images with deep learning. Our convolutional neural networks (CNN) use the VGG-16 architecture [1] and are pretrained on ImageNet for image classification. Our proposed method (2^B3^C) first detects the face in the facial image, increases the margin of the detected face by 50%, crops the face using two boxes and three crop schemes (Left, Middle, and Right crop), and extracts the CNN predictions on the resulting crops. The CNNs of our method are fine-tuned on the Adience and LFW datasets with gender annotations. We show the effectiveness of our method by achieving 90.8% classification accuracy on Adience and a competitive 95.3% classification accuracy on the LFW dataset. In addition, to check the true ability of our method in real-time scenarios, we measure the frame rate of our gender classification system, which is 7-10 fps (frames per second) on a GPU.
Tasks	Image Classification
Published	2018-03-05
URL	http://arxiv.org/abs/1803.02181v1
PDF	http://arxiv.org/pdf/1803.02181v1.pdf
PWC	https://paperswithcode.com/paper/2b3c-2-box-3-crop-of-facial-image-for-gender
Repo
Framework

UAlacant machine translation quality estimation at WMT 2018: a simple approach using phrase tables and feed-forward neural networks


Title	UAlacant machine translation quality estimation at WMT 2018: a simple approach using phrase tables and feed-forward neural networks
Authors	Miquel Esplà-Gomis, Felipe Sánchez-Martínez, Mikel L. Forcada
Abstract	We describe the Universitat d’Alacant submissions to the word- and sentence-level machine translation (MT) quality estimation (QE) shared task at WMT 2018. Our approach to word-level MT QE builds on previous work to mark the words in the machine-translated sentence as OK or BAD, and is extended to determine if a word or sequence of words needs to be inserted in the gap after each word. Our sentence-level submission simply uses the edit operations predicted by the word-level approach to approximate TER. The method presented ranked first in the sub-task of identifying insertions in gaps for three out of the six datasets, and second in the rest of them.
Tasks	Machine Translation
Published	2018-11-06
URL	http://arxiv.org/abs/1811.02510v1
PDF	http://arxiv.org/pdf/1811.02510v1.pdf
PWC	https://paperswithcode.com/paper/ualacant-machine-translation-quality
Repo
Framework

Autonomous Deep Learning: A Genetic DCNN Designer for Image Classification


Title	Autonomous Deep Learning: A Genetic DCNN Designer for Image Classification
Authors	Benteng Ma, Yong Xia
Abstract	Recent years have witnessed the breakthrough success of deep convolutional neural networks (DCNNs) in image classification and other vision applications. Although freeing users from the troublesome handcrafted feature extraction by providing a uniform feature extraction-classification framework, DCNNs still require a handcrafted design of their architectures. In this paper, we propose the genetic DCNN designer, an autonomous learning algorithm that can generate a DCNN architecture automatically based on the data available for a specific image classification problem. We first partition a DCNN into multiple stacked meta convolutional blocks and fully connected blocks, each containing the operations of convolution, pooling, full connection, batch normalization, activation and dropout, and thus convert the architecture into an integer vector. Then, we use refined evolutionary operations, including selection, mutation and crossover, to evolve a population of DCNN architectures. Our results on the MNIST, Fashion-MNIST, EMNIST-Digit, EMNIST-Letter, CIFAR10 and CIFAR100 datasets suggest that the proposed genetic DCNN designer is able to automatically produce DCNN architectures whose performance is comparable to, if not better than, that of state-of-the-art DCNN models.
Tasks	Image Classification
Published	2018-07-01
URL	http://arxiv.org/abs/1807.00284v1
PDF	http://arxiv.org/pdf/1807.00284v1.pdf
PWC	https://paperswithcode.com/paper/autonomous-deep-learning-a-genetic-dcnn
Repo
Framework
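
The abstract encodes each architecture as an integer vector and evolves a population with selection, mutation and crossover. The sketch below shows generic textbook versions of two of those operators on integer genomes. The paper's "refined" operators and its actual gene layout are not given in the abstract, so the single-point crossover, the per-gene mutation rate, and the meaning of each gene here are assumptions for illustration.

```python
import random


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: head of parent_a joined to tail of parent_b."""
    point = rng.randrange(1, len(parent_a))  # cut strictly inside the genome
    return parent_a[:point] + parent_b[point:]


def mutate(genome, choices_per_gene, rate, rng):
    """Resample each gene with probability `rate` from its allowed range."""
    return [
        rng.randrange(choices_per_gene[i]) if rng.random() < rate else gene
        for i, gene in enumerate(genome)
    ]


# Hypothetical encoding: each gene indexes one design choice of a block,
# e.g. number of filters, kernel size, pooling type, activation.
choices = [4, 3, 2, 3]
rng = random.Random(0)
parent_a = [0, 1, 0, 2]
parent_b = [3, 2, 1, 0]

child = crossover(parent_a, parent_b, rng)
child = mutate(child, choices, rate=0.1, rng=rng)
print(child)
```

A full search loop would decode each vector into a network, train it briefly, and use validation accuracy as the fitness for selection; that part depends on details the abstract does not state.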

Optimal Noise-Adding Mechanism in Additive Differential Privacy


Title	Optimal Noise-Adding Mechanism in Additive Differential Privacy
Authors	Quan Geng, Wei Ding, Ruiqi Guo, Sanjiv Kumar
Abstract	We derive the optimal $(0, \delta)$-differentially private query-output independent noise-adding mechanism for single real-valued query function under a general cost-minimization framework. Under a mild technical condition, we show that the optimal noise probability distribution is a uniform distribution with a probability mass at the origin. We explicitly derive the optimal noise distribution for general $\ell^p$ cost functions, including $\ell^1$ (for noise magnitude) and $\ell^2$ (for noise power) cost functions, and show that the probability concentration on the origin occurs when $\delta > \frac{p}{p+1}$. Our result demonstrates an improvement over the existing Gaussian mechanisms by a factor of two and three for $(0,\delta)$-differential privacy in the high privacy regime in the context of minimizing the noise magnitude and noise power, and the gain is more pronounced in the low privacy regime. Our result is consistent with the existing result for $(0,\delta)$-differential privacy in the discrete setting, and identifies a probability concentration phenomenon in the continuous setting.
Tasks
Published	2018-09-26
URL	http://arxiv.org/abs/1809.10224v2
PDF	http://arxiv.org/pdf/1809.10224v2.pdf
PWC	https://paperswithcode.com/paper/optimal-noise-adding-mechanism-in-additive
Repo
Framework
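
The abstract states that the optimal noise distribution is uniform with an additional probability mass at the origin. The sketch below only illustrates the shape of such a distribution by sampling from it. The parameters `p_zero` and `half_width`, and the choice of an interval symmetric around zero, are assumptions for this example; the paper derives the actual values from the privacy level and the cost function, and that derivation is not reproduced here.

```python
import random


def sample_noise(p_zero, half_width, rng):
    """Return 0 with probability p_zero, else a uniform draw on [-half_width, half_width]."""
    if rng.random() < p_zero:
        return 0.0
    return rng.uniform(-half_width, half_width)


def noisy_answer(true_value, p_zero, half_width, rng):
    """Add noise drawn independently of the query output to a real-valued answer."""
    return true_value + sample_noise(p_zero, half_width, rng)


rng = random.Random(0)
samples = [sample_noise(p_zero=0.3, half_width=2.0, rng=rng) for _ in range(10000)]
share_exact_zero = sum(1 for s in samples if s == 0.0) / len(samples)
print(round(share_exact_zero, 2))  # close to p_zero, apart from sampling error
```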

Guided Image Inpainting: Replacing an Image Region by Pulling Content from Another Image


Title	Guided Image Inpainting: Replacing an Image Region by Pulling Content from Another Image
Authors	Yinan Zhao, Brian Price, Scott Cohen, Danna Gurari
Abstract	Deep generative models have shown success in automatically synthesizing missing image regions using surrounding context. However, users cannot directly decide what content to synthesize with such approaches. We propose an end-to-end network for image inpainting that uses a different image to guide the synthesis of new content to fill the hole. A key challenge addressed by our approach is synthesizing new content in regions where the guidance image and the context of the original image are inconsistent. We conduct four studies that demonstrate that our approach yields more realistic image inpainting results than seven baselines.
Tasks	Image Inpainting
Published	2018-03-22
URL	http://arxiv.org/abs/1803.08435v1
PDF	http://arxiv.org/pdf/1803.08435v1.pdf
PWC	https://paperswithcode.com/paper/guided-image-inpainting-replacing-an-image
Repo
Framework

Task Runtime Prediction in Scientific Workflows Using an Online Incremental Learning Approach


Title	Task Runtime Prediction in Scientific Workflows Using an Online Incremental Learning Approach
Authors	Muhammad H. Hilman, Maria A. Rodriguez, Rajkumar Buyya
Abstract	Many algorithms in workflow scheduling and resource provisioning rely on the performance estimation of tasks to produce a scheduling plan. A profiler that is capable of modeling the execution of tasks and predicting their runtime accurately, therefore, becomes an essential part of any Workflow Management System (WMS). With the emergence of multi-tenant Workflow as a Service (WaaS) platforms that use clouds for deploying scientific workflows, task runtime prediction becomes more challenging because it requires the processing of a significant amount of data in a near real-time scenario while dealing with the performance variability of cloud resources. Hence, relying on methods such as profiling tasks’ execution data using basic statistical description (e.g., mean, standard deviation) or batch offline regression techniques to estimate the runtime may not be suitable for such environments. In this paper, we propose an online incremental learning approach to predict the runtime of tasks in scientific workflows in clouds. To improve the performance of the predictions, we harness fine-grained resource monitoring data in the form of time-series records of CPU utilization, memory usage, and I/O activities that reflect the unique characteristics of a task’s execution. We compare our solution to a state-of-the-art approach that exploits the resource monitoring data based on a regression machine learning technique. In our experiments, the proposed strategy improves the performance, in terms of the error, by up to 29.89% compared to the state-of-the-art solutions.
Tasks	Time Series
Published	2018-10-10
URL	http://arxiv.org/abs/1810.04329v1
PDF	http://arxiv.org/pdf/1810.04329v1.pdf
PWC	https://paperswithcode.com/paper/task-runtime-prediction-in-scientific
Repo
Framework

Lazy-CFR: fast and near optimal regret minimization for extensive games with imperfect information


Title	Lazy-CFR: fast and near optimal regret minimization for extensive games with imperfect information
Authors	Yichi Zhou, Tongzheng Ren, Jialian Li, Dong Yan, Jun Zhu
Abstract	Counterfactual regret minimization (CFR) is the most popular algorithm for solving two-player zero-sum extensive games with imperfect information and achieves state-of-the-art performance in practice. However, the performance of CFR is not fully understood, since empirical results on the regret are much better than the upper bound proved by Zinkevich et al. (2008). Another issue is that CFR has to traverse the whole game tree in each round, which is time-consuming in large-scale games. In this paper, we present a novel technique, lazy update, which can avoid traversing the whole game tree in CFR, as well as a novel analysis on the regret of CFR with lazy update. Our analysis can also be applied to the vanilla CFR, resulting in a much tighter regret bound than that of Zinkevich et al. (2008). Inspired by lazy update, we further present a novel CFR variant, named Lazy-CFR. Compared to traversing $O(\mathcal{I})$ information sets in vanilla CFR, Lazy-CFR needs only to traverse $O(\sqrt{\mathcal{I}})$ information sets per round while keeping the regret bound almost the same, where $\mathcal{I}$ is the class of all information sets. As a result, Lazy-CFR shows a better convergence result compared with vanilla CFR. Experimental results consistently show that Lazy-CFR outperforms the vanilla CFR significantly.
Tasks
Published	2018-10-10
URL	http://arxiv.org/abs/1810.04433v3
PDF	http://arxiv.org/pdf/1810.04433v3.pdf
PWC	https://paperswithcode.com/paper/lazy-cfr-fast-and-near-optimal-regret
Repo
Framework
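
For readers unfamiliar with CFR, the sketch below shows regret matching, the rule vanilla CFR uses at each information set to turn cumulative regrets into a strategy. It illustrates only this standard building block and not the lazy update proposed in the paper; the function name and the example regrets are chosen for this illustration.

```python
def regret_matching(cumulative_regrets):
    """Return action probabilities proportional to positive cumulative regret.

    If no action has positive regret, fall back to the uniform strategy.
    """
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    n = len(cumulative_regrets)
    return [1.0 / n] * n


# Three actions at one information set; the second has negative regret.
print(regret_matching([2.0, -1.0, 2.0]))  # [0.5, 0.0, 0.5]
print(regret_matching([-1.0, -2.0]))      # [0.5, 0.5]
```

Vanilla CFR applies this update at every information set in every round, which is the per-round traversal cost that the abstract says Lazy-CFR reduces.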

Dropout Regularization in Hierarchical Mixture of Experts


Title	Dropout Regularization in Hierarchical Mixture of Experts
Authors	Ozan İrsoy, Ethem Alpaydın
Abstract	Dropout is a very effective method in preventing overfitting and has become the go-to regularizer for multi-layer neural networks in recent years. Hierarchical mixture of experts is a hierarchically gated model that defines a soft decision tree where leaves correspond to experts and decision nodes correspond to gating models that softly choose between their children, and as such, the model defines a soft hierarchical partitioning of the input space. In this work, we propose a variant of dropout for hierarchical mixture of experts that is faithful to the tree hierarchy defined by the model, as opposed to having a flat, unitwise independent application of dropout as one has with multi-layer perceptrons. We show that on synthetic regression data and on the MNIST and CIFAR-10 datasets, our proposed dropout mechanism prevents overfitting on trees with many levels, improving generalization and providing smoother fits.
Tasks
Published	2018-12-25
URL	http://arxiv.org/abs/1812.10158v1
PDF	http://arxiv.org/pdf/1812.10158v1.pdf
PWC	https://paperswithcode.com/paper/dropout-regularization-in-hierarchical
Repo
Framework
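
The soft decision tree described in this abstract can be sketched as a recursive forward pass in which each decision node blends its two children with a sigmoid gate. This is a minimal illustration of a hierarchical mixture of experts with constant leaf experts, and it deliberately omits the paper's dropout variant, whose details the abstract does not give. The dictionary layout (`w`, `b`, `left`, `right`, `value`) is an assumption for this example.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def hme_predict(node, x):
    """Evaluate a soft binary decision tree on input vector x.

    Leaves hold a constant expert output under "value". Decision nodes hold
    gating weights "w" and bias "b" and softly mix their two children.
    """
    if "value" in node:
        return node["value"]
    gate = sigmoid(sum(w * xi for w, xi in zip(node["w"], x)) + node["b"])
    left = hme_predict(node["left"], x)
    right = hme_predict(node["right"], x)
    return gate * left + (1.0 - gate) * right


# One decision node with two constant experts.
tree = {
    "w": [1.0],
    "b": 0.0,
    "left": {"value": 1.0},
    "right": {"value": -1.0},
}

print(hme_predict(tree, [0.0]))  # 0.0, both experts weighted equally
```

Because every input reaches every leaf with some weight, the output changes smoothly across the input space, which is the soft hierarchical partitioning the abstract refers to.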