Paper Group ANR 372
Discovering General-Purpose Active Learning Strategies. Assessing fish abundance from underwater video using deep neural networks. Biologically-plausible learning algorithms can scale to large datasets. Automatic Chord Recognition with Higher-Order Harmonic Language Modelling. Cross-domain Human Parsing via Adversarial Feature and Label Adaptation. …
Discovering General-Purpose Active Learning Strategies
Title | Discovering General-Purpose Active Learning Strategies |
Authors | Ksenia Konyushkova, Raphael Sznitman, Pascal Fua |
Abstract | We propose a general-purpose approach to discovering active learning (AL) strategies from data. These strategies are transferable from one domain to another and can be used in conjunction with many machine learning models. To this end, we formalize the annotation process as a Markov decision process, design universal state and action spaces, and introduce a new reward function that precisely models the AL objective of minimizing the annotation cost. We seek to find an optimal (non-myopic) AL strategy using reinforcement learning. We evaluate the learned strategies on multiple unrelated domains and show that they consistently outperform state-of-the-art baselines. |
Tasks | Active Learning |
Published | 2018-10-09 |
URL | http://arxiv.org/abs/1810.04114v2 |
http://arxiv.org/pdf/1810.04114v2.pdf | |
PWC | https://paperswithcode.com/paper/discovering-general-purpose-active-learning |
Repo | |
Framework | |
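As a hedged illustration of the MDP framing described in this entry (not the authors' learned strategy), the sketch below runs a pool-based annotation loop in which the state is the classifier's uncertainty over the unlabelled pool and the action selects the next point to query; the hand-coded uncertainty heuristic stands in for the reinforcement-learned policy, and the dataset and model are placeholders.

```python
# Hypothetical pool-based active-learning loop cast as an MDP-style interaction.
# The uncertainty heuristic below is a stand-in for the learned RL policy.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
labelled = [int(np.flatnonzero(y == c)[0]) for c in (0, 1)]   # one seed label per class
pool = [i for i in range(len(X)) if i not in labelled]

clf = LogisticRegression(max_iter=1000)
for step in range(20):                          # annotation "episode"
    clf.fit(X[labelled], y[labelled])
    proba = clf.predict_proba(X[pool])
    # State: per-sample uncertainty; action: query the most uncertain point.
    margins = np.abs(proba[:, 0] - proba[:, 1])
    action = pool[int(np.argmin(margins))]
    labelled.append(action)                     # oracle provides the label
    pool.remove(action)
    # A reward tied to annotation cost and accuracy gain would drive the RL policy.
print("labelled set size:", len(labelled))
```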
Assessing fish abundance from underwater video using deep neural networks
Title | Assessing fish abundance from underwater video using deep neural networks |
Authors | Ranju Mandal, Rod M. Connolly, Thomas A. Schlacherz, Bela Stantic |
Abstract | Uses of underwater videos to assess diversity and abundance of fish are being rapidly adopted by marine biologists. Manual processing of videos for quantification by human analysts is time and labour intensive. Automatic processing of videos can be employed to achieve the objectives in a cost- and time-efficient way. The aim is to build an accurate and reliable fish detection and recognition system, which is important for an autonomous robotic platform. However, there are many challenges involved in this task (e.g. complex background, deformation, low resolution and light propagation). Recent advances in deep neural networks have enabled object detection and recognition in real-time scenarios. An end-to-end deep learning-based architecture is introduced which outperforms state-of-the-art methods and is the first of its kind for the fish assessment task. A Region Proposal Network (RPN), introduced by the object detector termed Faster R-CNN, was combined with three classification networks for detection and recognition of fish species obtained from Remote Underwater Video Stations (RUVS). An accuracy of 82.4% (mAP) obtained from the experiments is much higher than that of previously proposed methods. |
Tasks | Fish Detection, Object Detection |
Published | 2018-07-16 |
URL | http://arxiv.org/abs/1807.05838v1 |
http://arxiv.org/pdf/1807.05838v1.pdf | |
PWC | https://paperswithcode.com/paper/assessing-fish-abundance-from-underwater |
Repo | |
Framework | |
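The entry above combines a Faster R-CNN Region Proposal Network with classification networks. A minimal, hedged sketch of that kind of detection pipeline using torchvision's stock Faster R-CNN (not the authors' RUVS-trained model) might look like this; the input frame, score threshold, and pretrained COCO weights are placeholders.

```python
# Off-the-shelf Faster R-CNN (RPN + classification head) from torchvision, standing in
# for the paper's fish detector; class set and confidence threshold are assumptions.
import torch
import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

frame = torch.rand(3, 480, 640)                 # placeholder for an underwater video frame
with torch.no_grad():
    detections = model([frame])[0]

keep = detections["scores"] > 0.5               # assumed confidence threshold
print(detections["boxes"][keep], detections["labels"][keep])
```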
Biologically-plausible learning algorithms can scale to large datasets
Title | Biologically-plausible learning algorithms can scale to large datasets |
Authors | Will Xiao, Honglin Chen, Qianli Liao, Tomaso Poggio |
Abstract | The backpropagation (BP) algorithm is often thought to be biologically implausible in the brain. One of the main reasons is that BP requires symmetric weight matrices in the feedforward and feedback pathways. To address this “weight transport problem” (Grossberg, 1987), two more biologically plausible algorithms, proposed by Liao et al. (2016) and Lillicrap et al. (2016), relax BP’s weight symmetry requirements and demonstrate learning capabilities comparable to those of BP on small datasets. However, a recent study by Bartunov et al. (2018) evaluates variants of target-propagation (TP) and feedback alignment (FA) on the MNIST, CIFAR, and ImageNet datasets, and finds that although many of the proposed algorithms perform well on MNIST and CIFAR, they perform significantly worse than BP on ImageNet. Here, we additionally evaluate the sign-symmetry algorithm (Liao et al., 2016), which differs from both BP and FA in that the feedback and feedforward weights share signs but not magnitudes. We examine the performance of sign-symmetry and feedback alignment on the ImageNet and MS COCO datasets using different network architectures (ResNet-18 and AlexNet for ImageNet, RetinaNet for MS COCO). Surprisingly, networks trained with sign-symmetry can attain classification performance approaching that of BP-trained networks. These results complement the study by Bartunov et al. (2018), and establish a new benchmark for future biologically plausible learning algorithms on more difficult datasets and more complex architectures. |
Tasks | |
Published | 2018-11-08 |
URL | http://arxiv.org/abs/1811.03567v3 |
http://arxiv.org/pdf/1811.03567v3.pdf | |
PWC | https://paperswithcode.com/paper/biologically-plausible-learning-algorithms |
Repo | |
Framework | |
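Sign-symmetry replaces the feedback weights with the signs of the feedforward weights. A small, hedged sketch of that idea as a custom PyTorch autograd function (an illustration of the mechanism, not the authors' implementation):

```python
# Sketch of a sign-symmetric linear layer: the backward pass routes gradients through
# sign(W) instead of W itself, so feedback shares signs but not magnitudes.
import torch

class SignSymmetricLinear(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, weight):
        ctx.save_for_backward(x, weight)
        return x @ weight.t()

    @staticmethod
    def backward(ctx, grad_out):
        x, weight = ctx.saved_tensors
        feedback = weight.sign()                # feedback pathway uses signs only
        grad_x = grad_out @ feedback
        grad_w = grad_out.t() @ x               # weight update uses the true activations
        return grad_x, grad_w

x = torch.randn(4, 8, requires_grad=True)
w = torch.randn(16, 8, requires_grad=True)
SignSymmetricLinear.apply(x, w).sum().backward()
print(x.grad.shape, w.grad.shape)
```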
Automatic Chord Recognition with Higher-Order Harmonic Language Modelling
Title | Automatic Chord Recognition with Higher-Order Harmonic Language Modelling |
Authors | Filip Korzeniowski, Gerhard Widmer |
Abstract | Common temporal models for automatic chord recognition model chord changes on a frame-wise basis. Due to this, they are unable to capture musical knowledge about chord progressions. In this paper, we propose a temporal model that enables explicit modelling of chord changes and durations. We then apply N-gram models and a neural-network-based acoustic model within this framework, and evaluate the effect of model overconfidence. Our results show that model overconfidence plays only a minor role (but target smoothing still improves the acoustic model), and that stronger chord language models do improve recognition results; however, their effects are small compared to those in other domains. |
Tasks | Chord Recognition, Language Modelling |
Published | 2018-08-16 |
URL | http://arxiv.org/abs/1808.05341v1 |
http://arxiv.org/pdf/1808.05341v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-chord-recognition-with-higher-order |
Repo | |
Framework | |
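As a toy, hedged illustration of a harmonic language model over chord changes (not the paper's N-gram or acoustic models), the snippet below combines an add-one-smoothed bigram over chord symbols with made-up acoustic log-scores:

```python
# Toy bigram "harmonic language model" over chord symbols, fused with per-segment
# acoustic scores in log space; the progressions and probabilities are made up.
import math
from collections import defaultdict

progressions = [["C", "F", "G", "C"], ["C", "Am", "F", "G"], ["F", "G", "C", "C"]]
counts, totals = defaultdict(int), defaultdict(int)
for prog in progressions:
    for prev, nxt in zip(prog, prog[1:]):
        counts[(prev, nxt)] += 1
        totals[prev] += 1

def lm_logprob(prev, nxt, alpha=1.0, vocab=4):
    # add-one smoothed bigram probability of a chord change
    return math.log((counts[(prev, nxt)] + alpha) / (totals[prev] + alpha * vocab))

acoustic = {"F": math.log(0.5), "G": math.log(0.3), "Am": math.log(0.2)}  # fake scores
prev_chord = "C"
best = max(acoustic, key=lambda c: acoustic[c] + lm_logprob(prev_chord, c))
print("predicted next chord:", best)
```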
Cross-domain Human Parsing via Adversarial Feature and Label Adaptation
Title | Cross-domain Human Parsing via Adversarial Feature and Label Adaptation |
Authors | Si Liu, Yao Sun, Defa Zhu, Guanghui Ren, Yu Chen, Jiashi Feng, Jizhong Han |
Abstract | Human parsing has been extensively studied recently due to its wide applications in many important scenarios. Mainstream fashion parsing models focus on parsing high-resolution and clean images. However, directly applying the parsers trained on benchmarks to a particular application scenario in the wild, e.g., a canteen, airport or workplace, often gives unsatisfactory performance due to domain shift. In this paper, we explore a new and challenging cross-domain human parsing problem: taking the benchmark dataset with extensive pixel-wise labeling as the source domain, how to obtain a satisfactory parser on a new target domain without requiring any additional manual labeling? To this end, we propose a novel and efficient cross-domain human parsing model to bridge the cross-domain differences in terms of visual appearance and environmental conditions and fully exploit commonalities across domains. Our proposed model explicitly learns a feature compensation network, which is specialized for mitigating the cross-domain differences. A discriminative feature adversarial network is introduced to supervise the feature compensation to effectively reduce the discrepancy between feature distributions of two domains. Besides, our model also introduces a structured label adversarial network to guide the parsing results of the target domain to follow the high-order relationships of the structured labels shared across domains. The proposed framework is end-to-end trainable, practical and scalable in real applications. Extensive experiments are conducted where the LIP dataset is the source domain and four different datasets, including surveillance videos, movies and runway shows, are evaluated as target domains. The results consistently confirm the data efficiency and performance advantages of the proposed method for the cross-domain human parsing problem. |
Tasks | Human Parsing |
Published | 2018-01-04 |
URL | http://arxiv.org/abs/1801.01260v2 |
http://arxiv.org/pdf/1801.01260v2.pdf | |
PWC | https://paperswithcode.com/paper/cross-domain-human-parsing-via-adversarial |
Repo | |
Framework | |
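A hedged PyTorch sketch of the adversarial feature-adaptation ingredient described in this entry; the layer sizes, the residual form of the compensation network, and the losses below are assumptions for illustration, not the paper's architecture.

```python
# Feature compensation network shifts target features; a domain discriminator supervises
# them toward the source feature distribution (GAN-style training, sketched only).
import torch
import torch.nn as nn

feat_dim = 64
compensate = nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.ReLU(), nn.Linear(feat_dim, feat_dim))
discriminator = nn.Sequential(nn.Linear(feat_dim, 32), nn.ReLU(), nn.Linear(32, 1))
bce = nn.BCEWithLogitsLoss()

source_feat = torch.randn(8, feat_dim)          # features from the labelled source domain
target_feat = torch.randn(8, feat_dim)          # features from the unlabelled target domain
adapted = target_feat + compensate(target_feat) # residual compensation (assumed form)

# Discriminator tries to tell domains apart; the compensation net is trained to fool it.
d_loss = bce(discriminator(source_feat), torch.ones(8, 1)) + \
         bce(discriminator(adapted.detach()), torch.zeros(8, 1))
g_loss = bce(discriminator(adapted), torch.ones(8, 1))
print(float(d_loss), float(g_loss))
```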
Nonnegative PARAFAC2: a flexible coupling approach
Title | Nonnegative PARAFAC2: a flexible coupling approach |
Authors | Jeremy E. Cohen, Rasmus Bro |
Abstract | Modeling variability in tensor decomposition methods is one of the challenges of source separation. One possible solution to account for variations from one data set to another, jointly analysed, is to resort to the PARAFAC2 model. However, so far imposing constraints on the mode with variability has not been possible. In the following manuscript, a relaxation of the PARAFAC2 model is introduced, that allows for imposing nonnegativity constraints on the varying mode. An algorithm to compute the proposed flexible PARAFAC2 model is derived, and its performance is studied on both synthetic and chemometrics data. |
Tasks | |
Published | 2018-02-14 |
URL | http://arxiv.org/abs/1802.05035v1 |
http://arxiv.org/pdf/1802.05035v1.pdf | |
PWC | https://paperswithcode.com/paper/nonnegative-parafac2-a-flexible-coupling |
Repo | |
Framework | |
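A hedged sketch, in our own notation rather than necessarily the paper's, of how the flexible coupling can be written: the hard PARAFAC2 coupling constraint is replaced by a penalty term, which makes it possible to impose nonnegativity directly on the varying factors.

```latex
% Each slab X_k is factored as X_k \approx B_k \,\mathrm{diag}(d_k)\, A^{\top}.
% The usual hard PARAFAC2 coupling B_k = P_k B^{*} is relaxed into a penalty so that
% B_k \ge 0 can be imposed on the varying mode:
\min_{\{B_k \ge 0\},\, A,\, \{d_k\},\, \{P_k\},\, B^{*}}
  \sum_{k} \left\| X_k - B_k \,\mathrm{diag}(d_k)\, A^{\top} \right\|_F^2
  \;+\; \mu \sum_{k} \left\| B_k - P_k B^{*} \right\|_F^2,
\qquad P_k^{\top} P_k = I .
```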
Handwritten Digit Recognition by Elastic Matching
Title | Handwritten Digit Recognition by Elastic Matching |
Authors | Sagnik Majumder, C. von der Malsburg, Aashish Richhariya, Surekha Bhanot |
Abstract | A simple model of MNIST handwritten digit recognition is presented here. The model is an adaptation of a previous theory of face recognition. It realizes translation and rotation invariance in a principled way instead of being based on extensive learning from large masses of sample data. The presented recognition rates fall short of other publications, but due to its inspectability and conceptual and numerical simplicity, our system commends itself as a basis for further development. |
Tasks | Face Recognition, Handwritten Digit Recognition |
Published | 2018-07-24 |
URL | http://arxiv.org/abs/1807.09324v1 |
http://arxiv.org/pdf/1807.09324v1.pdf | |
PWC | https://paperswithcode.com/paper/handwritten-digit-recognition-by-elastic |
Repo | |
Framework | |
2^B3^C: 2 Box 3 Crop of Facial Image for Gender Classification with Convolutional Networks
Title | 2^B3^C: 2 Box 3 Crop of Facial Image for Gender Classification with Convolutional Networks |
Authors | Vandit Gajjar |
Abstract | In this paper, we tackle the classification of gender in facial images with deep learning. Our convolutional neural networks (CNNs) use the VGG-16 architecture [1] and are pretrained on ImageNet for image classification. Our proposed method (2^B3^C) first detects the face in the image, increases the margin of the detected face by 50%, crops the face with a two-box, three-crop scheme (Left, Middle, and Right crop), and extracts the CNN predictions on the cropped schemes. The CNNs of our method are fine-tuned on Adience and LFW with gender annotations. We show the effectiveness of our method by achieving 90.8% classification accuracy on Adience and a competitive 95.3% classification accuracy on the LFW dataset. In addition, our gender classification system has a frame rate of 7-10 fps (frames per second) on a GPU, considering real-time scenarios. |
Tasks | Image Classification |
Published | 2018-03-05 |
URL | http://arxiv.org/abs/1803.02181v1 |
http://arxiv.org/pdf/1803.02181v1.pdf | |
PWC | https://paperswithcode.com/paper/2b3c-2-box-3-crop-of-facial-image-for-gender |
Repo | |
Framework | |
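A hedged sketch of the crop scheme described in this entry; the paper's exact box-expansion and crop geometry are not reproduced here, so the margin handling and crop widths below are assumptions for illustration only.

```python
# Expand a detected face box by 50%, then take left, middle, and right crops whose
# CNN predictions would be averaged; box format and crop widths are assumptions.
import numpy as np

def three_crops(image, box, margin=0.5):
    x0, y0, x1, y1 = box
    w, h = x1 - x0, y1 - y0
    # enlarge the detected box by `margin` overall, clipped to the image bounds
    X0 = max(0, int(x0 - margin * w / 2)); Y0 = max(0, int(y0 - margin * h / 2))
    X1 = min(image.shape[1], int(x1 + margin * w / 2)); Y1 = min(image.shape[0], int(y1 + margin * h / 2))
    big = image[Y0:Y1, X0:X1]
    W = big.shape[1]
    third = W // 3
    left = big[:, :2 * third]
    middle = big[:, third // 2: third // 2 + 2 * third]
    right = big[:, W - 2 * third:]
    return [left, middle, right]

image = np.zeros((200, 300, 3), dtype=np.uint8)
crops = three_crops(image, (100, 50, 180, 150))
print([c.shape for c in crops])
# predictions = np.mean([cnn(c) for c in crops], axis=0)   # assumed fusion step
```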
UAlacant machine translation quality estimation at WMT 2018: a simple approach using phrase tables and feed-forward neural networks
Title | UAlacant machine translation quality estimation at WMT 2018: a simple approach using phrase tables and feed-forward neural networks |
Authors | Miquel Esplà-Gomis, Felipe Sánchez-Martínez, Mikel L. Forcada |
Abstract | We describe the Universitat d’Alacant submissions to the word- and sentence-level machine translation (MT) quality estimation (QE) shared task at WMT 2018. Our approach to word-level MT QE builds on previous work to mark the words in the machine-translated sentence as \textit{OK} or \textit{BAD}, and is extended to determine whether a word or sequence of words needs to be inserted in the gap after each word. Our sentence-level submission simply uses the edit operations predicted by the word-level approach to approximate TER. The method presented ranked first in the sub-task of identifying insertions in gaps for three out of the six datasets, and second in the rest of them. |
Tasks | Machine Translation |
Published | 2018-11-06 |
URL | http://arxiv.org/abs/1811.02510v1 |
http://arxiv.org/pdf/1811.02510v1.pdf | |
PWC | https://paperswithcode.com/paper/ualacant-machine-translation-quality |
Repo | |
Framework | |
Autonomous Deep Learning: A Genetic DCNN Designer for Image Classification
Title | Autonomous Deep Learning: A Genetic DCNN Designer for Image Classification |
Authors | Benteng Ma, Yong Xia |
Abstract | Recent years have witnessed the breakthrough success of deep convolutional neural networks (DCNNs) in image classification and other vision applications. Although they free users from troublesome handcrafted feature extraction by providing a uniform feature extraction-classification framework, DCNNs still require a handcrafted design of their architectures. In this paper, we propose the genetic DCNN designer, an autonomous learning algorithm that can generate a DCNN architecture automatically based on the data available for a specific image classification problem. We first partition a DCNN into multiple stacked meta convolutional blocks and fully connected blocks, each containing the operations of convolution, pooling, full connection, batch normalization, activation and dropout, and thus convert the architecture into an integer vector. Then, we use refined evolutionary operations, including selection, mutation and crossover, to evolve a population of DCNN architectures. Our results on the MNIST, Fashion-MNIST, EMNIST-Digit, EMNIST-Letter, CIFAR10 and CIFAR100 datasets suggest that the proposed genetic DCNN designer is able to automatically produce DCNN architectures whose performance is comparable to, if not better than, that of state-of-the-art DCNN models. |
Tasks | Image Classification |
Published | 2018-07-01 |
URL | http://arxiv.org/abs/1807.00284v1 |
http://arxiv.org/pdf/1807.00284v1.pdf | |
PWC | https://paperswithcode.com/paper/autonomous-deep-learning-a-genetic-dcnn |
Repo | |
Framework | |
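A toy, hedged sketch of the search loop described above: an architecture is encoded as an integer vector and evolved with selection, crossover, and mutation. The genome contents and the fitness function here are stand-ins; in the paper, fitness would be validation accuracy of the trained DCNN.

```python
# Integer-vector genome (e.g., filters per conv block) evolved by a simple genetic loop.
import random

random.seed(0)

def random_genome(n_blocks=4):
    return [random.choice([16, 32, 64, 128]) for _ in range(n_blocks)]

def fitness(genome):
    return -abs(sum(genome) - 200)              # fake objective replacing val. accuracy

def crossover(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(genome, rate=0.25):
    return [random.choice([16, 32, 64, 128]) if random.random() < rate else g for g in genome]

population = [random_genome() for _ in range(10)]
for generation in range(20):
    population.sort(key=fitness, reverse=True)
    parents = population[:4]                    # selection
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(len(population) - len(parents))]
    population = parents + children
print("best genome:", max(population, key=fitness))
```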
Optimal Noise-Adding Mechanism in Additive Differential Privacy
Title | Optimal Noise-Adding Mechanism in Additive Differential Privacy |
Authors | Quan Geng, Wei Ding, Ruiqi Guo, Sanjiv Kumar |
Abstract | We derive the optimal $(0, \delta)$-differentially private query-output independent noise-adding mechanism for single real-valued query function under a general cost-minimization framework. Under a mild technical condition, we show that the optimal noise probability distribution is a uniform distribution with a probability mass at the origin. We explicitly derive the optimal noise distribution for general $\ell^p$ cost functions, including $\ell^1$ (for noise magnitude) and $\ell^2$ (for noise power) cost functions, and show that the probability concentration on the origin occurs when $\delta > \frac{p}{p+1}$. Our result demonstrates an improvement over the existing Gaussian mechanisms by a factor of two and three for $(0,\delta)$-differential privacy in the high privacy regime in the context of minimizing the noise magnitude and noise power, and the gain is more pronounced in the low privacy regime. Our result is consistent with the existing result for $(0,\delta)$-differential privacy in the discrete setting, and identifies a probability concentration phenomenon in the continuous setting. |
Tasks | |
Published | 2018-09-26 |
URL | http://arxiv.org/abs/1809.10224v2 |
http://arxiv.org/pdf/1809.10224v2.pdf | |
PWC | https://paperswithcode.com/paper/optimal-noise-adding-mechanism-in-additive |
Repo | |
Framework | |
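A hedged sketch of the distribution shape the abstract describes: a probability mass at the origin plus a uniform component. The mass p0 and half-width a below are placeholder values; the paper derives their optimal settings from delta and the chosen cost function.

```python
# Sampler for "uniform noise with a point mass at the origin"; parameters are assumptions.
import numpy as np

def sample_noise(n, p0=0.3, a=1.0, rng=np.random.default_rng(0)):
    at_origin = rng.random(n) < p0              # with probability p0, add no noise
    uniform = rng.uniform(-a, a, size=n)
    return np.where(at_origin, 0.0, uniform)

noise = sample_noise(10_000)
print("fraction exactly zero:", np.mean(noise == 0.0))
```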
Guided Image Inpainting: Replacing an Image Region by Pulling Content from Another Image
Title | Guided Image Inpainting: Replacing an Image Region by Pulling Content from Another Image |
Authors | Yinan Zhao, Brian Price, Scott Cohen, Danna Gurari |
Abstract | Deep generative models have shown success in automatically synthesizing missing image regions using surrounding context. However, users cannot directly decide what content to synthesize with such approaches. We propose an end-to-end network for image inpainting that uses a different image to guide the synthesis of new content to fill the hole. A key challenge addressed by our approach is synthesizing new content in regions where the guidance image and the context of the original image are inconsistent. We conduct four studies that demonstrate our results yield more realistic image inpainting results over seven baselines. |
Tasks | Image Inpainting |
Published | 2018-03-22 |
URL | http://arxiv.org/abs/1803.08435v1 |
http://arxiv.org/pdf/1803.08435v1.pdf | |
PWC | https://paperswithcode.com/paper/guided-image-inpainting-replacing-an-image |
Repo | |
Framework | |
Task Runtime Prediction in Scientific Workflows Using an Online Incremental Learning Approach
Title | Task Runtime Prediction in Scientific Workflows Using an Online Incremental Learning Approach |
Authors | Muhammad H. Hilman, Maria A. Rodriguez, Rajkumar Buyya |
Abstract | Many algorithms in workflow scheduling and resource provisioning rely on the performance estimation of tasks to produce a scheduling plan. A profiler that is capable of modeling the execution of tasks and predicting their runtime accurately, therefore, becomes an essential part of any Workflow Management System (WMS). With the emergence of multi-tenant Workflow as a Service (WaaS) platforms that use clouds for deploying scientific workflows, task runtime prediction becomes more challenging because it requires the processing of a significant amount of data in a near real-time scenario while dealing with the performance variability of cloud resources. Hence, relying on methods such as profiling tasks’ execution data using basic statistical descriptions (e.g., mean, standard deviation) or batch offline regression techniques to estimate the runtime may not be suitable for such environments. In this paper, we propose an online incremental learning approach to predict the runtime of tasks in scientific workflows in clouds. To improve the performance of the predictions, we harness fine-grained resource monitoring data in the form of time-series records of CPU utilization, memory usage, and I/O activities that reflect the unique characteristics of a task’s execution. We compare our solution to a state-of-the-art approach that exploits the resource monitoring data based on a regression machine learning technique. In our experiments, the proposed strategy improves performance, in terms of error, by up to 29.89% compared to the state-of-the-art solution. |
Tasks | Time Series |
Published | 2018-10-10 |
URL | http://arxiv.org/abs/1810.04329v1 |
http://arxiv.org/pdf/1810.04329v1.pdf | |
PWC | https://paperswithcode.com/paper/task-runtime-prediction-in-scientific |
Repo | |
Framework | |
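A hedged sketch of online incremental runtime prediction in the spirit of this entry: a linear model is updated with partial_fit as each task completes, using coarse resource-usage features. The feature choice and the synthetic data stream are assumptions for illustration.

```python
# Online incremental regression over a stream of completed tasks (synthetic data).
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
scaler, model = StandardScaler(), SGDRegressor()

for batch in range(50):                          # stream of completed tasks
    X = rng.random((8, 3))                       # e.g., CPU util., memory, I/O summaries
    runtime = 10 * X[:, 0] + 5 * X[:, 1] + rng.normal(0, 0.1, 8)
    scaler.partial_fit(X)
    model.partial_fit(scaler.transform(X), runtime)   # incremental update, no retraining

X_new = scaler.transform(rng.random((1, 3)))
print("predicted runtime:", model.predict(X_new))
```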
Lazy-CFR: fast and near optimal regret minimization for extensive games with imperfect information
Title | Lazy-CFR: fast and near optimal regret minimization for extensive games with imperfect information |
Authors | Yichi Zhou, Tongzheng Ren, Jialian Li, Dong Yan, Jun Zhu |
Abstract | Counterfactual regret minimization (CFR) is the most popular algorithm for solving two-player zero-sum extensive games with imperfect information and achieves state-of-the-art performance in practice. However, the performance of CFR is not fully understood, since empirical results on the regret are much better than the upper bound proved in \cite{zinkevich2008regret}. Another issue is that CFR has to traverse the whole game tree in each round, which is time-consuming in large-scale games. In this paper, we present a novel technique, lazy update, which can avoid traversing the whole game tree in CFR, as well as a novel analysis of the regret of CFR with lazy update. Our analysis can also be applied to the vanilla CFR, resulting in a much tighter regret bound than that in \cite{zinkevich2008regret}. Inspired by lazy update, we further present a novel CFR variant, named Lazy-CFR. Compared to traversing $O(\mathcal{I})$ information sets in vanilla CFR, Lazy-CFR needs only to traverse $O(\sqrt{\mathcal{I}})$ information sets per round while keeping the regret bound almost the same, where $\mathcal{I}$ is the class of all information sets. As a result, Lazy-CFR shows a better convergence result than vanilla CFR. Experimental results consistently show that Lazy-CFR outperforms the vanilla CFR significantly. |
Tasks | |
Published | 2018-10-10 |
URL | http://arxiv.org/abs/1810.04433v3 |
http://arxiv.org/pdf/1810.04433v3.pdf | |
PWC | https://paperswithcode.com/paper/lazy-cfr-fast-and-near-optimal-regret |
Repo | |
Framework | |
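CFR applies regret matching at each information set as it traverses the tree. As a minimal, hedged illustration of that building block (not the Lazy-CFR algorithm itself), the snippet below runs regret matching for a single information set with made-up counterfactual action values.

```python
# Regret matching for one information set: play each action in proportion to its
# accumulated positive regret.
import numpy as np

def regret_matching(cum_regret):
    positive = np.maximum(cum_regret, 0.0)
    total = positive.sum()
    return positive / total if total > 0 else np.full_like(cum_regret, 1.0 / len(cum_regret))

cum_regret = np.zeros(3)                        # three available actions
action_values = np.array([1.0, 0.2, -0.5])      # made-up counterfactual values
for t in range(100):
    strategy = regret_matching(cum_regret)
    ev = strategy @ action_values
    cum_regret += action_values - ev            # accumulate instantaneous regrets
print("final strategy:", regret_matching(cum_regret))
```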
Dropout Regularization in Hierarchical Mixture of Experts
Title | Dropout Regularization in Hierarchical Mixture of Experts |
Authors | Ozan İrsoy, Ethem Alpaydın |
Abstract | Dropout is a very effective method for preventing overfitting and has become the go-to regularizer for multi-layer neural networks in recent years. A hierarchical mixture of experts is a hierarchically gated model that defines a soft decision tree where leaves correspond to experts and decision nodes correspond to gating models that softly choose between their children; as such, the model defines a soft hierarchical partitioning of the input space. In this work, we propose a variant of dropout for hierarchical mixtures of experts that is faithful to the tree hierarchy defined by the model, as opposed to the flat, unit-wise independent application of dropout one has with multi-layer perceptrons. We show that on synthetic regression data and on the MNIST and CIFAR-10 datasets, our proposed dropout mechanism prevents overfitting on trees with many levels, improving generalization and providing smoother fits. |
Tasks | |
Published | 2018-12-25 |
URL | http://arxiv.org/abs/1812.10158v1 |
http://arxiv.org/pdf/1812.10158v1.pdf | |
PWC | https://paperswithcode.com/paper/dropout-regularization-in-hierarchical |
Repo | |
Framework | |
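A toy, hedged sketch of the tree-structured dropout idea in this entry: at a gating node, an entire child subtree can be dropped, rather than dropping units independently as in standard dropout. The gating function, experts, and depth are all illustrative assumptions, not the paper's model.

```python
# Recursive soft decision tree where dropout removes whole subtrees at gating nodes.
import numpy as np

rng = np.random.default_rng(0)

def hme_predict(x, depth=3, p_drop=0.3, train=True):
    if depth == 0:                                # leaf = expert (random linear model here)
        return float(rng.normal(size=x.shape) @ x)
    if train and rng.random() < p_drop:
        # drop one child subtree; the surviving child carries the full weight
        return hme_predict(x, depth - 1, p_drop, train)
    gate = 1.0 / (1.0 + np.exp(-rng.normal()))    # soft gating value in (0, 1)
    left = hme_predict(x, depth - 1, p_drop, train)
    right = hme_predict(x, depth - 1, p_drop, train)
    return gate * left + (1.0 - gate) * right

print(hme_predict(np.ones(4)))
```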