Paper Group ANR 969
Investigation of Initialization Strategies for the Multiple Instance Adaptive Cosine Estimator
Title | Investigation of Initialization Strategies for the Multiple Instance Adaptive Cosine Estimator |
Authors | James Bocinsky, Connor McCurley, Daniel Shats, Alina Zare |
Abstract | Sensors which use electromagnetic induction (EMI) to excite a response in conducting bodies have long been investigated for subsurface explosive hazard detection. In particular, EMI sensors have been used to discriminate between different types of objects and to detect objects with low metal content. One successful, previously investigated approach is the Multiple Instance Adaptive Cosine Estimator (MI-ACE). In this paper, a number of new initialization techniques for MI-ACE are proposed and evaluated in terms of performance and speed. The cross-validated learned signatures, as well as learned background statistics, are used with the Adaptive Cosine Estimator (ACE) to generate confidence maps, which are clustered into alarms. Alarms are scored against ground truth, and the initialization approaches are compared. |
Tasks | |
Published | 2019-04-30 |
URL | http://arxiv.org/abs/1904.13197v1 |
http://arxiv.org/pdf/1904.13197v1.pdf | |
PWC | https://paperswithcode.com/paper/investigation-of-initialization-strategies |
Repo | |
Framework | |
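For context, the Adaptive Cosine Estimator that MI-ACE builds on scores a sample by the squared cosine between it and a target signature in background-whitened space. Below is a minimal NumPy sketch of that statistic; the signature `s` and the background statistics here are random placeholders, not the quantities learned by MI-ACE.

```python
import numpy as np

def ace_confidence(X, s, mu_b, cov_b):
    """Adaptive Cosine Estimator: squared cosine similarity between a
    target signature and each sample in background-whitened space.

    X     : (n, d) array of samples, one per row
    s     : (d,) target signature
    mu_b  : (d,) background mean
    cov_b : (d, d) background covariance
    """
    inv_cov = np.linalg.inv(cov_b)
    Xc = X - mu_b                      # centre samples on the background
    num = (Xc @ inv_cov @ s) ** 2      # (s^T C^-1 x)^2 per sample
    den = (s @ inv_cov @ s) * np.einsum('ij,jk,ik->i', Xc, inv_cov, Xc)
    return num / den

# Toy usage with random background statistics (illustrative only).
rng = np.random.default_rng(0)
d = 8
bg = rng.normal(size=(500, d))
mu_b, cov_b = bg.mean(axis=0), np.cov(bg, rowvar=False)
s = rng.normal(size=d)
print(ace_confidence(rng.normal(size=(3, d)), s, mu_b, cov_b))
```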
k-Relevance Vectors for Pattern Classification
Title | k-Relevance Vectors for Pattern Classification |
Authors | Peyman Hosseinzadeh Kassani, Sara Hosseinzadeh Kassani |
Abstract | This study combines two different learning paradigms: the k-nearest neighbor (k-NN) rule, a memory-based learning paradigm, and relevance vector machines (RVM), a statistical learning paradigm. The combination is performed in kernel space and is called k-relevance vectors (k-RV). The purpose is to improve the performance of the k-NN rule. The proposed model significantly prunes irrelevant attributes. We also introduce a new parameter, responsible for early stopping of the RVM iterations, and show that it improves the classification accuracy of k-RV. Extensive experiments are conducted on several classification datasets from the University of California Irvine (UCI) repository and two real datasets from the computer vision domain. The performance of k-RV is highly competitive with several state-of-the-art methods in terms of classification accuracy. |
Tasks | |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08528v1 |
https://arxiv.org/pdf/1909.08528v1.pdf | |
PWC | https://paperswithcode.com/paper/k-relevance-vectors-for-pattern |
Repo | |
Framework | |
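Since the combination is performed in kernel space, the k-NN half of k-RV can be illustrated by computing neighbour distances through a kernel, using d²(x, z) = k(x,x) − 2k(x,z) + k(z,z). The sketch below uses an RBF kernel and only illustrates that idea; the RVM component, attribute pruning, and the early-stopping parameter are not reproduced here.

```python
import numpy as np

def rbf(A, B, gamma=0.5):
    """RBF kernel matrix between rows of A and rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def kernel_knn_predict(X_train, y_train, X_test, k=3, gamma=0.5):
    """k-NN with the kernel-induced distance
    d^2(x, z) = k(x,x) - 2 k(x,z) + k(z,z)."""
    K_xx = np.ones(len(X_test))           # k(x,x) = 1 for the RBF kernel
    K_zz = np.ones(len(X_train))
    K_xz = rbf(X_test, X_train, gamma)
    d2 = K_xx[:, None] - 2 * K_xz + K_zz[None, :]
    nn = np.argsort(d2, axis=1)[:, :k]    # k nearest neighbours per test point
    votes = y_train[nn]
    # Majority vote among the k neighbours.
    return np.array([np.bincount(v).argmax() for v in votes])

# Toy usage on two Gaussian blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(3, 1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
print(kernel_knn_predict(X, y, np.array([[0., 0.], [3., 3.]])))
```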
Cross-lingual Visual Verb Sense Disambiguation
Title | Cross-lingual Visual Verb Sense Disambiguation |
Authors | Spandana Gella, Desmond Elliott, Frank Keller |
Abstract | Recent work has shown that visual context improves cross-lingual sense disambiguation for nouns. We extend this line of work to the more challenging task of cross-lingual verb sense disambiguation, introducing the MultiSense dataset of 9,504 images annotated with English, German, and Spanish verbs. Each image in MultiSense is annotated with an English verb and its translation in German or Spanish. We show that cross-lingual verb sense disambiguation models benefit from visual context, compared to unimodal baselines. We also show that the verb sense predicted by our best disambiguation model can improve the results of a text-only machine translation system when used for a multimodal translation task. |
Tasks | Machine Translation |
Published | 2019-04-10 |
URL | http://arxiv.org/abs/1904.05092v2 |
http://arxiv.org/pdf/1904.05092v2.pdf | |
PWC | https://paperswithcode.com/paper/cross-lingual-visual-verb-sense |
Repo | |
Framework | |
AttoNets: Compact and Efficient Deep Neural Networks for the Edge via Human-Machine Collaborative Design
Title | AttoNets: Compact and Efficient Deep Neural Networks for the Edge via Human-Machine Collaborative Design |
Authors | Alexander Wong, Zhong Qiu Lin, Brendan Chwyl |
Abstract | While deep neural networks have achieved state-of-the-art performance across a large number of complex tasks, it remains a big challenge to deploy such networks for practical, on-device edge scenarios such as mobile devices, consumer devices, drones, and vehicles. In this study, we explore in depth a human-machine collaborative design approach for creating highly efficient deep neural networks, through a synergy between principled network design prototyping and machine-driven design exploration. The efficacy of human-machine collaborative design is demonstrated through the creation of AttoNets, a family of highly efficient deep neural networks for on-device edge deep learning. Each AttoNet possesses a human-specified network-level macro-architecture comprising custom modules with unique machine-designed module-level macro-architecture and micro-architecture designs, all driven by human-specified design requirements. Experimental results for the task of object recognition showed that the AttoNets created via human-machine collaborative design have significantly fewer parameters and lower computational cost than state-of-the-art networks designed for efficiency, while achieving noticeably higher accuracy (the smallest AttoNet achieves ~1.8% higher accuracy while requiring ~10x fewer multiply-add operations and parameters than MobileNet-V1). Furthermore, the efficacy of the AttoNets is demonstrated for instance-level object segmentation and object detection, where an AttoNet-based Mask R-CNN network was constructed with significantly fewer parameters and lower computational cost (~5x fewer multiply-add operations and ~2x fewer parameters) than a ResNet-50 based Mask R-CNN network. |
Tasks | Object Detection, Object Recognition, Semantic Segmentation |
Published | 2019-03-18 |
URL | http://arxiv.org/abs/1903.07209v2 |
http://arxiv.org/pdf/1903.07209v2.pdf | |
PWC | https://paperswithcode.com/paper/attonets-compact-and-efficient-deep-neural |
Repo | |
Framework | |
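AttoNets' modules are machine-designed, so they cannot be reproduced from the abstract; the block below is only a generic stand-in showing the kind of factorised building block efficiency-driven architectures rely on: a depthwise-separable convolution module in PyTorch, which cuts multiply-adds and parameters relative to a dense 3x3 convolution. Layer sizes are arbitrary.

```python
import torch
import torch.nn as nn

class CompactBlock(nn.Module):
    """A depthwise-separable convolution block: a depthwise 3x3 conv
    followed by a pointwise 1x1 conv. This factorisation (as in
    MobileNet) reduces multiply-adds and parameters roughly 8-9x
    versus a dense 3x3 conv; AttoNets' actual machine-designed
    modules differ in their exact structure."""

    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(in_ch)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.act(self.bn1(self.depthwise(x)))
        return self.act(self.bn2(self.pointwise(x)))

# Toy usage: one block on a 32x32 feature map.
block = CompactBlock(16, 32, stride=2)
print(block(torch.randn(1, 16, 32, 32)).shape)  # torch.Size([1, 32, 16, 16])
```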
Image Captioning with Sparse Recurrent Neural Network
Title | Image Captioning with Sparse Recurrent Neural Network |
Authors | Jia Huei Tan, Chee Seng Chan, Joon Huang Chuah |
Abstract | Recurrent Neural Networks (RNNs) have been widely used to tackle a wide variety of language generation problems and are capable of attaining state-of-the-art (SOTA) performance. However, despite these impressive results, the large number of parameters in RNN models makes deployment to mobile and embedded devices infeasible. Driven by this problem, many works have proposed pruning methods to reduce the size of RNN models. In this work, we propose an end-to-end pruning method for image captioning models equipped with visual attention. Our proposed method is able to achieve sparsity levels up to 97.5% without significant performance loss relative to the baseline (~2% loss at 40x compression after fine-tuning). Our method is also simple to use and tune, facilitating faster development times for neural network practitioners. We perform extensive experiments on the popular MS-COCO dataset to empirically validate the efficacy of our proposed method. |
Tasks | Image Captioning, Text Generation |
Published | 2019-08-28 |
URL | https://arxiv.org/abs/1908.10797v2 |
https://arxiv.org/pdf/1908.10797v2.pdf | |
PWC | https://paperswithcode.com/paper/image-captioning-with-sparse-recurrent-neural |
Repo | |
Framework | |
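The paper's end-to-end pruning method is specific to attention-equipped captioning models; as a generic stand-in for the basic mechanism, here is a one-shot magnitude-pruning sketch in NumPy that zeroes the smallest-magnitude weights to reach a target sparsity. The 97.5% target below simply echoes the abstract's headline number.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude entries so that roughly a
    `sparsity` fraction of the weights is zero (ties at the threshold
    may prune slightly more). Returns pruned weights and the binary
    mask (1 = kept), which fine-tuning would then hold fixed."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)           # number of weights to drop
    if k == 0:
        return weights.copy(), np.ones_like(weights)
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = (np.abs(weights) > threshold).astype(weights.dtype)
    return weights * mask, mask

# Toy usage: prune a random "recurrent" weight matrix to 97.5% sparsity.
rng = np.random.default_rng(2)
W = rng.normal(size=(256, 256))
W_pruned, mask = magnitude_prune(W, 0.975)
print(f"sparsity achieved: {1 - mask.mean():.4f}")
```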
Optimizing Controller Placement for Software-Defined Networks
Title | Optimizing Controller Placement for Software-Defined Networks |
Authors | Victoria Huang, Gang Chen, Qiang Fu, Elliott Wen |
Abstract | The controller placement problem (CPP) is a key issue for Software-Defined Networking (SDN) with distributed controller architectures. The problem aims to determine a suitable number of controllers, deployed in important locations, so as to optimize overall network performance. Existing literature on the CPP accounts for communication delay but assumes that the influence of controller workload distribution on network performance is negligible. In this paper, we tackle a CPP that simultaneously considers the communication delay, the control plane utilization, and the controller workload distribution. For this reason, our CPP is intrinsically different from, and clearly more difficult than, previously studied CPPs, which are already NP-hard. To tackle this challenging problem, we develop a new algorithm that seamlessly integrates a genetic algorithm (GA) and the gradient descent (GD) optimization method. In particular, the GA is used to search for suitable CPP solutions, and the quality of each solution is then evaluated through GD. Simulation results on two representative network scenarios (small-scale and large-scale) show that our algorithm can effectively strike a trade-off between control plane utilization and network response time. |
Tasks | |
Published | 2019-02-14 |
URL | http://arxiv.org/abs/1902.09451v1 |
http://arxiv.org/pdf/1902.09451v1.pdf | |
PWC | https://paperswithcode.com/paper/optimizing-controller-placement-for-software |
Repo | |
Framework | |
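As a hedged sketch of the GA half of the proposed GA+GD hybrid, the loop below evolves k controller locations on a random toy topology to minimise average node-to-controller delay. The fitness here is plain distance-based delay; the paper's GD-based evaluation of control plane utilization and workload distribution is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(3)
N, K, POP, GENS = 50, 4, 30, 100                 # nodes, controllers, GA sizes
coords = rng.uniform(0, 100, size=(N, 2))        # node positions (toy topology)
dist = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)

def fitness(placement):
    """Average delay: each node attaches to its nearest controller."""
    return dist[:, placement].min(axis=1).mean()

def mutate(placement):
    """Swap one controller location for a random unused node."""
    p = placement.copy()
    p[rng.integers(K)] = rng.choice(np.setdiff1d(np.arange(N), p))
    return p

def crossover(a, b):
    """Child draws K distinct locations from the union of two parents."""
    return rng.choice(np.union1d(a, b), size=K, replace=False)

population = [rng.choice(N, size=K, replace=False) for _ in range(POP)]
for _ in range(GENS):
    population.sort(key=fitness)
    elite = population[:POP // 2]                # truncation selection
    children = [mutate(crossover(elite[rng.integers(len(elite))],
                                 elite[rng.integers(len(elite))]))
                for _ in range(POP - len(elite))]
    population = elite + children

best = min(population, key=fitness)
print("controllers at nodes", sorted(best.tolist()),
      "avg delay", round(fitness(best), 2))
```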
Joint DNN-Based Multichannel Reduction of Acoustic Echo, Reverberation and Noise
Title | Joint DNN-Based Multichannel Reduction of Acoustic Echo, Reverberation and Noise |
Authors | Guillaume Carbajal, Romain Serizel, Emmanuel Vincent, Eric Humbert |
Abstract | We consider the problem of simultaneous reduction of acoustic echo, reverberation and noise. In real scenarios, these distortion sources may occur simultaneously and reducing them implies combining the corresponding distortion-specific filters. As these filters interact with each other, they must be jointly optimized. We propose to model the target and residual signals after linear echo cancellation and dereverberation using a multichannel Gaussian modeling framework and to jointly represent their spectra by means of a neural network. We develop an iterative block-coordinate ascent algorithm to update all the filters. We evaluate our system on real recordings of acoustic echo, reverberation and noise acquired with a smart speaker in various situations. The proposed approach outperforms in terms of overall distortion a cascade of the individual approaches and a joint reduction approach which does not rely on a spectral model of the target and residual signals. |
Tasks | |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.08934v2 |
https://arxiv.org/pdf/1911.08934v2.pdf | |
PWC | https://paperswithcode.com/paper/joint-dnn-based-multichannel-reduction-of |
Repo | |
Framework | |
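The paper's joint block-coordinate ascent over echo, dereverberation, and noise filters is too involved to reproduce from the abstract; as a sketch of the standard multichannel building block such systems rest on, here is a per-frequency-bin multichannel Wiener filter computed from target and residual spatial covariance matrices, which are assumed known here.

```python
import numpy as np

def mwf(phi_s, phi_n, ref=0):
    """Multichannel Wiener filter for one frequency bin.
    phi_s, phi_n : (M, M) target / residual spatial covariance matrices.
    Returns w (M,) such that the target estimate is w.conj() @ x."""
    phi_x = phi_s + phi_n                 # covariance of the observed mixture
    return np.linalg.solve(phi_x, phi_s[:, ref])

# Toy usage: 4 mics, rank-1 target with a random steering vector.
rng = np.random.default_rng(4)
M = 4
a = rng.normal(size=M) + 1j * rng.normal(size=M)   # steering vector
phi_s = np.outer(a, a.conj())                      # rank-1 target covariance
phi_n = 0.1 * np.eye(M)                            # white residual covariance
w = mwf(phi_s, phi_n)
x = a + 0.1 * (rng.normal(size=M) + 1j * rng.normal(size=M))
print(abs(w.conj() @ x), abs(a[0]))   # estimate vs. true target at ref mic
```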
Subword-Level Language Identification for Intra-Word Code-Switching
Title | Subword-Level Language Identification for Intra-Word Code-Switching |
Authors | Manuel Mager, Özlem Çetinoğlu, Katharina Kann |
Abstract | Language identification for code-switching (CS), the phenomenon of alternating between two or more languages in conversations, has traditionally been approached under the assumption of a single language per token. However, if at least one language is morphologically rich, a large number of words can be composed of morphemes from more than one language (intra-word CS). In this paper, we extend the language identification task to the subword-level, such that it includes splitting mixed words while tagging each part with a language ID. We further propose a model for this task, which is based on a segmental recurrent neural network. In experiments on a new Spanish–Wixarika dataset and on an adapted German–Turkish dataset, our proposed model performs slightly better than or roughly on par with our best baseline, respectively. Considering only mixed words, however, it strongly outperforms all baselines. |
Tasks | Language Identification |
Published | 2019-04-03 |
URL | http://arxiv.org/abs/1904.01989v1 |
http://arxiv.org/pdf/1904.01989v1.pdf | |
PWC | https://paperswithcode.com/paper/subword-level-language-identification-for |
Repo | |
Framework | |
Does Data Augmentation Lead to Positive Margin?
Title | Does Data Augmentation Lead to Positive Margin? |
Authors | Shashank Rajput, Zhili Feng, Zachary Charles, Po-Ling Loh, Dimitris Papailiopoulos |
Abstract | Data augmentation (DA) is commonly used during model training, as it significantly improves test error and model robustness. DA artificially expands the training set by applying random noise, rotations, crops, or even adversarial perturbations to the input data. Although DA is widely used, its capacity to provably improve robustness is not fully understood. In this work, we analyze the robustness that DA begets by quantifying the margin that DA enforces on empirical risk minimizers. We first focus on linear separators, and then a class of nonlinear models whose labeling is constant within small convex hulls of data points. We present lower bounds on the number of augmented data points required for non-zero margin, and show that commonly used DA techniques may only introduce significant margin after adding exponentially many points to the data set. |
Tasks | Data Augmentation |
Published | 2019-05-08 |
URL | https://arxiv.org/abs/1905.03177v1 |
https://arxiv.org/pdf/1905.03177v1.pdf | |
PWC | https://paperswithcode.com/paper/does-data-augmentation-lead-to-positive |
Repo | |
Framework | |
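To make the studied quantity concrete, the toy sketch below trains a bias-free hinge-loss minimiser with and without rotation augmentation and reports the geometric margin min_i y_i(w·x_i)/||w|| on the original points. It only illustrates the concept being analysed; the paper's lower bounds are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(5)

def train_linear(X, y, epochs=200, lr=0.1):
    """Minimise hinge loss with subgradient descent (bias-free)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        margins = y * (X @ w)
        # Subgradient: average of -y_i x_i over margin violators.
        grad = -(X * y[:, None])[margins < 1].sum(axis=0) / len(X)
        w -= lr * grad
    return w

def geometric_margin(X, y, w):
    return (y * (X @ w)).min() / np.linalg.norm(w)

def rotate(X, max_deg=15):
    """Random small rotations about the origin as augmentation."""
    th = np.deg2rad(rng.uniform(-max_deg, max_deg, size=len(X)))
    c, s = np.cos(th), np.sin(th)
    return np.stack([c * X[:, 0] - s * X[:, 1],
                     s * X[:, 0] + c * X[:, 1]], axis=1)

# Two separable blobs.
X = np.vstack([rng.normal([2, 2], 0.3, (30, 2)),
               rng.normal([-2, -2], 0.3, (30, 2))])
y = np.array([1] * 30 + [-1] * 30)
X_aug = np.vstack([X] + [rotate(X) for _ in range(10)])
y_aug = np.tile(y, 11)

for name, (Xt, yt) in [("plain", (X, y)), ("augmented", (X_aug, y_aug))]:
    w = train_linear(Xt, yt)
    print(name, "margin on original data:",
          round(geometric_margin(X, y, w), 3))
```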
Exponential Slowdown for Larger Populations: The $(μ+1)$-EA on Monotone Functions
Title | Exponential Slowdown for Larger Populations: The $(μ+1)$-EA on Monotone Functions |
Authors | Johannes Lengler, Xun Zou |
Abstract | Pseudo-Boolean monotone functions are unimodal functions which are trivial to optimize for some hillclimbers, but are challenging for a surprising number of evolutionary algorithms (EAs). A general trend is that EAs are efficient if parameters like the mutation rate are set conservatively, but may need exponential time otherwise. In particular, it was known that the $(1+1)$-EA and the $(1+\lambda)$-EA can optimize every monotone function in pseudolinear time if the mutation rate is $c/n$ for some $c<1$, but they need exponential time for some monotone functions for $c>2.2$. The second part of the statement was also known for the $(\mu+1)$-EA. In this paper we show that the first statement does not apply to the $(\mu+1)$-EA. More precisely, we prove that for every constant $c>0$ there is a constant integer $\mu_0$ such that the $(\mu+1)$-EA with mutation rate $c/n$ and population size $\mu_0\le\mu\le n$ needs superpolynomial time to optimize some monotone functions. Thus, increasing the population size by just a constant has devastating effects on the performance. This is in stark contrast to many other benchmark functions on which increasing the population size either increases the performance significantly, or affects performance mildly. The reason why larger populations are harmful lies in the fact that larger populations may temporarily decrease selective pressure on parts of the population. This allows unfavorable mutations to accumulate in single individuals and their descendants. If the population moves sufficiently fast through the search space, such unfavorable descendants can become ancestors of future generations, and the bad mutations are preserved. Remarkably, this effect only occurs if the population renews itself sufficiently fast, which can only happen far away from the optimum. This is counter-intuitive since usually optimization gets harder as we approach the optimum. |
Tasks | |
Published | 2019-07-30 |
URL | https://arxiv.org/abs/1907.12821v1 |
https://arxiv.org/pdf/1907.12821v1.pdf | |
PWC | https://paperswithcode.com/paper/exponential-slowdown-for-larger-populations |
Repo | |
Framework | |
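For concreteness, here is a minimal (μ+1)-EA with standard bit mutation at rate c/n, run on OneMax as an easy monotone example. Note that the paper's exponential slowdown concerns specially constructed monotone functions; OneMax is used here only to show the algorithm's mechanics.

```python
import numpy as np

rng = np.random.default_rng(6)

def one_max(x):
    """A simple monotone pseudo-Boolean function."""
    return int(x.sum())

def mu_plus_one_ea(f, n=100, mu=10, c=1.0, max_iters=200_000):
    """(mu+1)-EA: pick a uniform parent, flip each bit independently
    with probability c/n, then delete a worst individual among the
    mu+1 candidates (elitist survival selection)."""
    population = [rng.integers(0, 2, n) for _ in range(mu)]
    for t in range(max_iters):
        parent = population[rng.integers(mu)]
        flips = rng.random(n) < c / n
        child = np.where(flips, 1 - parent, parent)
        population.append(child)
        worst = min(range(len(population)), key=lambda i: f(population[i]))
        population.pop(worst)
        if max(f(x) for x in population) == n:  # OneMax optimum reached
            return t + 1
    return max_iters

print("iterations to reach the optimum:", mu_plus_one_ea(one_max))
```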
Finite-State Extreme Effect Variable
Title | Finite-State Extreme Effect Variable |
Authors | Alexey Drutsa |
Abstract | We generalize to the finite-state case the notion of the extreme effect variable $Y$ that accumulates all the effect of a variant variable $V$ observed in changes of another variable $X$. We conduct theoretical analysis and turn the problem of finding an effect variable into a problem of simultaneous decomposition of a set of distributions. The states of the extreme effect variable, on the one hand, are minimally affected by the variant variable $V$ and, on the other hand, are extremely different with respect to the observable variable $X$. We apply our technique to online evaluation of a web search engine through A/B testing and show its utility. |
Tasks | |
Published | 2019-12-24 |
URL | https://arxiv.org/abs/1912.13377v1 |
https://arxiv.org/pdf/1912.13377v1.pdf | |
PWC | https://paperswithcode.com/paper/finite-state-extreme-effect-variable |
Repo | |
Framework | |
Cross-Modal Subspace Learning with Scheduled Adaptive Margin Constraints
Title | Cross-Modal Subspace Learning with Scheduled Adaptive Margin Constraints |
Authors | David Semedo, João Magalhães |
Abstract | Cross-modal embeddings, between textual and visual modalities, aim to organise multimodal instances by their semantic correlations. State-of-the-art approaches use maximum-margin methods, based on the hinge loss, to enforce a constant margin m that separates projections of multimodal instances from different categories. In this paper, we propose a novel scheduled adaptive maximum-margin (SAM) formulation that infers triplet-specific constraints during training, thereby organising instances by adaptively enforcing inter-category and inter-modality correlations. This is supported by a scheduled adaptive margin function that is smoothly activated, replacing a static margin with an adaptively inferred one that reflects triplet-specific semantic correlations while accounting for the incremental learning behaviour of neural networks, so as to encourage the formation of category clusters. Experiments on widely used datasets show that our model improves upon state-of-the-art approaches, achieving a relative improvement of up to ~12.5% over the second-best method, confirming the effectiveness of our scheduled adaptive margin formulation. |
Tasks | |
Published | 2019-09-30 |
URL | https://arxiv.org/abs/1909.13733v1 |
https://arxiv.org/pdf/1909.13733v1.pdf | |
PWC | https://paperswithcode.com/paper/cross-modal-subspace-learning-with-scheduled |
Repo | |
Framework | |
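The adaptively inferred, triplet-specific margins are the paper's contribution and are not reproduced here; the NumPy sketch below only shows the general shape of the idea: a triplet hinge loss whose margin is a base value scaled by a smooth schedule over training and a per-triplet semantic dissimilarity. Both the logistic schedule and the dissimilarity factor are illustrative stand-ins.

```python
import numpy as np

def scheduled_adaptive_margin(base_m, epoch, total_epochs, dissim):
    """Margin = base * smooth schedule * triplet-specific factor.
    `dissim` in [0, 1]: semantically distant triplets get a larger
    margin. The actual SAM function is inferred from data; this
    logistic ramp is only an illustrative stand-in."""
    schedule = 1.0 / (1.0 + np.exp(-10 * (epoch / total_epochs - 0.5)))
    return base_m * (0.5 + dissim) * schedule

def triplet_loss(anchor, pos, neg, margin):
    """Standard hinge-based triplet loss with per-triplet margins."""
    d_pos = np.linalg.norm(anchor - pos, axis=1)
    d_neg = np.linalg.norm(anchor - neg, axis=1)
    return np.maximum(0.0, d_pos - d_neg + margin).mean()

# Toy usage: random embeddings; margins grow as training progresses.
rng = np.random.default_rng(7)
a, p, n = (rng.normal(size=(32, 64)) for _ in range(3))
dissim = rng.uniform(size=32)            # stand-in for category distance
for epoch in (1, 25, 50):
    m = scheduled_adaptive_margin(0.2, epoch, 50, dissim)
    print(f"epoch {epoch:2d} loss {triplet_loss(a, p, n, m):.3f}")
```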
A Cross-Modal Image Fusion Theory Guided by Human Visual Characteristics
Title | A Cross-Modal Image Fusion Theory Guided by Human Visual Characteristics |
Authors | Aiqing Fang, Xinbo Zhao, Yanning Zhang |
Abstract | Feature selection, nonlinear combination, and the multi-task auxiliary learning mechanism of the human visual perception system play an important role in real-world scenarios, yet image fusion theory grounded in these characteristics of human visual perception has received little attention. Inspired by them, we propose a robust multi-task auxiliary learning optimization theory for image fusion. First, we combine a channel attention model with a nonlinear convolutional neural network to select features and fuse nonlinear features. We then analyze the impact of existing image fusion losses on fusion quality and establish a multi-loss function model for an unsupervised learning network. Second, targeting the multi-task auxiliary learning mechanism of the human visual perception system, we study the influence of this mechanism on the image fusion task, building on the single-task multi-loss network model. By simulating these three characteristics of the human visual perception system, the fused image becomes more consistent with the way the human brain fuses images. Finally, to verify the superiority of our algorithm, we carry out experiments on a combined vision system image dataset and extend the algorithm to public infrared-visible and multi-focus image datasets. The experimental results demonstrate the superiority of our fusion theory over the state of the art in generality and robustness. |
Tasks | Auxiliary Learning, Feature Selection |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.08577v2 |
https://arxiv.org/pdf/1912.08577v2.pdf | |
PWC | https://paperswithcode.com/paper/a-cross-modal-image-fusion-theory-guided-by |
Repo | |
Framework | |
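As a hedged illustration of the channel-attention component the abstract mentions, here is a squeeze-and-excitation-style block in PyTorch; the paper's exact attention model, nonlinear fusion network, and multi-loss setup are not reproduced.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation-style channel attention: global average
    pooling -> bottleneck MLP -> sigmoid gate that reweights channels.
    Used here as a generic stand-in for the paper's attention model."""

    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))          # squeeze: (b, c)
        return x * w.view(b, c, 1, 1)            # excite: reweight channels

# Toy usage: attend over the channel concatenation of two modalities.
att = ChannelAttention(32)
ir, vis = torch.randn(1, 16, 64, 64), torch.randn(1, 16, 64, 64)
fused = att(torch.cat([ir, vis], dim=1))
print(fused.shape)  # torch.Size([1, 32, 64, 64])
```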
Robust Group Synchronization via Cycle-Edge Message Passing
Title | Robust Group Synchronization via Cycle-Edge Message Passing |
Authors | Gilad Lerman, Yunpeng Shi |
Abstract | We propose a general framework for group synchronization with adversarial corruption and sufficiently small noise. Specifically, we apply a novel message passing procedure that uses cycle consistency information in order to estimate the corruption levels of group ratios and consequently infer the corrupted group ratios and solve the synchronization problem. We first explain why the group cycle consistency information is essential for effectively solving group synchronization problems. We then establish exact recovery and linear convergence guarantees for the proposed message passing procedure under a deterministic setting with adversarial corruption. These guarantees hold as long as the ratio of corrupted cycles per edge is bounded by a reasonable constant. We also establish the stability of the proposed procedure to sub-Gaussian noise. We further show that under a uniform corruption model, the recovery results are sharp in terms of an information-theoretic bound. |
Tasks | |
Published | 2019-12-24 |
URL | https://arxiv.org/abs/1912.11347v1 |
https://arxiv.org/pdf/1912.11347v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-group-synchronization-via-cycle-edge |
Repo | |
Framework | |
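A simplified sketch of the cycle-edge message passing idea, specialised to SO(2) (angular) synchronization with 3-cycles: each edge's corruption level is estimated from the inconsistencies of the triangles through it, with triangles reweighted by the current corruption estimates of their other two edges. The complete graph, the exponential reweighting, and the beta schedule below are illustrative choices, not the paper's exact procedure.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(8)

# Ground-truth angles on n nodes; ratios theta_ij = theta_i - theta_j,
# with a fraction of edges adversarially corrupted.
n, corrupt_frac = 12, 0.2
theta = rng.uniform(0, 2 * np.pi, n)
edges = list(combinations(range(n), 2))
ratio = {}
for (i, j) in edges:
    ratio[(i, j)] = (theta[i] - theta[j]) % (2 * np.pi)
    if rng.random() < corrupt_frac:
        ratio[(i, j)] = rng.uniform(0, 2 * np.pi)   # corrupted measurement

def wrap(a):
    """Distance of an angle from 0 on the circle, normalised to [0, 1]."""
    return abs((a + np.pi) % (2 * np.pi) - np.pi) / np.pi

def cycle_inconsistency(i, j, k):
    """Deviation of ratio_ij + ratio_jk + ratio_ki from the identity."""
    r = lambda a, b: ratio[(a, b)] if (a, b) in ratio else -ratio[(b, a)]
    return wrap(r(i, j) + r(j, k) + r(k, i))

# Initialise corruption estimates with unweighted cycle averages, then
# reweight cycles by the other two edges' estimates as beta grows.
s = {e: np.mean([cycle_inconsistency(e[0], e[1], k)
                 for k in range(n) if k not in e]) for e in edges}
for beta in (1, 4, 16, 64):
    s_new = {}
    for (i, j) in edges:
        ks = [k for k in range(n) if k != i and k != j]
        w = np.array([np.exp(-beta * (s[tuple(sorted((j, k)))]
                                      + s[tuple(sorted((i, k)))]))
                      for k in ks])
        d = np.array([cycle_inconsistency(i, j, k) for k in ks])
        s_new[(i, j)] = (w * d).sum() / w.sum()
    s = s_new

flagged = sorted(edges, key=lambda e: -s[e])[:int(corrupt_frac * len(edges))]
print("edges flagged as most corrupted:", flagged[:5])
```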
Integrative Generalized Convex Clustering Optimization and Feature Selection for Mixed Multi-View Data
Title | Integrative Generalized Convex Clustering Optimization and Feature Selection for Mixed Multi-View Data |
Authors | Minjie Wang, Genevera I. Allen |
Abstract | In mixed multi-view data, multiple sets of diverse features are measured on the same set of samples. By integrating all available data sources, we seek to discover common group structure among the samples that may be hidden in individualistic cluster analyses of a single data-view. While several techniques for such integrative clustering have been explored, we propose and develop a convex formalization that will inherit the strong statistical, mathematical and empirical properties of increasingly popular convex clustering methods. Specifically, our Integrative Generalized Convex Clustering Optimization (iGecco) method employs different convex distances, losses, or divergences for each of the different data views with a joint convex fusion penalty that leads to common groups. Additionally, integrating mixed multi-view data is often challenging when each data source is high-dimensional. To perform feature selection in such scenarios, we develop an adaptive shifted group-lasso penalty that selects features by shrinking them towards their loss-specific centers. Our so-called iGecco+ approach selects features from each data-view that are best for determining the groups, often leading to improved integrative clustering. To fit our model, we develop a new type of generalized multi-block ADMM algorithm using sub-problem approximations that more efficiently fits our model for big data sets. Through a series of numerical experiments and real data examples on text mining and genomics, we show that iGecco+ achieves superior empirical performance for high-dimensional mixed multi-view data. |
Tasks | Feature Selection |
Published | 2019-12-11 |
URL | https://arxiv.org/abs/1912.05449v1 |
https://arxiv.org/pdf/1912.05449v1.pdf | |
PWC | https://paperswithcode.com/paper/integrative-generalized-convex-clustering |
Repo | |
Framework | |
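iGecco's view-specific losses and adaptive shifted group-lasso penalty are not reproduced here; as a sketch of the convex-clustering backbone it generalises, here is subgradient descent on the single-view Euclidean objective with uniform fusion weights.

```python
import numpy as np

def convex_clustering(X, lam=1.0, steps=500, lr=0.05):
    """Subgradient descent on the convex clustering objective
        0.5 * sum_i ||x_i - u_i||^2 + lam * sum_{i<j} ||u_i - u_j||_2
    with uniform fusion weights. As lam grows, the centroids u_i fuse,
    revealing cluster structure; iGecco generalises the first term to
    view-specific losses and adds a feature-selection penalty."""
    n, d = X.shape
    U = X.copy()
    for _ in range(steps):
        grad = U - X                                  # data-fit gradient
        diff = U[:, None, :] - U[None, :, :]          # u_i - u_j
        norms = np.linalg.norm(diff, axis=2, keepdims=True)
        sub = np.where(norms > 1e-8,
                       diff / np.maximum(norms, 1e-8), 0.0)
        grad += lam * sub.sum(axis=1)                 # fusion subgradient
        U -= lr * grad
    return U

# Toy usage: two blobs; nearly fused centroids indicate the grouping.
rng = np.random.default_rng(9)
X = np.vstack([rng.normal(0, 0.2, (10, 2)), rng.normal(3, 0.2, (10, 2))])
U = convex_clustering(X, lam=0.5)
print(np.round(U[:3], 2), np.round(U[-3:], 2))
```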