Paper Group ANR 969
Investigation of Initialization Strategies for the Multiple Instance Adaptive Cosine Estimator
Title | Investigation of Initialization Strategies for the Multiple Instance Adaptive Cosine Estimator |
Authors | James Bocinsky, Connor McCurley, Daniel Shats, Alina Zare |
Abstract | Sensors which use electromagnetic induction (EMI) to excite a response in conducting bodies have long been investigated for subsurface explosive hazard detection. In particular, EMI sensors have been used to discriminate between different types of objects and to detect objects with low metal content. One successful, previously investigated approach is the Multiple Instance Adaptive Cosine Estimator (MI-ACE). In this paper, a number of new initialization techniques for MI-ACE are proposed and evaluated in terms of performance and speed. The cross-validated learned signatures, as well as learned background statistics, are used with the Adaptive Cosine Estimator (ACE) to generate confidence maps, which are clustered into alarms. Alarms are scored against ground truth, and the initialization approaches are compared. |
Tasks | |
Published | 2019-04-30 |
URL | http://arxiv.org/abs/1904.13197v1 |
http://arxiv.org/pdf/1904.13197v1.pdf | |
PWC | https://paperswithcode.com/paper/investigation-of-initialization-strategies |
Repo | |
Framework | |
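For context, the Adaptive Cosine Estimator that MI-ACE builds on scores a sample by the squared cosine between it and a target signature in background-whitened space. Below is a minimal NumPy sketch of that statistic; the signature `s` and the background statistics here are random placeholders, not the quantities learned by MI-ACE.

```python
import numpy as np

def ace_confidence(X, s, mu_b, cov_b):
    """Adaptive Cosine Estimator: squared cosine similarity between a
    target signature and each sample in background-whitened space.

    X     : (n, d) array of samples, one per row
    s     : (d,) target signature
    mu_b  : (d,) background mean
    cov_b : (d, d) background covariance
    """
    inv_cov = np.linalg.inv(cov_b)
    Xc = X - mu_b                      # centre samples on the background
    num = (Xc @ inv_cov @ s) ** 2      # (s^T C^-1 x)^2 per sample
    den = (s @ inv_cov @ s) * np.einsum('ij,jk,ik->i', Xc, inv_cov, Xc)
    return num / den

# Toy usage with random background statistics (illustrative only).
rng = np.random.default_rng(0)
d = 8
bg = rng.normal(size=(500, d))
mu_b, cov_b = bg.mean(axis=0), np.cov(bg, rowvar=False)
s = rng.normal(size=d)
print(ace_confidence(rng.normal(size=(3, d)), s, mu_b, cov_b))
```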
k-Relevance Vectors for Pattern Classification
Title | k-Relevance Vectors for Pattern Classification |
Authors | Peyman Hosseinzadeh Kassani, Sara Hosseinzadeh Kassani |
Abstract | This study combines two different learning paradigms: the k-nearest neighbor (k-NN) rule, a memory-based learning paradigm, and relevance vector machines (RVM), a statistical learning paradigm. The combination is performed in kernel space and is called k-relevance vectors (k-RV). The purpose is to improve the performance of the k-NN rule. The proposed model significantly prunes irrelevant attributes. We also introduce a new parameter, responsible for early stopping of the RVM iterations, and show that it improves the classification accuracy of k-RV. Extensive experiments are conducted on several classification datasets from the University of California Irvine (UCI) repository and two real datasets from the computer vision domain. The performance of k-RV is highly competitive with several state-of-the-art methods in terms of classification accuracy. |
Tasks | |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08528v1 |
https://arxiv.org/pdf/1909.08528v1.pdf | |
PWC | https://paperswithcode.com/paper/k-relevance-vectors-for-pattern |
Repo | |
Framework | |
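Since the combination is performed in kernel space, the k-NN half of k-RV can be illustrated by computing neighbour distances through a kernel, using d²(x, z) = k(x,x) − 2k(x,z) + k(z,z). The sketch below uses an RBF kernel and only illustrates that idea; the RVM component, attribute pruning, and the early-stopping parameter are not reproduced here.

```python
import numpy as np

def rbf(A, B, gamma=0.5):
    """RBF kernel matrix between rows of A and rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def kernel_knn_predict(X_train, y_train, X_test, k=3, gamma=0.5):
    """k-NN with the kernel-induced distance
    d^2(x, z) = k(x,x) - 2 k(x,z) + k(z,z)."""
    K_xx = np.ones(len(X_test))           # k(x,x) = 1 for the RBF kernel
    K_zz = np.ones(len(X_train))
    K_xz = rbf(X_test, X_train, gamma)
    d2 = K_xx[:, None] - 2 * K_xz + K_zz[None, :]
    nn = np.argsort(d2, axis=1)[:, :k]    # k nearest neighbours per test point
    votes = y_train[nn]
    # Majority vote among the k neighbours.
    return np.array([np.bincount(v).argmax() for v in votes])

# Toy usage on two Gaussian blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(3, 1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
print(kernel_knn_predict(X, y, np.array([[0., 0.], [3., 3.]])))
```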
Cross-lingual Visual Verb Sense Disambiguation
Title | Cross-lingual Visual Verb Sense Disambiguation |
Authors | Spandana Gella, Desmond Elliott, Frank Keller |
Abstract | Recent work has shown that visual context improves cross-lingual sense disambiguation for nouns. We extend this line of work to the more challenging task of cross-lingual verb sense disambiguation, introducing the MultiSense dataset of 9,504 images annotated with English, German, and Spanish verbs. Each image in MultiSense is annotated with an English verb and its translation in German or Spanish. We show that cross-lingual verb sense disambiguation models benefit from visual context, compared to unimodal baselines. We also show that the verb sense predicted by our best disambiguation model can improve the results of a text-only machine translation system when used for a multimodal translation task. |
Tasks | Machine Translation |
Published | 2019-04-10 |
URL | http://arxiv.org/abs/1904.05092v2 |
http://arxiv.org/pdf/1904.05092v2.pdf | |
PWC | https://paperswithcode.com/paper/cross-lingual-visual-verb-sense |
Repo | |
Framework | |
AttoNets: Compact and Efficient Deep Neural Networks for the Edge via Human-Machine Collaborative Design
Title | AttoNets: Compact and Efficient Deep Neural Networks for the Edge via Human-Machine Collaborative Design |
Authors | Alexander Wong, Zhong Qiu Lin, Brendan Chwyl |
Abstract | While deep neural networks have achieved state-of-the-art performance across a large number of complex tasks, it remains a big challenge to deploy such networks for practical, on-device edge scenarios such as mobile devices, consumer devices, drones, and vehicles. In this study, we explore in depth a human-machine collaborative design approach for creating highly efficient deep neural networks, through a synergy between principled network design prototyping and machine-driven design exploration. The efficacy of human-machine collaborative design is demonstrated through the creation of AttoNets, a family of highly efficient deep neural networks for on-device edge deep learning. Each AttoNet possesses a human-specified network-level macro-architecture comprising custom modules with unique machine-designed module-level macro-architecture and micro-architecture designs, all driven by human-specified design requirements. Experimental results for the task of object recognition showed that the AttoNets created via human-machine collaborative design have significantly fewer parameters and lower computational cost than state-of-the-art networks designed for efficiency, while achieving noticeably higher accuracy (the smallest AttoNet achieves ~1.8% higher accuracy while requiring ~10x fewer multiply-add operations and parameters than MobileNet-V1). Furthermore, the efficacy of the AttoNets is demonstrated for instance-level object segmentation and object detection, where an AttoNet-based Mask R-CNN network was constructed with significantly fewer parameters and lower computational cost (~5x fewer multiply-add operations and ~2x fewer parameters) than a ResNet-50 based Mask R-CNN network. |
Tasks | Object Detection, Object Recognition, Semantic Segmentation |
Published | 2019-03-18 |
URL | http://arxiv.org/abs/1903.07209v2 |
http://arxiv.org/pdf/1903.07209v2.pdf | |
PWC | https://paperswithcode.com/paper/attonets-compact-and-efficient-deep-neural |
Repo | |
Framework | |
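AttoNets' modules are machine-designed, so they cannot be reproduced from the abstract; the block below is only a generic stand-in showing the kind of factorised building block efficiency-driven architectures rely on: a depthwise-separable convolution module in PyTorch, which cuts multiply-adds and parameters relative to a dense 3x3 convolution. Layer sizes are arbitrary.

```python
import torch
import torch.nn as nn

class CompactBlock(nn.Module):
    """A depthwise-separable convolution block: a depthwise 3x3 conv
    followed by a pointwise 1x1 conv. This factorisation (as in
    MobileNet) reduces multiply-adds and parameters roughly 8-9x
    versus a dense 3x3 conv; AttoNets' actual machine-designed
    modules differ in their exact structure."""

    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(in_ch)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.act(self.bn1(self.depthwise(x)))
        return self.act(self.bn2(self.pointwise(x)))

# Toy usage: one block on a 32x32 feature map.
block = CompactBlock(16, 32, stride=2)
print(block(torch.randn(1, 16, 32, 32)).shape)  # torch.Size([1, 32, 16, 16])
```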
Image Captioning with Sparse Recurrent Neural Network
Title | Image Captioning with Sparse Recurrent Neural Network |
Authors | Jia Huei Tan, Chee Seng Chan, Joon Huang Chuah |
Abstract | Recurrent Neural Networks (RNNs) have been widely used to tackle a wide variety of language generation problems and are capable of attaining state-of-the-art (SOTA) performance. However, despite these impressive results, the large number of parameters in RNN models makes deployment to mobile and embedded devices infeasible. Driven by this problem, many works have proposed pruning methods to reduce the size of RNN models. In this work, we propose an end-to-end pruning method for image captioning models equipped with visual attention. Our proposed method is able to achieve sparsity levels up to 97.5% without significant performance loss relative to the baseline (~2% loss at 40x compression after fine-tuning). Our method is also simple to use and tune, facilitating faster development times for neural network practitioners. We perform extensive experiments on the popular MS-COCO dataset to empirically validate the efficacy of our proposed method. |
Tasks | Image Captioning, Text Generation |
Published | 2019-08-28 |
URL | https://arxiv.org/abs/1908.10797v2 |
https://arxiv.org/pdf/1908.10797v2.pdf | |
PWC | https://paperswithcode.com/paper/image-captioning-with-sparse-recurrent-neural |
Repo | |
Framework | |
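The paper's end-to-end pruning method is specific to attention-equipped captioning models; as a generic stand-in for the basic mechanism, here is a one-shot magnitude-pruning sketch in NumPy that zeroes the smallest-magnitude weights to reach a target sparsity. The 97.5% target below simply echoes the abstract's headline number.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude entries so that roughly a
    `sparsity` fraction of the weights is zero (ties at the threshold
    may prune slightly more). Returns pruned weights and the binary
    mask (1 = kept), which fine-tuning would then hold fixed."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)           # number of weights to drop
    if k == 0:
        return weights.copy(), np.ones_like(weights)
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = (np.abs(weights) > threshold).astype(weights.dtype)
    return weights * mask, mask

# Toy usage: prune a random "recurrent" weight matrix to 97.5% sparsity.
rng = np.random.default_rng(2)
W = rng.normal(size=(256, 256))
W_pruned, mask = magnitude_prune(W, 0.975)
print(f"sparsity achieved: {1 - mask.mean():.4f}")
```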
Optimizing Controller Placement for Software-Defined Networks
Title | Optimizing Controller Placement for Software-Defined Networks |
Authors | Victoria Huang, Gang Chen, Qiang Fu, Elliott Wen |
Abstract | The controller placement problem (CPP) is a key issue for Software-Defined Networking (SDN) with distributed controller architectures. The problem aims to determine a suitable number of controllers, deployed in important locations, so as to optimize overall network performance. Existing literature on the CPP accounts for communication delay but assumes that the influence of controller workload distribution on network performance is negligible. In this paper, we tackle a CPP that simultaneously considers the communication delay, the control plane utilization, and the controller workload distribution. For this reason, our CPP is intrinsically different from, and clearly more difficult than, previously studied CPPs, which are already NP-hard. To tackle this challenging problem, we develop a new algorithm that seamlessly integrates a genetic algorithm (GA) and the gradient descent (GD) optimization method. In particular, the GA is used to search for suitable CPP solutions, and the quality of each solution is then evaluated through GD. Simulation results on two representative network scenarios (small-scale and large-scale) show that our algorithm can effectively strike a trade-off between control plane utilization and network response time. |
Tasks | |
Published | 2019-02-14 |
URL | http://arxiv.org/abs/1902.09451v1 |
http://arxiv.org/pdf/1902.09451v1.pdf | |
PWC | https://paperswithcode.com/paper/optimizing-controller-placement-for-software |
Repo | |
Framework | |
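As a hedged sketch of the GA half of the proposed GA+GD hybrid, the loop below evolves k controller locations on a random toy topology to minimise average node-to-controller delay. The fitness here is plain distance-based delay; the paper's GD-based evaluation of control plane utilization and workload distribution is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(3)
N, K, POP, GENS = 50, 4, 30, 100                 # nodes, controllers, GA sizes
coords = rng.uniform(0, 100, size=(N, 2))        # node positions (toy topology)
dist = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)

def fitness(placement):
    """Average delay: each node attaches to its nearest controller."""
    return dist[:, placement].min(axis=1).mean()

def mutate(placement):
    """Swap one controller location for a random unused node."""
    p = placement.copy()
    p[rng.integers(K)] = rng.choice(np.setdiff1d(np.arange(N), p))
    return p

def crossover(a, b):
    """Child draws K distinct locations from the union of two parents."""
    return rng.choice(np.union1d(a, b), size=K, replace=False)

population = [rng.choice(N, size=K, replace=False) for _ in range(POP)]
for _ in range(GENS):
    population.sort(key=fitness)
    elite = population[:POP // 2]                # truncation selection
    children = [mutate(crossover(elite[rng.integers(len(elite))],
                                 elite[rng.integers(len(elite))]))
                for _ in range(POP - len(elite))]
    population = elite + children

best = min(population, key=fitness)
print("controllers at nodes", sorted(best.tolist()),
      "avg delay", round(fitness(best), 2))
```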
Joint DNN-Based Multichannel Reduction of Acoustic Echo, Reverberation and Noise
Title | Joint DNN-Based Multichannel Reduction of Acoustic Echo, Reverberation and Noise |
Authors | Guillaume Carbajal, Romain Serizel, Emmanuel Vincent, Eric Humbert |
Abstract | We consider the problem of simultaneous reduction of acoustic echo, reverberation and noise. In real scenarios, these distortion sources may occur simultaneously and reducing them implies combining the corresponding distortion-specific filters. As these filters interact with each other, they must be jointly optimized. We propose to model the target and residual signals after linear echo cancellation and dereverberation using a multichannel Gaussian modeling framework and to jointly represent their spectra by means of a neural network. We develop an iterative block-coordinate ascent algorithm to update all the filters. We evaluate our system on real recordings of acoustic echo, reverberation and noise acquired with a smart speaker in various situations. The proposed approach outperforms in terms of overall distortion a cascade of the individual approaches and a joint reduction approach which does not rely on a spectral model of the target and residual signals. |
Tasks | |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.08934v2 |
https://arxiv.org/pdf/1911.08934v2.pdf | |
PWC | https://paperswithcode.com/paper/joint-dnn-based-multichannel-reduction-of |
Repo | |
Framework | |
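The paper's joint block-coordinate ascent over echo, dereverberation, and noise filters is too involved to reproduce from the abstract; as a sketch of the standard multichannel building block such systems rest on, here is a per-frequency-bin multichannel Wiener filter computed from target and residual spatial covariance matrices, which are assumed known here.

```python
import numpy as np

def mwf(phi_s, phi_n, ref=0):
    """Multichannel Wiener filter for one frequency bin.
    phi_s, phi_n : (M, M) target / residual spatial covariance matrices.
    Returns w (M,) such that the target estimate is w.conj() @ x."""
    phi_x = phi_s + phi_n                 # covariance of the observed mixture
    return np.linalg.solve(phi_x, phi_s[:, ref])

# Toy usage: 4 mics, rank-1 target with a random steering vector.
rng = np.random.default_rng(4)
M = 4
a = rng.normal(size=M) + 1j * rng.normal(size=M)   # steering vector
phi_s = np.outer(a, a.conj())                      # rank-1 target covariance
phi_n = 0.1 * np.eye(M)                            # white residual covariance
w = mwf(phi_s, phi_n)
x = a + 0.1 * (rng.normal(size=M) + 1j * rng.normal(size=M))
print(abs(w.conj() @ x), abs(a[0]))   # estimate vs. true target at ref mic
```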
Subword-Level Language Identification for Intra-Word Code-Switching
Title | Subword-Level Language Identification for Intra-Word Code-Switching |
Authors | Manuel Mager, Özlem Çetinoğlu, Katharina Kann |
Abstract | Language identification for code-switching (CS), the phenomenon of alternating between two or more languages in conversations, has traditionally been approached under the assumption of a single language per token. However, if at least one language is morphologically rich, a large number of words can be composed of morphemes from more than one language (intra-word CS). In this paper, we extend the language identification task to the subword-level, such that it includes splitting mixed words while tagging each part with a language ID. We further propose a model for this task, which is based on a segmental recurrent neural network. In experiments on a new Spanish–Wixarika dataset and on an adapted German–Turkish dataset, our proposed model performs slightly better than or roughly on par with our best baseline, respectively. Considering only mixed words, however, it strongly outperforms all baselines. |
Tasks | Language Identification |
Published | 2019-04-03 |
URL | http://arxiv.org/abs/1904.01989v1 |
http://arxiv.org/pdf/1904.01989v1.pdf | |
PWC | https://paperswithcode.com/paper/subword-level-language-identification-for |
Repo | |
Framework | |
Does Data Augmentation Lead to Positive Margin?
Title | Does Data Augmentation Lead to Positive Margin? |
Authors | Shashank Rajput, Zhili Feng, Zachary Charles, Po-Ling Loh, Dimitris Papailiopoulos |
Abstract | Data augmentation (DA) is commonly used during model training, as it significantly improves test error and model robustness. DA artificially expands the training set by applying random noise, rotations, crops, or even adversarial perturbations to the input data. Although DA is widely used, its capacity to provably improve robustness is not fully understood. In this work, we analyze the robustness that DA begets by quantifying the margin that DA enforces on empirical risk minimizers. We first focus on linear separators, and then a class of nonlinear models whose labeling is constant within small convex hulls of data points. We present lower bounds on the number of augmented data points required for non-zero margin, and show that commonly used DA techniques may only introduce significant margin after adding exponentially many points to the data set. |
Tasks | Data Augmentation |
Published | 2019-05-08 |
URL | https://arxiv.org/abs/1905.03177v1 |
https://arxiv.org/pdf/1905.03177v1.pdf | |
PWC | https://paperswithcode.com/paper/does-data-augmentation-lead-to-positive |
Repo | |
Framework | |
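To make the studied quantity concrete, the toy sketch below trains a bias-free hinge-loss minimiser with and without rotation augmentation and reports the geometric margin min_i y_i(w·x_i)/||w|| on the original points. It only illustrates the concept being analysed; the paper's lower bounds are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(5)

def train_linear(X, y, epochs=200, lr=0.1):
    """Minimise hinge loss with subgradient descent (bias-free)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        margins = y * (X @ w)
        # Subgradient: average of -y_i x_i over margin violators.
        grad = -(X * y[:, None])[margins < 1].sum(axis=0) / len(X)
        w -= lr * grad
    return w

def geometric_margin(X, y, w):
    return (y * (X @ w)).min() / np.linalg.norm(w)

def rotate(X, max_deg=15):
    """Random small rotations about the origin as augmentation."""
    th = np.deg2rad(rng.uniform(-max_deg, max_deg, size=len(X)))
    c, s = np.cos(th), np.sin(th)
    return np.stack([c * X[:, 0] - s * X[:, 1],
                     s * X[:, 0] + c * X[:, 1]], axis=1)

# Two separable blobs.
X = np.vstack([rng.normal([2, 2], 0.3, (30, 2)),
               rng.normal([-2, -2], 0.3, (30, 2))])
y = np.array([1] * 30 + [-1] * 30)
X_aug = np.vstack([X] + [rotate(X) for _ in range(10)])
y_aug = np.tile(y, 11)

for name, (Xt, yt) in [("plain", (X, y)), ("augmented", (X_aug, y_aug))]:
    w = train_linear(Xt, yt)
    print(name, "margin on original data:",
          round(geometric_margin(X, y, w), 3))
```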
Exponential Slowdown for Larger Populations: The $(μ+1)$-EA on Monotone Functions
Title | Exponential Slowdown for Larger Populations: The $(μ+1)$-EA on Monotone Functions |
Authors | Johannes Lengler, Xun Zou |
Abstract | Pseudo-Boolean monotone functions are unimodal functions which are trivial to optimize for some hillclimbers, but are challenging for a surprising number of evolutionary algorithms (EAs). A general trend is that EAs are efficient if parameters like the mutation rate are set conservatively, but may need exponential time otherwise. In particular, it was known that the $(1+1)$-EA and the $(1+\lambda)$-EA can optimize every monotone function in pseudolinear time if the mutation rate is $c/n$ for some $c<1$, but they need exponential time for some monotone functions for $c>2.2$. The second part of the statement was also known for the $(\mu+1)$-EA. In this paper we show that the first statement does not apply to the $(\mu+1)$-EA. More precisely, we prove that for every constant $c>0$ there is a constant integer $\mu_0$ such that the $(\mu+1)$-EA with mutation rate $c/n$ and population size $\mu_0\le\mu\le n$ needs superpolynomial time to optimize some monotone functions. Thus, increasing the population size by just a constant has devastating effects on the performance. This is in stark contrast to many other benchmark functions on which increasing the population size either increases the performance significantly, or affects performance mildly. The reason why larger populations are harmful lies in the fact that larger populations may temporarily decrease selective pressure on parts of the population. This allows unfavorable mutations to accumulate in single individuals and their descendants. If the population moves sufficiently fast through the search space, such unfavorable descendants can become ancestors of future generations, and the bad mutations are preserved. Remarkably, this effect only occurs if the population renews itself sufficiently fast, which can only happen far away from the optimum. This is counter-intuitive since usually optimization gets harder as we approach the optimum. |
Tasks | |
Published | 2019-07-30 |
URL | https://arxiv.org/abs/1907.12821v1 |
https://arxiv.org/pdf/1907.12821v1.pdf | |
PWC | https://paperswithcode.com/paper/exponential-slowdown-for-larger-populations |
Repo | |
Framework | |
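For concreteness, here is a minimal (μ+1)-EA with standard bit mutation at rate c/n, run on OneMax as an easy monotone example. Note that the paper's exponential slowdown concerns specially constructed monotone functions; OneMax is used here only to show the algorithm's mechanics.

```python
import numpy as np

rng = np.random.default_rng(6)

def one_max(x):
    """A simple monotone pseudo-Boolean function."""
    return int(x.sum())

def mu_plus_one_ea(f, n=100, mu=10, c=1.0, max_iters=200_000):
    """(mu+1)-EA: pick a uniform parent, flip each bit independently
    with probability c/n, then delete a worst individual among the
    mu+1 candidates (elitist survival selection)."""
    population = [rng.integers(0, 2, n) for _ in range(mu)]
    for t in range(max_iters):
        parent = population[rng.integers(mu)]
        flips = rng.random(n) < c / n
        child = np.where(flips, 1 - parent, parent)
        population.append(child)
        worst = min(range(len(population)), key=lambda i: f(population[i]))
        population.pop(worst)
        if max(f(x) for x in population) == n:  # OneMax optimum reached
            return t + 1
    return max_iters

print("iterations to reach the optimum:", mu_plus_one_ea(one_max))
```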
Finite-State Extreme Effect Variable
Title | Finite-State Extreme Effect Variable |
Authors | Alexey Drutsa |
Abstract | We generalize to the finite-state case the notion of the extreme effect variable $Y$ that accumulates all the effect of a variant variable $V$ observed in changes of another variable $X$. We conduct theoretical analysis and turn the problem of finding an effect variable into a problem of simultaneous decomposition of a set of distributions. The states of the extreme effect variable, on the one hand, are minimally affected by the variant variable $V$ and, on the other hand, are extremely different with respect to the observable variable $X$. We apply our technique to online evaluation of a web search engine through A/B testing and show its utility. |
Tasks | |
Published | 2019-12-24 |
URL | https://arxiv.org/abs/1912.13377v1 |
https://arxiv.org/pdf/1912.13377v1.pdf | |
PWC | https://paperswithcode.com/paper/finite-state-extreme-effect-variable |
Repo | |
Framework | |
Cross-Modal Subspace Learning with Scheduled Adaptive Margin Constraints
Title | Cross-Modal Subspace Learning with Scheduled Adaptive Margin Constraints |
Authors | David Semedo, João Magalhães |
Abstract | Cross-modal embeddings, between textual and visual modalities, aim to organise multimodal instances by their semantic correlations. State-of-the-art approaches use maximum-margin methods, based on the hinge loss, to enforce a constant margin m that separates projections of multimodal instances from different categories. In this paper, we propose a novel scheduled adaptive maximum-margin (SAM) formulation that infers triplet-specific constraints during training, thereby organising instances by adaptively enforcing inter-category and inter-modality correlations. This is supported by a scheduled adaptive margin function that is smoothly activated, replacing a static margin with an adaptively inferred one that reflects triplet-specific semantic correlations while accounting for the incremental learning behaviour of neural networks, so as to encourage the formation of category clusters. Experiments on widely used datasets show that our model improves upon state-of-the-art approaches, achieving a relative improvement of up to ~12.5% over the second-best method, confirming the effectiveness of our scheduled adaptive margin formulation. |
Tasks | |
Published | 2019-09-30 |
URL | https://arxiv.org/abs/1909.13733v1 |
https://arxiv.org/pdf/1909.13733v1.pdf | |
PWC | https://paperswithcode.com/paper/cross-modal-subspace-learning-with-scheduled |
Repo | |
Framework | |
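The adaptively inferred, triplet-specific margins are the paper's contribution and are not reproduced here; the NumPy sketch below only shows the general shape of the idea: a triplet hinge loss whose margin is a base value scaled by a smooth schedule over training and a per-triplet semantic dissimilarity. Both the logistic schedule and the dissimilarity factor are illustrative stand-ins.

```python
import numpy as np

def scheduled_adaptive_margin(base_m, epoch, total_epochs, dissim):
    """Margin = base * smooth schedule * triplet-specific factor.
    `dissim` in [0, 1]: semantically distant triplets get a larger
    margin. The actual SAM function is inferred from data; this
    logistic ramp is only an illustrative stand-in."""
    schedule = 1.0 / (1.0 + np.exp(-10 * (epoch / total_epochs - 0.5)))
    return base_m * (0.5 + dissim) * schedule

def triplet_loss(anchor, pos, neg, margin):
    """Standard hinge-based triplet loss with per-triplet margins."""
    d_pos = np.linalg.norm(anchor - pos, axis=1)
    d_neg = np.linalg.norm(anchor - neg, axis=1)
    return np.maximum(0.0, d_pos - d_neg + margin).mean()

# Toy usage: random embeddings; margins grow as training progresses.
rng = np.random.default_rng(7)
a, p, n = (rng.normal(size=(32, 64)) for _ in range(3))
dissim = rng.uniform(size=32)            # stand-in for category distance
for epoch in (1, 25, 50):
    m = scheduled_adaptive_margin(0.2, epoch, 50, dissim)
    print(f"epoch {epoch:2d} loss {triplet_loss(a, p, n, m):.3f}")
```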
A Cross-Modal Image Fusion Theory Guided by Human Visual Characteristics
Title | A Cross-Modal Image Fusion Theory Guided by Human Visual Characteristics |
Authors | Aiqing Fang, Xinbo Zhao, Yanning Zhang |
Abstract | Feature selection, nonlinear combination, and the multi-task auxiliary learning mechanism of the human visual perception system play an important role in real-world scenarios, yet image fusion theory grounded in these characteristics of human visual perception has received little attention. Inspired by them, we propose a robust multi-task auxiliary learning optimization theory for image fusion. First, we combine a channel attention model with a nonlinear convolutional neural network to select features and fuse nonlinear features. We then analyze the impact of existing image fusion losses on fusion quality and establish a multi-loss function model for an unsupervised learning network. Second, targeting the multi-task auxiliary learning mechanism of the human visual perception system, we study the influence of this mechanism on the image fusion task, building on the single-task multi-loss network model. By simulating these three characteristics of the human visual perception system, the fused image becomes more consistent with the way the human brain fuses images. Finally, to verify the superiority of our algorithm, we carry out experiments on a combined vision system image dataset and extend the algorithm to public infrared-visible and multi-focus image datasets. The experimental results demonstrate the superiority of our fusion theory over the state of the art in generality and robustness. |
Tasks | Auxiliary Learning, Feature Selection |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.08577v2 |
https://arxiv.org/pdf/1912.08577v2.pdf | |
PWC | https://paperswithcode.com/paper/a-cross-modal-image-fusion-theory-guided-by |
Repo | |
Framework | |
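As a hedged illustration of the channel-attention component the abstract mentions, here is a squeeze-and-excitation-style block in PyTorch; the paper's exact attention model, nonlinear fusion network, and multi-loss setup are not reproduced.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation-style channel attention: global average
    pooling -> bottleneck MLP -> sigmoid gate that reweights channels.
    Used here as a generic stand-in for the paper's attention model."""

    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))          # squeeze: (b, c)
        return x * w.view(b, c, 1, 1)            # excite: reweight channels

# Toy usage: attend over the channel concatenation of two modalities.
att = ChannelAttention(32)
ir, vis = torch.randn(1, 16, 64, 64), torch.randn(1, 16, 64, 64)
fused = att(torch.cat([ir, vis], dim=1))
print(fused.shape)  # torch.Size([1, 32, 64, 64])
```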
Robust Group Synchronization via Cycle-Edge Message Passing
Title | Robust Group Synchronization via Cycle-Edge Message Passing |
Authors | Gilad Lerman, Yunpeng Shi |
Abstract | We propose a general framework for group synchronization with adversarial corruption and sufficiently small noise. Specifically, we apply a novel message passing procedure that uses cycle consistency information in order to estimate the corruption levels of group ratios and consequently infer the corrupted group ratios and solve the synchronization problem. We first explain why the group cycle consistency information is essential for effectively solving group synchronization problems. We then establish exact recovery and linear convergence guarantees for the proposed message passing procedure under a deterministic setting with adversarial corruption. These guarantees hold as long as the ratio of corrupted cycles per edge is bounded by a reasonable constant. We also establish the stability of the proposed procedure to sub-Gaussian noise. We further show that under a uniform corruption model, the recovery results are sharp in terms of an information-theoretic bound. |
Tasks | |
Published | 2019-12-24 |
URL | https://arxiv.org/abs/1912.11347v1 |
https://arxiv.org/pdf/1912.11347v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-group-synchronization-via-cycle-edge |
Repo | |
Framework | |
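A simplified sketch of the cycle-edge message passing idea, specialised to SO(2) (angular) synchronization with 3-cycles: each edge's corruption level is estimated from the inconsistencies of the triangles through it, with triangles reweighted by the current corruption estimates of their other two edges. The complete graph, the exponential reweighting, and the beta schedule below are illustrative choices, not the paper's exact procedure.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(8)

# Ground-truth angles on n nodes; ratios theta_ij = theta_i - theta_j,
# with a fraction of edges adversarially corrupted.
n, corrupt_frac = 12, 0.2
theta = rng.uniform(0, 2 * np.pi, n)
edges = list(combinations(range(n), 2))
ratio = {}
for (i, j) in edges:
    ratio[(i, j)] = (theta[i] - theta[j]) % (2 * np.pi)
    if rng.random() < corrupt_frac:
        ratio[(i, j)] = rng.uniform(0, 2 * np.pi)   # corrupted measurement

def wrap(a):
    """Distance of an angle from 0 on the circle, normalised to [0, 1]."""
    return abs((a + np.pi) % (2 * np.pi) - np.pi) / np.pi

def cycle_inconsistency(i, j, k):
    """Deviation of ratio_ij + ratio_jk + ratio_ki from the identity."""
    r = lambda a, b: ratio[(a, b)] if (a, b) in ratio else -ratio[(b, a)]
    return wrap(r(i, j) + r(j, k) + r(k, i))

# Initialise corruption estimates with unweighted cycle averages, then
# reweight cycles by the other two edges' estimates as beta grows.
s = {e: np.mean([cycle_inconsistency(e[0], e[1], k)
                 for k in range(n) if k not in e]) for e in edges}
for beta in (1, 4, 16, 64):
    s_new = {}
    for (i, j) in edges:
        ks = [k for k in range(n) if k != i and k != j]
        w = np.array([np.exp(-beta * (s[tuple(sorted((j, k)))]
                                      + s[tuple(sorted((i, k)))]))
                      for k in ks])
        d = np.array([cycle_inconsistency(i, j, k) for k in ks])
        s_new[(i, j)] = (w * d).sum() / w.sum()
    s = s_new

flagged = sorted(edges, key=lambda e: -s[e])[:int(corrupt_frac * len(edges))]
print("edges flagged as most corrupted:", flagged[:5])
```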
Integrative Generalized Convex Clustering Optimization and Feature Selection for Mixed Multi-View Data
Title | Integrative Generalized Convex Clustering Optimization and Feature Selection for Mixed Multi-View Data |
Authors | Minjie Wang, Genevera I. Allen |
Abstract | In mixed multi-view data, multiple sets of diverse features are measured on the same set of samples. By integrating all available data sources, we seek to discover common group structure among the samples that may be hidden in individualistic cluster analyses of a single data-view. While several techniques for such integrative clustering have been explored, we propose and develop a convex formalization that will inherit the strong statistical, mathematical and empirical properties of increasingly popular convex clustering methods. Specifically, our Integrative Generalized Convex Clustering Optimization (iGecco) method employs different convex distances, losses, or divergences for each of the different data views with a joint convex fusion penalty that leads to common groups. Additionally, integrating mixed multi-view data is often challenging when each data source is high-dimensional. To perform feature selection in such scenarios, we develop an adaptive shifted group-lasso penalty that selects features by shrinking them towards their loss-specific centers. Our so-called iGecco+ approach selects features from each data-view that are best for determining the groups, often leading to improved integrative clustering. To fit our model, we develop a new type of generalized multi-block ADMM algorithm using sub-problem approximations that more efficiently fits our model for big data sets. Through a series of numerical experiments and real data examples on text mining and genomics, we show that iGecco+ achieves superior empirical performance for high-dimensional mixed multi-view data. |
Tasks | Feature Selection |
Published | 2019-12-11 |
URL | https://arxiv.org/abs/1912.05449v1 |
https://arxiv.org/pdf/1912.05449v1.pdf | |
PWC | https://paperswithcode.com/paper/integrative-generalized-convex-clustering |
Repo | |
Framework | |
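iGecco's view-specific losses and adaptive shifted group-lasso penalty are not reproduced here; as a sketch of the convex-clustering backbone it generalises, here is subgradient descent on the single-view Euclidean objective with uniform fusion weights.

```python
import numpy as np

def convex_clustering(X, lam=1.0, steps=500, lr=0.05):
    """Subgradient descent on the convex clustering objective
        0.5 * sum_i ||x_i - u_i||^2 + lam * sum_{i<j} ||u_i - u_j||_2
    with uniform fusion weights. As lam grows, the centroids u_i fuse,
    revealing cluster structure; iGecco generalises the first term to
    view-specific losses and adds a feature-selection penalty."""
    n, d = X.shape
    U = X.copy()
    for _ in range(steps):
        grad = U - X                                  # data-fit gradient
        diff = U[:, None, :] - U[None, :, :]          # u_i - u_j
        norms = np.linalg.norm(diff, axis=2, keepdims=True)
        sub = np.where(norms > 1e-8,
                       diff / np.maximum(norms, 1e-8), 0.0)
        grad += lam * sub.sum(axis=1)                 # fusion subgradient
        U -= lr * grad
    return U

# Toy usage: two blobs; nearly fused centroids indicate the grouping.
rng = np.random.default_rng(9)
X = np.vstack([rng.normal(0, 0.2, (10, 2)), rng.normal(3, 0.2, (10, 2))])
U = convex_clustering(X, lam=0.5)
print(np.round(U[:3], 2), np.round(U[-3:], 2))
```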