Paper Group ANR 437
Characteristic Regularisation for Super-Resolving Face Images. Wasserstein Distance Based Domain Adaptation for Object Detection. Activation Adaptation in Neural Networks. Optimal initialization of K-means using Particle Swarm Optimization. Low-Rank Approximation from Communication Complexity. Memory-Augmented Recurrent Neural Networks Can Learn Ge …
Characteristic Regularisation for Super-Resolving Face Images
Title | Characteristic Regularisation for Super-Resolving Face Images |
Authors | Zhiyi Cheng, Xiatian Zhu, Shaogang Gong |
Abstract | Existing facial image super-resolution (SR) methods focus mostly on improving artificially down-sampled low-resolution (LR) imagery. Such SR models, although strong at handling artificial LR images, often suffer from significant performance drop on genuine LR test data. Previous unsupervised domain adaptation (UDA) methods address this issue by training a model using unpaired genuine LR and HR data as well as cycle consistency loss formulation. However, this renders the model overstretched with two tasks: consistifying the visual characteristics and enhancing the image resolution. Importantly, this makes the end-to-end model training ineffective due to the difficulty of back-propagating gradients through two concatenated CNNs. To solve this problem, we formulate a method that joins the advantages of conventional SR and UDA models. Specifically, we separate and control the optimisations for characteristics consistifying and image super-resolving by introducing Characteristic Regularisation (CR) between them. This task split makes the model training more effective and computationally tractable. Extensive evaluations demonstrate the performance superiority of our method over state-of-the-art SR and UDA models on both genuine and artificial LR facial imagery data. |
Tasks | Domain Adaptation, Image Super-Resolution, Super-Resolution, Unsupervised Domain Adaptation |
Published | 2019-12-30 |
URL | https://arxiv.org/abs/1912.12987v1 |
https://arxiv.org/pdf/1912.12987v1.pdf | |
PWC | https://paperswithcode.com/paper/characteristic-regularisation-for-super |
Repo | |
Framework | |
Wasserstein Distance Based Domain Adaptation for Object Detection
Title | Wasserstein Distance Based Domain Adaptation for Object Detection |
Authors | Pengcheng Xu, Prudhvi Gurram, Gene Whipps, Rama Chellappa |
Abstract | In this paper, we present an adversarial unsupervised domain adaptation framework for object detection. Prior approaches utilize adversarial training based on cross entropy between the source and target domain distributions to learn a shared feature mapping that minimizes the domain gap. Here, we minimize the Wasserstein distance between the two distributions instead of cross entropy or Jensen-Shannon divergence to improve the stability of domain adaptation in high-dimensional feature spaces that are inherent to object detection task. Additionally, we remove the exact consistency constraint of the shared feature mapping between the source and target domains, so that the target feature mapping can be optimized independently, which is necessary in the case of significant domain gap. We empirically show that the proposed framework can mitigate domain shift in different scenarios, and provide improved target domain object detection performance. |
Tasks | Domain Adaptation, Object Detection, Unsupervised Domain Adaptation |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08675v1 |
https://arxiv.org/pdf/1909.08675v1.pdf | |
PWC | https://paperswithcode.com/paper/wasserstein-distance-based-domain-adaptation |
Repo | |
Framework | |
Activation Adaptation in Neural Networks
Title | Activation Adaptation in Neural Networks |
Authors | Farnoush Farhadi, Vahid Partovi Nia, Andrea Lodi |
Abstract | Many neural network architectures rely on the choice of the activation function for each hidden layer. Given the activation function, the neural network is trained over the bias and the weight parameters. The bias catches the center of the activation, and the weights capture the scale. Here we propose to train the network over a shape parameter as well. This view allows each neuron to tune its own activation function and adapt the neuron curvature towards a better prediction. This modification only adds one further equation to the back-propagation for each neuron. Re-formalizing activation functions as CDF generalizes the class of activation function extensively. We aimed at generalizing an extensive class of activation functions to study: i) skewness and ii) smoothness of activation functions. Here we introduce adaptive Gumbel activation function as a bridge between Gumbel and sigmoid. A similar approach is used to invent a smooth version of ReLU. Our comparison with common activation functions suggests different data representation especially in early neural network layers. This adaptation also provides prediction improvement. |
Tasks | |
Published | 2019-01-28 |
URL | https://arxiv.org/abs/1901.09849v2 |
https://arxiv.org/pdf/1901.09849v2.pdf | |
PWC | https://paperswithcode.com/paper/activation-adaptation-in-neural-networks |
Repo | |
Framework | |
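The abstract above describes an adaptive Gumbel activation that bridges Gumbel and sigmoid through a trainable shape parameter. A minimal sketch of one parameterisation with that bridging property follows. The formula 1 - (1 + alpha * exp(x)) ** (-1 / alpha) and the names `adaptive_gumbel`, `sigmoid` and `gumbel_limit` are assumptions for illustration and may differ from the authors' exact definition.

```python
import math


def adaptive_gumbel(x: float, alpha: float) -> float:
    """Assumed form: 1 - (1 + alpha * exp(x)) ** (-1 / alpha), for alpha > 0.

    alpha is the per-neuron shape parameter; intended for moderate x,
    since math.exp overflows for very large inputs.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return 1.0 - (1.0 + alpha * math.exp(x)) ** (-1.0 / alpha)


def sigmoid(x: float) -> float:
    """Logistic sigmoid, recovered by the assumed form at alpha = 1."""
    return 1.0 / (1.0 + math.exp(-x))


def gumbel_limit(x: float) -> float:
    """Gumbel-type (complementary log-log) CDF, the alpha -> 0 limit."""
    return 1.0 - math.exp(-math.exp(x))


if __name__ == "__main__":
    for a in (1.0, 0.5, 1e-6):
        print(a, adaptive_gumbel(0.3, a))
```

Under this assumed form, alpha = 1 gives the sigmoid exactly, while small alpha approaches the asymmetric Gumbel-type curve, so training alpha lets each neuron adjust the skewness of its activation.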
Optimal initialization of K-means using Particle Swarm Optimization
Title | Optimal initialization of K-means using Particle Swarm Optimization |
Authors | Ashutosh Mahesh Pednekar |
Abstract | This paper proposes the use of an optimization algorithm, namely PSO, to decide the initial centroids in K-means and thereby obtain better accuracy. The vectorized notation of the optimal centroids can be thought of as entities in an optimization space, where the accuracy of K-means over a random subset of the data could act as a fitness measure. The resultant optimal vector can be used as the initial centroids for K-means. |
Tasks | |
Published | 2019-04-19 |
URL | http://arxiv.org/abs/1904.09098v1 |
http://arxiv.org/pdf/1904.09098v1.pdf | |
PWC | https://paperswithcode.com/paper/optimal-initialization-of-k-means-using |
Repo | |
Framework | |
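The idea in the abstract above can be sketched as follows: each particle is a candidate set of k centroids, its fitness is evaluated on a random subset of the data, and the best particle seeds ordinary K-means. The abstract does not define the accuracy measure, so this sketch assumes within-cluster sum of squares (lower is better) as the fitness. The function names and the PSO coefficients `w`, `c1`, `c2` are likewise illustrative assumptions, not the author's settings.

```python
import numpy as np


def inertia(X, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()


def pso_init(X, k, n_particles=20, n_iters=50, subset=200, seed=0,
             w=0.7, c1=1.5, c2=1.5):
    """Search for good initial centroids with a basic global-best PSO."""
    rng = np.random.default_rng(seed)
    # Fitness is evaluated on a random subset of the data only.
    sub = X[rng.choice(len(X), size=min(subset, len(X)), replace=False)]
    # Each particle holds k centroids, seeded from randomly chosen points.
    pos = sub[rng.integers(0, len(sub), size=(n_particles, k))].astype(float)
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_cost = np.array([inertia(sub, p) for p in pos])
    g = pbest_cost.argmin()
    gbest, gbest_cost = pbest[g].copy(), pbest_cost[g]
    for _ in range(n_iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        cost = np.array([inertia(sub, p) for p in pos])
        improved = cost < pbest_cost
        pbest[improved] = pos[improved]
        pbest_cost[improved] = cost[improved]
        g = pbest_cost.argmin()
        if pbest_cost[g] < gbest_cost:
            gbest, gbest_cost = pbest[g].copy(), pbest_cost[g]
    return gbest


def kmeans(X, centroids, n_iters=100):
    """Plain Lloyd iterations starting from the supplied centroids."""
    centroids = centroids.copy()
    for _ in range(n_iters):
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # An empty cluster keeps its previous centroid.
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(len(centroids))])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, labels
```

A typical call would be `centroids, labels = kmeans(X, pso_init(X, k=3))`, where `X` is an array of shape (n_samples, n_features).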
Low-Rank Approximation from Communication Complexity
Title | Low-Rank Approximation from Communication Complexity |
Authors | Cameron Musco, Christopher Musco, David P. Woodruff |
Abstract | In masked low-rank approximation, given $A \in \mathbb{R}^{n \times n}$ and binary $W \in \{0,1\}^{n \times n}$, the goal is to find a rank-$k$ matrix $L$ for which: $$cost(L)=\sum_{i=1}^{n} \sum_{j=1}^{n}W_{i,j}\cdot (A_{i,j} - L_{i,j})^2\le OPT+\epsilon \|A\|_F^2,$$ where $OPT=\min_{\text{rank-}k\ \hat{L}}cost(\hat L)$. This problem is a special case of weighted low-rank approximation and captures low-rank plus diagonal decomposition, robust PCA, matrix completion, low-rank recovery from monotone missing data, and many other problems. Many of these problems are NP-hard, and while some algorithms with provable guarantees are known, they either 1) run in time $n^{\Omega(k^2/\epsilon)}$, or 2) make strong assumptions, e.g., that $A$ is incoherent or that $W$ is random. We consider $bicriteria\ algorithms$, which output $L$ with rank $k' > k$. We prove that a common heuristic, which simply sets $A$ to $0$ where $W$ is $0$, and then computes a standard low-rank approximation, achieves the above approximation bound with rank $k'$ depending on the $communication\ complexity$ of $W$. Namely, interpreting $W$ as the communication matrix of a Boolean function $f(x,y)$ with $x,y\in \{0,1\}^{\log n}$, it suffices to set $k'=O(k\cdot 2^{R^{1-sided}_{\epsilon}(f)})$, where $R^{1-sided}_{\epsilon}(f)$ is the randomized communication complexity of $f$ with $1$-sided error probability $\epsilon$. For many problems, this yields bicriteria algorithms with $k'=k\cdot poly((\log n)/\epsilon)$. Further, we show that different models of communication yield algorithms for natural variants of the problem. E.g., multi-player communication complexity connects to tensor decomposition and non-deterministic communication complexity to Boolean low-rank factorization. Finally, we conjecture a tight relationship between masked low-rank approximation and communication complexity and give some evidence in its direction. |
Tasks | Matrix Completion |
Published | 2019-04-22 |
URL | https://arxiv.org/abs/1904.09841v2 |
https://arxiv.org/pdf/1904.09841v2.pdf | |
PWC | https://paperswithcode.com/paper/low-rank-approximation-from-communication |
Repo | |
Framework | |
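The heuristic analysed in the abstract above (set $A$ to $0$ where $W$ is $0$, then compute a standard low-rank approximation) can be sketched in a few lines. The function names are illustrative, and a truncated SVD is assumed as the "standard low-rank approximation". Choosing `k_prime` according to the paper's communication-complexity bound is left to the caller.

```python
import numpy as np


def masked_cost(A, W, L):
    """cost(L) = sum_ij W_ij * (A_ij - L_ij)^2 for a binary mask W."""
    return float((W * (A - L) ** 2).sum())


def zero_fill_low_rank(A, W, k_prime):
    """Zero out A where W is 0, then return the best rank-k' approximation
    of the result in Frobenius norm (truncated SVD)."""
    A_masked = A * W
    U, s, Vt = np.linalg.svd(A_masked, full_matrices=False)
    return (U[:, :k_prime] * s[:k_prime]) @ Vt[:k_prime]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.normal(size=(8, 8))
    W = (rng.random((8, 8)) < 0.7).astype(float)
    for k_prime in (1, 2, 4, 8):
        L = zero_fill_low_rank(A, W, k_prime)
        print(k_prime, masked_cost(A, W, L))
```

At full rank the output equals the zero-filled matrix, so the masked cost vanishes. The paper's contribution is bounding how large $k'$ must be for the cost to come within $\epsilon \|A\|_F^2$ of $OPT$.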
Memory-Augmented Recurrent Neural Networks Can Learn Generalized Dyck Languages
Title | Memory-Augmented Recurrent Neural Networks Can Learn Generalized Dyck Languages |
Authors | Mirac Suzgun, Sebastian Gehrmann, Yonatan Belinkov, Stuart M. Shieber |
Abstract | We introduce three memory-augmented Recurrent Neural Networks (MARNNs) and explore their capabilities on a series of simple language modeling tasks whose solutions require stack-based mechanisms. We provide the first demonstration of neural networks recognizing the generalized Dyck languages, which express the core of what it means to be a language with hierarchical structure. Our memory-augmented architectures are easy to train in an end-to-end fashion and can learn the Dyck languages over as many as six parenthesis-pairs, in addition to two deterministic palindrome languages and the string-reversal transduction task, by emulating pushdown automata. Our experiments highlight the increased modeling capacity of memory-augmented models over simple RNNs, while inflecting our understanding of the limitations of these models. |
Tasks | Language Modelling |
Published | 2019-11-08 |
URL | https://arxiv.org/abs/1911.03329v1 |
https://arxiv.org/pdf/1911.03329v1.pdf | |
PWC | https://paperswithcode.com/paper/memory-augmented-recurrent-neural-networks |
Repo | |
Framework | |
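For reference, the stack-based mechanism that the abstract above says these tasks require can be written as a small symbolic recogniser for a generalized Dyck language. This is the pushdown-automaton behaviour the networks are reported to emulate, not the neural model itself. The function name and the `pairs` argument are illustrative.

```python
def is_dyck(s: str, pairs: dict) -> bool:
    """Return True if s is a well-nested string over the given bracket pairs.

    pairs maps each opening symbol to its closing symbol, e.g. {"(": ")"}.
    """
    closers = {close: open_ for open_, close in pairs.items()}
    stack = []
    for ch in s:
        if ch in pairs:
            # Opening symbol: push it and wait for its partner.
            stack.append(ch)
        elif ch in closers:
            # Closing symbol: it must match the most recent unclosed opener.
            if not stack or stack.pop() != closers[ch]:
                return False
        else:
            # Symbol outside the alphabet.
            return False
    # Accept only if every opener was closed.
    return not stack


if __name__ == "__main__":
    three_pairs = {"(": ")", "[": "]", "{": "}"}
    for text in ("([]{})", "([)]", "((("):
        print(text, is_dyck(text, three_pairs))
```

Passing a dictionary with six entries gives a recogniser for the six-pair setting mentioned in the abstract.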
Evaluating the Search Phase of Neural Architecture Search
Title | Evaluating the Search Phase of Neural Architecture Search |
Authors | Kaicheng Yu, Christian Sciuto, Martin Jaggi, Claudiu Musat, Mathieu Salzmann |
Abstract | Neural Architecture Search (NAS) aims to facilitate the design of deep networks for new tasks. Existing techniques rely on two stages: searching over the architecture space and validating the best architecture. NAS algorithms are currently compared solely based on their results on the downstream task. While intuitive, this fails to explicitly evaluate the effectiveness of their search strategies. In this paper, we propose to evaluate the NAS search phase. To this end, we compare the quality of the solutions obtained by NAS search policies with that of random architecture selection. We find that: (i) On average, the state-of-the-art NAS algorithms perform similarly to the random policy; (ii) the widely-used weight sharing strategy degrades the ranking of the NAS candidates to the point of not reflecting their true performance, thus reducing the effectiveness of the search process. We believe that our evaluation framework will be key to designing NAS strategies that consistently discover architectures superior to random ones. |
Tasks | Neural Architecture Search |
Published | 2019-02-21 |
URL | https://arxiv.org/abs/1902.08142v3 |
https://arxiv.org/pdf/1902.08142v3.pdf | |
PWC | https://paperswithcode.com/paper/evaluating-the-search-phase-of-neural |
Repo | |
Framework | |
ColorNet: Investigating the importance of color spaces for image classification
Title | ColorNet: Investigating the importance of color spaces for image classification |
Authors | Shreyank N Gowda, Chun Yuan |
Abstract | Image classification is a fundamental application in computer vision. Recently, deeper networks and highly connected networks have shown state of the art performance for image classification tasks. Most datasets these days consist of a finite number of color images. These color images are taken as input in the form of RGB images and classification is done without modifying them. We explore the importance of color spaces and show that color spaces (essentially transformations of original RGB images) can significantly affect classification accuracy. Further, we show that certain classes of images are better represented in particular color spaces and for a dataset with a highly varying number of classes such as CIFAR and Imagenet, using a model that considers multiple color spaces within the same model gives excellent levels of accuracy. Also, we show that such a model, where the input is preprocessed into multiple color spaces simultaneously, needs far fewer parameters to obtain high accuracy for classification. For example, our model with 1.75M parameters significantly outperforms DenseNet 100-12 that has 12M parameters and gives results comparable to Densenet-BC-190-40 that has 25.6M parameters for classification of four competitive image classification datasets namely: CIFAR-10, CIFAR-100, SVHN and Imagenet. Our model essentially takes an RGB image as input, simultaneously converts the image into 7 different color spaces and uses these as inputs to individual densenets. We use small and wide densenets to reduce computation overhead and number of hyperparameters required. We obtain significant improvement on current state of the art results on these datasets as well. |
Tasks | Image Classification |
Published | 2019-02-01 |
URL | http://arxiv.org/abs/1902.00267v1 |
http://arxiv.org/pdf/1902.00267v1.pdf | |
PWC | https://paperswithcode.com/paper/colornet-investigating-the-importance-of |
Repo | |
Framework | |
Action recognition with spatial-temporal discriminative filter banks
Title | Action recognition with spatial-temporal discriminative filter banks |
Authors | Brais Martinez, Davide Modolo, Yuanjun Xiong, Joseph Tighe |
Abstract | Action recognition has seen a dramatic performance improvement in the last few years. Most of the current state-of-the-art literature either aims to improve performance through changes to the backbone CNN network or explores different trade-offs between computational efficiency and performance, again by altering the backbone network. However, almost all of these works maintain the same last layers of the network, which simply consist of a global average pooling followed by a fully connected layer. In this work we focus on how to improve the representation capacity of the network, but rather than altering the backbone, we focus on improving the last layers of the network, where changes have low impact in terms of computational cost. In particular, we show that current architectures have poor sensitivity to finer details and we exploit recent advances in the fine-grained recognition literature to improve our model in this aspect. With the proposed approach, we obtain state-of-the-art performance on Kinetics-400 and Something-Something-V1, the two major large-scale action recognition benchmarks. |
Tasks | Action Classification, Action Recognition In Videos |
Published | 2019-08-20 |
URL | https://arxiv.org/abs/1908.07625v1 |
https://arxiv.org/pdf/1908.07625v1.pdf | |
PWC | https://paperswithcode.com/paper/190807625 |
Repo | |
Framework | |
advPattern: Physical-World Attacks on Deep Person Re-Identification via Adversarially Transformable Patterns
Title | advPattern: Physical-World Attacks on Deep Person Re-Identification via Adversarially Transformable Patterns |
Authors | Zhibo Wang, Siyan Zheng, Mengkai Song, Qian Wang, Alireza Rahimpour, Hairong Qi |
Abstract | Person re-identification (re-ID) is the task of matching person images across camera views, which plays an important role in surveillance and security applications. Inspired by the great progress of deep learning, deep re-ID models have become popular and achieved state-of-the-art performance. However, recent works found that deep neural networks (DNNs) are vulnerable to adversarial examples, posing potential threats to DNN-based applications. This phenomenon raises a serious question about whether deep re-ID based systems are vulnerable to adversarial attacks. In this paper, we make the first attempt to implement robust physical-world attacks against deep re-ID. We propose a novel attack algorithm, called advPattern, for generating adversarial patterns on clothes, which learns the variations of image pairs across cameras to pull closer the image features from the same camera, while pushing features from different cameras farther apart. By wearing our crafted “invisible cloak”, an adversary can evade person search, or impersonate a target person to fool deep re-ID models in the physical world. We evaluate the effectiveness of our transformable patterns on adversaries’ clothes with Market1501 and our established PRCS dataset. The experimental results show that the rank-1 accuracy of re-ID models for matching the adversary decreases from 87.9% to 27.1% under Evading Attack. Furthermore, the adversary can impersonate a target person with 47.1% rank-1 accuracy and 67.9% mAP under Impersonation Attack. The results demonstrate that deep re-ID systems are vulnerable to our physical attacks. |
Tasks | Person Re-Identification, Person Search |
Published | 2019-08-25 |
URL | https://arxiv.org/abs/1908.09327v3 |
https://arxiv.org/pdf/1908.09327v3.pdf | |
PWC | https://paperswithcode.com/paper/advpattern-physical-world-attacks-on-deep |
Repo | |
Framework | |
Making targeted black-box evasion attacks effective and efficient
Title | Making targeted black-box evasion attacks effective and efficient |
Authors | Mika Juuti, Buse Gul Atli, N. Asokan |
Abstract | We investigate how an adversary can optimally use its query budget for targeted evasion attacks against deep neural networks in a black-box setting. We formalize the problem setting and systematically evaluate what benefits the adversary can gain by using substitute models. We show that there is an exploration-exploitation tradeoff in that query efficiency comes at the cost of effectiveness. We present two new attack strategies for using substitute models and show that they are as effective as previous query-only techniques but require significantly fewer queries, by up to three orders of magnitude. We also show that an agile adversary capable of switching through different attack techniques can achieve pareto-optimal efficiency. We demonstrate our attack against Google Cloud Vision, showing that black-box attacks against real-world prediction APIs are significantly easier than previously thought (requiring approximately 500 queries instead of approximately 20,000 as in previous works). |
Tasks | |
Published | 2019-06-08 |
URL | https://arxiv.org/abs/1906.03397v1 |
https://arxiv.org/pdf/1906.03397v1.pdf | |
PWC | https://paperswithcode.com/paper/making-targeted-black-box-evasion-attacks |
Repo | |
Framework | |
Learning High-order Structural and Attribute information by Knowledge Graph Attention Networks for Enhancing Knowledge Graph Embedding
Title | Learning High-order Structural and Attribute information by Knowledge Graph Attention Networks for Enhancing Knowledge Graph Embedding |
Authors | Wenqiang Liu, Hongyun Cai, Xu Cheng, Sifa Xie, Yipeng Yu, Hanyu Zhang |
Abstract | The goal of knowledge graph representation learning is to encode both entities and relations into a low-dimensional embedding space. Many recent works have demonstrated the benefits of knowledge graph embedding on knowledge graph completion tasks, such as relation extraction. However, we observe that: 1) existing methods take only direct relations between entities into consideration and fail to express high-order structural relationships between entities; 2) these methods leverage only the relation triples of KGs while ignoring a large number of attribute triples that encode rich semantic information. To overcome these limitations, this paper proposes a novel knowledge graph embedding method, named KANE, which is inspired by recent developments in graph convolutional networks (GCN). KANE can capture both high-order structural and attribute information of KGs in an efficient, explicit and unified manner under the graph convolutional networks framework. Empirical results on three datasets show that KANE significantly outperforms seven state-of-the-art methods. Further analysis verifies the efficiency of our method and the benefits brought by the attention mechanism. |
Tasks | Graph Embedding, Knowledge Graph Completion, Knowledge Graph Embedding, Relation Extraction, Representation Learning |
Published | 2019-10-09 |
URL | https://arxiv.org/abs/1910.03891v2 |
https://arxiv.org/pdf/1910.03891v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-high-order-structural-and-attribute |
Repo | |
Framework | |
Incremental Principal Component Analysis Exact implementation and continuity corrections
Title | Incremental Principal Component Analysis Exact implementation and continuity corrections |
Authors | Vittorio Lippi, Giacomo Ceccarelli |
Abstract | This paper describes some applications of an incremental implementation of the principal component analysis (PCA). The algorithm updates the transformation coefficients matrix on-line for each new sample, without the need to keep all the samples in memory. The algorithm is formally equivalent to the usual batch version, in the sense that given a sample set the transformation coefficients at the end of the process are the same. The implications of applying the PCA in real time are discussed with the help of data analysis examples. In particular we focus on the problem of the continuity of the PCs during an on-line analysis. |
Tasks | |
Published | 2019-01-23 |
URL | https://arxiv.org/abs/1901.07922v2 |
https://arxiv.org/pdf/1901.07922v2.pdf | |
PWC | https://paperswithcode.com/paper/incremental-principal-component-analysis |
Repo | |
Framework | |
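The property described in the abstract above (an on-line update, one sample at a time, that ends with the same result as the batch computation) can be illustrated with a running mean and scatter matrix. This is a minimal sketch under stated assumptions, not the authors' algorithm: it uses a Welford-style covariance update followed by an eigendecomposition, and the sign alignment in `components` is only a hypothetical stand-in for the paper's continuity corrections, which the abstract does not specify. The class and attribute names are illustrative.

```python
import numpy as np


class OnlinePCA:
    """Per-sample PCA statistics without storing the samples."""

    def __init__(self, dim):
        self.n = 0
        self.mean = np.zeros(dim)
        # Running sum of outer products of centred samples.
        self.scatter = np.zeros((dim, dim))
        self.prev_components = None

    def update(self, x):
        """Fold one new sample into the running mean and scatter matrix."""
        x = np.asarray(x, dtype=float)
        self.n += 1
        delta = x - self.mean              # deviation from the old mean
        self.mean += delta / self.n
        self.scatter += np.outer(delta, x - self.mean)

    def components(self):
        """Eigenvalues (descending) and eigenvectors of the sample covariance."""
        cov = self.scatter / max(self.n - 1, 1)
        vals, vecs = np.linalg.eigh(cov)
        order = np.argsort(vals)[::-1]
        vals, vecs = vals[order], vecs[:, order]
        if self.prev_components is not None:
            # Assumed continuity fix: flip each component so it points the
            # same way as at the previous call (eigenvector signs are arbitrary).
            signs = np.sign(np.sum(vecs * self.prev_components, axis=0))
            signs[signs == 0] = 1.0
            vecs = vecs * signs
        self.prev_components = vecs
        return vals, vecs
```

After all samples have been passed to `update`, `scatter / (n - 1)` matches the batch sample covariance up to floating-point error, which is the sense of equivalence the abstract refers to.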
Unbabel’s Participation in the WMT19 Translation Quality Estimation Shared Task
Title | Unbabel’s Participation in the WMT19 Translation Quality Estimation Shared Task |
Authors | Fabio Kepler, Jonay Trénous, Marcos Treviso, Miguel Vera, António Góis, M. Amin Farajian, António V. Lopes, André F. T. Martins |
Abstract | We present the contribution of the Unbabel team to the WMT 2019 Shared Task on Quality Estimation. We participated on the word, sentence, and document-level tracks, encompassing 3 language pairs: English-German, English-Russian, and English-French. Our submissions build upon the recent OpenKiwi framework: we combine linear, neural, and predictor-estimator systems with new transfer learning approaches using BERT and XLM pre-trained models. We compare systems individually and propose new ensemble techniques for word and sentence-level predictions. We also propose a simple technique for converting word labels into document-level predictions. Overall, our submitted systems achieve the best results on all tracks and language pairs by a considerable margin. |
Tasks | Transfer Learning |
Published | 2019-07-24 |
URL | https://arxiv.org/abs/1907.10352v2 |
https://arxiv.org/pdf/1907.10352v2.pdf | |
PWC | https://paperswithcode.com/paper/unbabels-participation-in-the-wmt19 |
Repo | |
Framework | |
TUNA-Net: Task-oriented UNsupervised Adversarial Network for Disease Recognition in Cross-Domain Chest X-rays
Title | TUNA-Net: Task-oriented UNsupervised Adversarial Network for Disease Recognition in Cross-Domain Chest X-rays |
Authors | Yuxing Tang, Youbao Tang, Veit Sandfort, Jing Xiao, Ronald M. Summers |
Abstract | In this work, we exploit the unsupervised domain adaptation problem for radiology image interpretation across domains. Specifically, we study how to adapt the disease recognition model from a labeled source domain to an unlabeled target domain, so as to reduce the effort of labeling each new dataset. To address the shortcoming of cross-domain, unpaired image-to-image translation methods which typically ignore class-specific semantics, we propose a task-driven, discriminatively trained, cycle-consistent generative adversarial network, termed TUNA-Net. It is able to preserve 1) low-level details, 2) high-level semantic information and 3) mid-level feature representation during the image-to-image translation process, to favor the target disease recognition task. The TUNA-Net framework is general and can be readily adapted to other learning tasks. We evaluate the proposed framework on two public chest X-ray datasets for pneumonia recognition. The TUNA-Net model can adapt labeled adult chest X-rays in the source domain such that they appear as if they were drawn from pediatric X-rays in the unlabeled target domain, while preserving the disease semantics. Extensive experiments show the superiority of the proposed method as compared to state-of-the-art unsupervised domain adaptation approaches. Notably, TUNA-Net achieves an AUC of 96.3% for pediatric pneumonia classification, which is very close to that of the supervised approach (98.1%), but without the need for labels on the target domain. |
Tasks | Domain Adaptation, Image-to-Image Translation, Unsupervised Domain Adaptation |
Published | 2019-08-21 |
URL | https://arxiv.org/abs/1908.07926v1 |
https://arxiv.org/pdf/1908.07926v1.pdf | |
PWC | https://paperswithcode.com/paper/190807926 |
Repo | |
Framework | |