Paper Group AWR 456
FasterSeg: Searching for Faster Real-time Semantic Segmentation. BOBBY2: Buffer Based Robust High-Speed Object Tracking. Unsupervised Neural Quantization for Compressed-Domain Similarity Search. Revealing the Importance of Semantic Retrieval for Machine Reading at Scale. Random Pairwise Shapelets Forest. SurfCon: Synonym Discovery on Privacy-Aware Clinical Data …
FasterSeg: Searching for Faster Real-time Semantic Segmentation
Title | FasterSeg: Searching for Faster Real-time Semantic Segmentation |
Authors | Wuyang Chen, Xinyu Gong, Xianming Liu, Qian Zhang, Yuan Li, Zhangyang Wang |
Abstract | We present FasterSeg, an automatically designed semantic segmentation network with not only state-of-the-art performance but also faster speed than current methods. Utilizing neural architecture search (NAS), FasterSeg is discovered from a novel and broader search space integrating multi-resolution branches, which have recently been found to be vital in manually designed segmentation models. To better calibrate the balance between the goals of high accuracy and low latency, we propose a decoupled and fine-grained latency regularization that effectively overcomes the phenomenon we observed, where searched networks are prone to “collapsing” to low-latency yet poor-accuracy models. Moreover, we seamlessly extend FasterSeg to a new collaborative search (co-searching) framework, simultaneously searching for a teacher and a student network in a single run. Teacher-student distillation further boosts the student model’s accuracy. Experiments on popular segmentation benchmarks demonstrate the competency of FasterSeg. For example, FasterSeg can run over 30% faster than the closest manually designed competitor on Cityscapes while maintaining comparable accuracy. |
Tasks | Neural Architecture Search, Real-Time Semantic Segmentation, Semantic Segmentation |
Published | 2019-12-23 |
URL | https://arxiv.org/abs/1912.10917v2 |
PDF | https://arxiv.org/pdf/1912.10917v2.pdf |
PWC | https://paperswithcode.com/paper/fasterseg-searching-for-faster-real-time-1 |
Repo | https://github.com/TAMU-VITA/FasterSeg |
Framework | pytorch |
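To make the latency regularization concrete, here is a minimal PyTorch sketch of a latency-aware search objective, assuming a hypothetical per-operation latency lookup table and softmax-relaxed architecture weights. This illustrates the general NAS pattern the abstract describes, not the paper's exact formulation:

```python
import torch

# Hypothetical latency lookup table (ms per op), pre-measured on the target GPU.
OP_LATENCY = {"conv3x3": 0.40, "conv1x1": 0.15, "skip": 0.01}
OPS = list(OP_LATENCY)

def expected_latency(arch_params):
    """Differentiable expected latency: for each searchable edge, the softmax
    over architecture weights gives a probability per candidate op; the
    expectation is the probability-weighted sum of op latencies."""
    lat_table = torch.tensor([OP_LATENCY[op] for op in OPS])
    total = torch.zeros(())
    for alpha in arch_params:                  # alpha: (num_ops,) weights per edge
        total = total + (torch.softmax(alpha, dim=0) * lat_table).sum()
    return total

def search_loss(task_loss, arch_params, lam=0.01):
    # Penalizing latency alongside accuracy counteracts the observed
    # "collapse" toward trivially fast but inaccurate architectures.
    return task_loss + lam * expected_latency(arch_params)

# Usage: two searchable edges, each choosing among the three candidate ops.
arch = [torch.zeros(3, requires_grad=True) for _ in range(2)]
loss = search_loss(torch.tensor(1.0), arch)
loss.backward()
```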
BOBBY2: Buffer Based Robust High-Speed Object Tracking
Title | BOBBY2: Buffer Based Robust High-Speed Object Tracking |
Authors | Keifer Lee, Jun Jet Tai, Swee King Phang |
Abstract | In this work, a novel high-speed single-object tracker that is robust against non-semantic distractor exemplars is introduced, dubbed BOBBY2. It incorporates a novel exemplar buffer module that sparsely caches the target’s appearance across time, enabling it to adapt to potential target deformation. For training, an augmented ImageNet-VID dataset was used in conjunction with the one-cycle policy, enabling the model to reach convergence with less than 2 epochs’ worth of data. For validation, the model was benchmarked on the GOT-10k dataset and on an additional small, albeit challenging, custom UAV dataset collected with the TU-3 UAV. We demonstrate that the exemplar buffer is capable of providing redundancy in the case of unintended target drift, a desirable trait in any middle- to long-term tracking. Even when the buffer is predominantly filled with distractors instead of valid exemplars, BOBBY2 maintains a near-optimal level of accuracy. BOBBY2 achieves a very competitive result on the GOT-10k dataset and, to a lesser degree, on the challenging custom TU-3 dataset, without fine-tuning, demonstrating its generalizability. In terms of speed, BOBBY2 uses a stripped-down AlexNet feature extractor with 63% fewer parameters than a vanilla AlexNet, and is thus able to run at a competitive 85 FPS. |
Tasks | Object Tracking |
Published | 2019-10-18 |
URL | https://arxiv.org/abs/1910.08263v1 |
PDF | https://arxiv.org/pdf/1910.08263v1.pdf |
PWC | https://paperswithcode.com/paper/bobby2-buffer-based-robust-high-speed-object |
Repo | https://github.com/datacrisis/BOBBY2 |
Framework | pytorch |
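The exemplar buffer can be pictured as a fixed-capacity ring buffer sampled at sparse intervals. A minimal sketch of that idea, with capacity and sampling interval as illustrative choices rather than the paper's values:

```python
from collections import deque
import numpy as np

class ExemplarBuffer:
    """Sparsely cache target crops over time so the tracker can match
    against several past appearances (a sketch of the buffer idea)."""
    def __init__(self, capacity=4, interval=30):
        self.crops = deque(maxlen=capacity)   # oldest exemplar evicted first
        self.interval = interval              # cache every `interval`-th frame

    def maybe_cache(self, frame_idx, crop):
        if frame_idx % self.interval == 0:
            self.crops.append(crop)

    def as_batch(self):
        # Stack cached exemplars for a joint forward pass through the tracker.
        return np.stack(self.crops) if self.crops else None

buf = ExemplarBuffer()
for t in range(120):
    buf.maybe_cache(t, np.zeros((3, 64, 64)))  # dummy exemplar crop
print(buf.as_batch().shape)                    # (4, 3, 64, 64)
```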
Unsupervised Neural Quantization for Compressed-Domain Similarity Search
Title | Unsupervised Neural Quantization for Compressed-Domain Similarity Search |
Authors | Stanislav Morozov, Artem Babenko |
Abstract | We tackle the problem of unsupervised visual descriptor compression, which is a key ingredient of large-scale image retrieval systems. While deep learning machinery has benefited virtually all computer vision pipelines, the existing state-of-the-art compression methods employ shallow architectures; we aim to close this gap with our paper. In more detail, we introduce a DNN architecture for unsupervised compressed-domain retrieval based on multi-codebook quantization. The proposed architecture is designed to support both fast data encoding and efficient distance computation via lookup tables. We demonstrate the exceptional advantage of our scheme over existing quantization approaches on several datasets of visual descriptors, outperforming the previous state-of-the-art by a large margin. |
Tasks | Image Retrieval, Quantization |
Published | 2019-08-11 |
URL | https://arxiv.org/abs/1908.03883v1 |
PDF | https://arxiv.org/pdf/1908.03883v1.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-neural-quantization-for |
Repo | https://github.com/stanis-morozov/unq |
Framework | pytorch |
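The "efficient distance computation via lookup tables" the abstract mentions is easiest to see in its classic product-quantization form; the sketch below illustrates that general trick, not the paper's neural codebooks:

```python
import numpy as np

def build_luts(query, codebooks):
    """One lookup table per codebook: squared distance from the matching
    query sub-vector to every codeword. Built once per query."""
    d = len(query) // len(codebooks)
    return [((query[m*d:(m+1)*d] - cb) ** 2).sum(axis=1)
            for m, cb in enumerate(codebooks)]

def adc_distance(codes, luts):
    """Approximate distance to an encoded point: one table lookup per
    codebook, summed; the database vector is never decompressed."""
    return sum(lut[c] for lut, c in zip(luts, codes))

# Usage with 2 codebooks of 256 codewords over a 128-d descriptor.
rng = np.random.default_rng(0)
codebooks = [rng.normal(size=(256, 64)) for _ in range(2)]
query = rng.normal(size=128)
luts = build_luts(query, codebooks)
print(adc_distance([3, 141], luts))   # distance to the point coded (3, 141)
```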
Revealing the Importance of Semantic Retrieval for Machine Reading at Scale
Title | Revealing the Importance of Semantic Retrieval for Machine Reading at Scale |
Authors | Yixin Nie, Songhe Wang, Mohit Bansal |
Abstract | Machine Reading at Scale (MRS) is a challenging task in which a system is given an input query and is asked to produce a precise output by “reading” information from a large knowledge base. The task has gained popularity with its natural combination of information retrieval (IR) and machine comprehension (MC). Advances in representation learning have led to separate progress in both IR and MC; however, very few studies have examined the relationship and combined design of retrieval and comprehension at different levels of granularity for the development of MRS systems. In this work, we give general guidelines on system design for MRS by proposing a simple yet effective pipeline system with special consideration of hierarchical semantic retrieval at both the paragraph and sentence levels, and their potential effects on the downstream task. The system is evaluated on both fact verification and open-domain multi-hop QA, achieving state-of-the-art results on the leaderboard test sets of both FEVER and HotpotQA. To further demonstrate the importance of semantic retrieval, we present ablation and analysis studies to quantify the contribution of neural retrieval modules at both the paragraph and sentence levels, and illustrate that intermediate semantic retrieval modules are vital not only for effectively filtering upstream information and thus saving downstream computation, but also for shaping the upstream data distribution and providing better data for downstream modeling. Code/data made publicly available at: https://github.com/easonnie/semanticRetrievalMRS |
Tasks | Information Retrieval, Reading Comprehension, Representation Learning |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.08041v1 |
PDF | https://arxiv.org/pdf/1909.08041v1.pdf |
PWC | https://paperswithcode.com/paper/revealing-the-importance-of-semantic |
Repo | https://github.com/easonnie/semanticRetrievalMRS |
Framework | pytorch |
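The pipeline's key structural idea, paragraph-level retrieval followed by sentence-level retrieval, fits in a few lines; the scorers below are toy stand-ins for the paper's neural modules:

```python
def hierarchical_retrieve(query, paragraphs, para_scorer, sent_scorer,
                          k_para=5, k_sent=5):
    """Two-stage semantic retrieval (a sketch): paragraph-level filtering
    bounds the sentence-level workload, and only surviving sentences
    reach the downstream reader."""
    top_paras = sorted(paragraphs, key=lambda p: para_scorer(query, p),
                       reverse=True)[:k_para]
    candidates = [s for p in top_paras for s in p.split(". ")]
    return sorted(candidates, key=lambda s: sent_scorer(query, s),
                  reverse=True)[:k_sent]

# Toy scorer: token overlap with the query.
overlap = lambda q, t: len(set(q.lower().split()) & set(t.lower().split()))
evidence = hierarchical_retrieve("who wrote Hamlet",
                                 ["Shakespeare wrote Hamlet. It is a tragedy.",
                                  "Pizza was invented in Naples."],
                                 overlap, overlap, k_para=1, k_sent=1)
print(evidence)   # ['Shakespeare wrote Hamlet']
```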
Random Pairwise Shapelets Forest
Title | Random Pairwise Shapelets Forest |
Authors | Mohan Shi, Zhihai Wang, Jidong Yuan, Haiyang Liu |
Abstract | A shapelet is a discriminative subsequence of a time series. An advanced shapelet-based method embeds shapelets into an accurate and fast random forest. However, it has several limitations. First, a random shapelet forest requires a large training cost for split-threshold searching. Second, a single shapelet provides limited information for only one branch of the decision tree, resulting in insufficient accuracy and interpretability. Third, randomized ensembling causes interpretability to decline. To address this, this paper presents Random Pairwise Shapelets Forest (RPSF). RPSF combines a pair of shapelets from different classes to construct a random forest. It omits threshold searching, making it more efficient, and includes more information at each node of the forest, making it more effective. Moreover, a discriminability metric, Decomposed Mean Decrease Impurity (DMDI), is proposed to identify influential regions for every class. Extensive experiments show that RPSF improves the accuracy and training speed of shapelet-based forests. Case studies demonstrate the interpretability of our method. |
Tasks | Time Series |
Published | 2019-03-19 |
URL | http://arxiv.org/abs/1903.07799v2 |
PDF | http://arxiv.org/pdf/1903.07799v2.pdf |
PWC | https://paperswithcode.com/paper/random-pairwise-shapelets-forest |
Repo | https://github.com/nephashi/RandomPairwiseShapeletsForest |
Framework | none |
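A pairwise-shapelet node test can be sketched directly: route a series by which of two class-specific shapelets lies closer, so no split threshold ever needs to be searched. A minimal sketch:

```python
import numpy as np

def shapelet_dist(series, shapelet):
    """Minimum Euclidean distance between the shapelet and any window of
    the series of the same length."""
    L = len(shapelet)
    return min(np.linalg.norm(series[i:i+L] - shapelet)
               for i in range(len(series) - L + 1))

def pairwise_split(series, shp_a, shp_b):
    """Node test of a pairwise-shapelet tree (a sketch): the comparison
    between the two shapelets replaces the per-shapelet threshold search."""
    return shapelet_dist(series, shp_a) <= shapelet_dist(series, shp_b)

ts = np.array([0., 0., 1., 2., 1., 0., 0.])
print(pairwise_split(ts, np.array([1., 2., 1.]), np.array([0., 0., 0.])))  # True
```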
SurfCon: Synonym Discovery on Privacy-Aware Clinical Data
Title | SurfCon: Synonym Discovery on Privacy-Aware Clinical Data |
Authors | Zhen Wang, Xiang Yue, Soheil Moosavinasab, Yungui Huang, Simon Lin, Huan Sun |
Abstract | Unstructured clinical texts contain rich health-related information. To better utilize the knowledge buried in clinical texts, discovering synonyms for a medical query term has become an important task. Recently, automatic synonym discovery methods leveraging raw text information have been developed. However, to preserve patient privacy and security, it is usually quite difficult to get access to large-scale raw clinical texts. In this paper, we study a new setting: synonym discovery on privacy-aware clinical data (i.e., medical terms extracted from clinical texts and their aggregated co-occurrence counts, without the raw clinical texts). To solve the problem, we propose a new framework, SurfCon, that leverages two important types of information in privacy-aware clinical data: the surface form information and the global context information. In particular, the surface form module enables us to detect synonyms that look similar, while the global context module plays a complementary role, discovering synonyms that are semantically similar but have different surface forms; both allow us to deal with the OOV query issue (i.e., when the query is not found in the given data). We conduct extensive experiments and case studies on publicly available privacy-aware clinical data and show that SurfCon outperforms strong baseline methods by large margins under various settings. |
Tasks | |
Published | 2019-06-21 |
URL | https://arxiv.org/abs/1906.09285v1 |
PDF | https://arxiv.org/pdf/1906.09285v1.pdf |
PWC | https://paperswithcode.com/paper/surfcon-synonym-discovery-on-privacy-aware |
Repo | https://github.com/yzabc007/SurfCon |
Framework | pytorch |
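A rough sketch of how the two modules could combine, using character n-gram Jaccard as a toy stand-in for the learned surface-form module and leaving the context score abstract; the blend weight is an illustrative assumption:

```python
def char_ngrams(term, n=3):
    return {term[i:i+n] for i in range(len(term) - n + 1)}

def surface_sim(a, b, n=3):
    """Character n-gram Jaccard similarity: a simple stand-in for the
    paper's learned surface-form module."""
    A, B = char_ngrams(a.lower(), n), char_ngrams(b.lower(), n)
    return len(A & B) / len(A | B) if A | B else 0.0

def surfcon_score(query, cand, context_sim, alpha=0.5):
    """Blend surface-form and global-context evidence (a sketch). For an
    OOV query the surface term still yields a usable score."""
    return alpha * surface_sim(query, cand) + (1 - alpha) * context_sim(query, cand)

print(surface_sim("hypertension", "htn hypertensive"))  # shared n-grams score > 0
```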
Representation Learning with Statistical Independence to Mitigate Bias
Title | Representation Learning with Statistical Independence to Mitigate Bias |
Authors | Ehsan Adeli, Qingyu Zhao, Adolf Pfefferbaum, Edith V. Sullivan, Li Fei-Fei, Juan Carlos Niebles, Kilian M. Pohl |
Abstract | The presence of bias (in datasets or tasks) is inarguably one of the most critical challenges in machine learning applications and has led to pivotal debates in recent years. Such challenges range from spurious associations between variables in medical studies to racial bias in gender or face recognition systems. Controlling for all types of biases in the dataset curation stage is cumbersome and sometimes impossible. The alternative is to use the available data and build models incorporating fair representation learning. In this paper, we propose such a model based on adversarial training with two competing objectives: to learn features that have (1) maximum discriminative power with respect to the task and (2) minimal statistical mean dependence on the protected (bias) variable(s). Our approach does so by incorporating a new adversarial loss function that encourages a vanishing correlation between the bias and the learned features. We apply our method to synthetic data, medical images (containing task bias), and a dataset for gender classification (containing dataset bias). Our results show that the features learned by our method not only yield superior prediction performance but are also unbiased. The code is available at https://github.com/QingyuZhao/BR-Net/. |
Tasks | Face Recognition, Representation Learning |
Published | 2019-10-08 |
URL | https://arxiv.org/abs/1910.03676v2 |
PDF | https://arxiv.org/pdf/1910.03676v2.pdf |
PWC | https://paperswithcode.com/paper/bias-resilient-neural-network |
Repo | https://github.com/QingyuZhao/BR-Net |
Framework | none |
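The vanishing-correlation objective can be sketched as a squared Pearson correlation penalty between the features and the protected variable; this is a minimal sketch of the statistical idea, not the full adversarial training loop:

```python
import torch

def squared_correlation(feat_proj, bias_var, eps=1e-8):
    """Squared Pearson correlation between a scalar projection of the
    learned features and the protected variable. In the adversarial setup,
    one player drives this toward zero so the features carry no linear
    (mean) dependence on the bias variable."""
    f = feat_proj - feat_proj.mean()
    b = bias_var - bias_var.mean()
    r = (f * b).sum() / (f.norm() * b.norm() + eps)
    return r ** 2

feat = torch.randn(32, requires_grad=True)
bias = torch.randn(32)
loss = squared_correlation(feat, bias)
loss.backward()   # gradients push the features toward zero correlation
```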
AmazonQA: A Review-Based Question Answering Task
Title | AmazonQA: A Review-Based Question Answering Task |
Authors | Mansi Gupta, Nitish Kulkarni, Raghuveer Chanda, Anirudha Rayasam, Zachary C Lipton |
Abstract | Every day, thousands of customers post questions on Amazon product pages. After some time, if they are fortunate, a knowledgeable customer might answer their question. Observing that many questions can be answered based upon the available product reviews, we propose the task of review-based QA. Given a corpus of reviews and a question, the QA system synthesizes an answer. To this end, we introduce a new dataset and propose a method that combines information retrieval techniques for selecting relevant reviews (given a question) and “reading comprehension” models for synthesizing an answer (given a question and review). Our dataset consists of 923k questions, 3.6M answers and 14M reviews across 156k products. Building on the well-known Amazon dataset, we collect additional annotations, marking each question as either answerable or unanswerable based on the available reviews. A deployed system could first classify a question as answerable and then attempt to generate an answer. Notably, unlike many popular QA datasets, here, the questions, passages, and answers are all extracted from real human interactions. We evaluate numerous models for answer generation and propose strong baselines, demonstrating the challenging nature of this new task. |
Tasks | Information Retrieval, Question Answering, Reading Comprehension |
Published | 2019-08-12 |
URL | https://arxiv.org/abs/1908.04364v2 |
PDF | https://arxiv.org/pdf/1908.04364v2.pdf |
PWC | https://paperswithcode.com/paper/amazonqa-a-review-based-question-answering |
Repo | https://github.com/amazonqa/amazonqa |
Framework | tf |
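The deployment flow the abstract suggests (retrieve, gate on answerability, then read) fits in a few lines; all three callables below are stand-ins for trained models:

```python
def review_qa(question, reviews, retriever, answerable, reader, k=5):
    """Review-based QA pipeline (a sketch): IR selects the top-k relevant
    reviews, an answerability classifier gates the question, and only
    then does a reading-comprehension model synthesize an answer."""
    top = sorted(reviews, key=lambda r: retriever(question, r), reverse=True)[:k]
    if not answerable(question, top):
        return None          # question cannot be answered from the reviews
    return reader(question, top)
```

In the dataset's terms, `answerable` corresponds to the answerable/unanswerable annotation the authors collect per question.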
U-Net Fixed-Point Quantization for Medical Image Segmentation
Title | U-Net Fixed-Point Quantization for Medical Image Segmentation |
Authors | MohammadHossein AskariHemmat, Sina Honari, Lucas Rouhier, Christian S. Perone, Julien Cohen-Adad, Yvon Savaria, Jean-Pierre David |
Abstract | Model quantization is leveraged to reduce the memory consumption and computation time of deep neural networks. This is achieved by representing weights and activations at a lower bit resolution than their high-precision floating-point counterparts. The suitable level of quantization is directly related to model performance. Lowering the quantization precision (e.g., to 2 bits) reduces the amount of memory required to store model parameters and the amount of logic required to implement computational blocks, which contributes to reducing the power consumption of the entire system. These benefits typically come at the cost of reduced accuracy. The main challenge is to quantize a network as much as possible while maintaining its accuracy. In this work, we present a quantization method for the U-Net architecture, a popular model for medical image segmentation. We then apply our quantization algorithm to three datasets: (1) the Spinal Cord Gray Matter segmentation dataset (GM), (2) the ISBI challenge for segmentation of neuronal structures in electron microscopy (EM), and (3) the public National Institutes of Health (NIH) dataset for pancreas segmentation in abdominal CT scans. The reported results demonstrate that with only 4 bits for weights and 6 bits for activations, we obtain an 8-fold reduction in memory requirements while losing only 2.21%, 0.57%, and 2.09% Dice overlap score for the EM, GM, and NIH datasets respectively. Our fixed-point quantization provides a flexible trade-off between accuracy and memory requirements which is not provided by previous quantization methods for U-Net such as TernaryNet. |
Tasks | Medical Image Segmentation, Pancreas Segmentation, Quantization, Semantic Segmentation |
Published | 2019-08-02 |
URL | https://arxiv.org/abs/1908.01073v2 |
PDF | https://arxiv.org/pdf/1908.01073v2.pdf |
PWC | https://paperswithcode.com/paper/u-net-fixed-point-quantization-for-medical |
Repo | https://github.com/paraficial/vae_pancreas_segmentation |
Framework | pytorch |
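Uniform fixed-point quantization itself is compact; the sketch below splits a 4-bit weight as 1 integer + 3 fractional bits, an illustrative split rather than necessarily the paper's allocation. In training, such a rounding step is typically paired with a straight-through gradient estimator:

```python
import torch

def fixed_point(x, int_bits, frac_bits):
    """Uniform fixed-point quantization (a sketch): round to the nearest
    multiple of 2^-frac_bits, then clamp to the representable range."""
    step = 2.0 ** -frac_bits
    bound = 2.0 ** int_bits - step
    return torch.clamp(torch.round(x / step) * step, -bound, bound)

w = torch.tensor([0.37, -1.62, 0.05])
print(fixed_point(w, int_bits=1, frac_bits=3))   # tensor([ 0.3750, -1.6250,  0.0000])
```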
Towards Large yet Imperceptible Adversarial Image Perturbations with Perceptual Color Distance
Title | Towards Large yet Imperceptible Adversarial Image Perturbations with Perceptual Color Distance |
Authors | Zhengyu Zhao, Zhuoran Liu, Martha Larson |
Abstract | The success of image perturbations designed to fool image classifiers is assessed in terms of both adversarial effect and visual imperceptibility. The conventional assumption on imperceptibility is that perturbations should strive for tight $L_p$-norm bounds in RGB space. In this work, we drop this assumption and pursue an approach that exploits human color perception, specifically, minimizing perturbation size with respect to perceptual color distance. Our first approach, Perceptual Color distance C&W (PerC-C&W), extends the widely used C&W approach and produces larger RGB perturbations. PerC-C&W is able to maintain adversarial strength while contributing to imperceptibility. Our second approach, Perceptual Color distance Alternating Loss (PerC-AL), achieves the same outcome more efficiently by alternating between the classification loss and the perceptual color difference when updating perturbations. Experimental evaluation shows that the PerC approaches outperform conventional $L_p$ approaches in robustness and transferability, and also demonstrates that the PerC distance can provide added value on top of existing structure-based methods for creating image perturbations. |
Tasks | Image Classification |
Published | 2019-11-06 |
URL | https://arxiv.org/abs/1911.02466v2 |
PDF | https://arxiv.org/pdf/1911.02466v2.pdf |
PWC | https://paperswithcode.com/paper/towards-large-yet-imperceptible-adversarial |
Repo | https://github.com/ZhengyuZhao/PerC-Adversarial |
Framework | pytorch |
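The alternating update behind PerC-AL can be sketched as: ascend the classification loss until the image fools the model, then descend the perceptual color distance. A minimal sketch, assuming a differentiable `color_dist` (e.g., a CIEDE2000 implementation):

```python
import torch
import torch.nn.functional as F

def perc_al_step(x_adv, x_orig, label, model, color_dist, lr=0.01):
    """One alternating update (a sketch of the PerC-AL idea)."""
    x_adv = x_adv.clone().detach().requires_grad_(True)
    logits = model(x_adv)
    if logits.argmax(dim=1).eq(label).all():      # still correctly classified
        loss = -F.cross_entropy(logits, label)    # push toward misclassification
    else:
        loss = color_dist(x_adv, x_orig)          # shrink perceptual distance
    loss.backward()
    with torch.no_grad():
        return (x_adv - lr * x_adv.grad).clamp(0, 1).detach()
```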
Light Multi-segment Activation for Model Compression
Title | Light Multi-segment Activation for Model Compression |
Authors | Zhenhui Xu, Guolin Ke, Jia Zhang, Jiang Bian, Tie-Yan Liu |
Abstract | Model compression has become necessary when applying neural networks (NNs) to many real application tasks that can accept slightly reduced model accuracy but place strict limits on model complexity. Recently, Knowledge Distillation, which distills the knowledge of a well-trained and highly complex teacher model into a compact student model, has been widely used for model compression. However, under strict resource-cost requirements, it is quite challenging to achieve performance comparable to the teacher model, essentially due to the drastically reduced expressive power of the compact student model. Inspired by the nature of expressive power in neural networks, we propose using a multi-segment activation, which can significantly improve expressive power at very little cost, in the compact student model. Specifically, we propose a highly efficient multi-segment activation, called Light Multi-segment Activation (LMA), which can rapidly produce multiple linear regions with very few parameters by leveraging statistical information. Using LMA, the compact student model achieves much better performance, both effectively and efficiently, than a ReLU-equipped one of the same model scale. Furthermore, the proposed method is compatible with other model compression techniques, such as quantization, meaning they can be used jointly for better compression performance. Experiments on state-of-the-art NN architectures over real-world tasks demonstrate the effectiveness and extensibility of LMA. |
Tasks | Model Compression, Quantization |
Published | 2019-07-16 |
URL | https://arxiv.org/abs/1907.06870v2 |
PDF | https://arxiv.org/pdf/1907.06870v2.pdf |
PWC | https://paperswithcode.com/paper/light-multi-segment-activation-for-model |
Repo | https://github.com/LMA-NeurIPS19/LMA |
Framework | pytorch |
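A multi-segment activation is a piecewise-linear function with a handful of learned slopes. The sketch below conveys the idea with fixed segment boundaries; the paper's statistics-driven parameterization differs:

```python
import torch
import torch.nn as nn

class MultiSegmentActivation(nn.Module):
    """Piecewise-linear activation with S segments (a sketch of the
    multi-segment idea, not the paper's exact parameterization): a few
    learned slopes recover many linear regions at negligible cost."""
    def __init__(self, segments=4, lo=-2.0, hi=2.0):
        super().__init__()
        self.slopes = nn.Parameter(torch.ones(segments))
        self.bias = nn.Parameter(torch.zeros(1))
        self.register_buffer("bounds", torch.linspace(lo, hi, segments - 1))

    def forward(self, x):
        seg = torch.bucketize(x, self.bounds)     # segment index per element
        return self.slopes[seg] * x + self.bias

act = MultiSegmentActivation()
print(act(torch.tensor([-3.0, -0.5, 0.5, 3.0])))  # one slope per segment
```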
Learning-Driven Exploration for Reinforcement Learning
Title | Learning-Driven Exploration for Reinforcement Learning |
Authors | Muhammad Usama, Dong Eui Chang |
Abstract | Deep reinforcement learning algorithms have been shown to learn complex skills using only high-dimensional observations and scalar reward. Effective and intelligent exploration remains an unresolved problem for reinforcement learning. Most contemporary reinforcement learning relies on simple heuristic strategies such as $\epsilon$-greedy exploration or adding Gaussian noise to actions. These heuristics, however, are unable to intelligently distinguish the well-explored regions of the state space from the unexplored ones, which can lead to inefficient use of training time. We introduce entropy-based exploration (EBE), which enables an agent to explore the unexplored regions of the state space efficiently. EBE quantifies the agent’s learning in a state using merely state-dependent action values and adaptively explores the state space, i.e., with more exploration in unexplored regions. We perform experiments on many environments, including a simple linear environment, a simplified version of the Breakout game, and multiple first-person shooter (FPS) games on the VizDoom platform. We demonstrate that EBE enables efficient exploration that ultimately results in faster learning without hyperparameter tuning. |
Tasks | Efficient Exploration, FPS Games |
Published | 2019-06-17 |
URL | https://arxiv.org/abs/1906.06890v1 |
PDF | https://arxiv.org/pdf/1906.06890v1.pdf |
PWC | https://paperswithcode.com/paper/learning-driven-exploration-for-reinforcement |
Repo | https://github.com/Usama1002/EBE-Exploration |
Framework | tf |
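The core EBE idea, using the entropy of the softmax over action values to decide how much to explore, can be sketched as follows; the explore/exploit gating is an illustrative reading of the abstract:

```python
import numpy as np

def ebe_action(q_values, temperature=1.0, rng=None):
    """Entropy-based exploration (a sketch of the idea): the normalized
    entropy of the softmax over state-action values measures how little
    the agent has learned in this state. Near-uniform Q-values give high
    entropy, so explore; a sharp maximum gives low entropy, so exploit."""
    rng = rng or np.random.default_rng()
    z = np.exp((q_values - q_values.max()) / temperature)
    p = z / z.sum()
    entropy = -(p * np.log(p + 1e-12)).sum() / np.log(len(p))  # in [0, 1]
    if rng.random() < entropy:
        return int(rng.choice(len(p), p=p))       # explore, weighted by Q
    return int(np.argmax(q_values))               # exploit

print(ebe_action(np.array([0.1, 0.1, 0.1])))      # high entropy: mostly explores
```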
Cross-Lingual Alignment of Contextual Word Embeddings, with Applications to Zero-shot Dependency Parsing
Title | Cross-Lingual Alignment of Contextual Word Embeddings, with Applications to Zero-shot Dependency Parsing |
Authors | Tal Schuster, Ori Ram, Regina Barzilay, Amir Globerson |
Abstract | We introduce a novel method for multilingual transfer that utilizes deep contextual embeddings, pretrained in an unsupervised fashion. While contextual embeddings have been shown to yield richer representations of meaning compared to their static counterparts, aligning them poses a challenge due to their dynamic nature. To this end, we construct context-independent variants of the original monolingual spaces and utilize their mapping to derive an alignment for the context-dependent spaces. This mapping readily supports processing of a target language, improving transfer by context-aware embeddings. Our experimental results demonstrate the effectiveness of this approach for zero-shot and few-shot learning of dependency parsing. Specifically, our method consistently outperforms the previous state-of-the-art on 6 tested languages, yielding an improvement of 6.8 LAS points on average. |
Tasks | Dependency Parsing, Few-Shot Learning, Word Embeddings |
Published | 2019-02-25 |
URL | http://arxiv.org/abs/1902.09492v2 |
PDF | http://arxiv.org/pdf/1902.09492v2.pdf |
PWC | https://paperswithcode.com/paper/cross-lingual-alignment-of-contextual-word |
Repo | https://github.com/TalSchuster/CrossLingualELMo |
Framework | pytorch |
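Aligning the anchor spaces is a standard orthogonal Procrustes problem once each anchor is the per-word-type average of the contextual embeddings, as the abstract describes. A minimal sketch:

```python
import numpy as np

def procrustes_align(src_anchors, tgt_anchors):
    """Orthogonal Procrustes alignment of anchor spaces (the standard
    closed-form solution). The learned map is then applied unchanged to
    context-dependent vectors."""
    m = src_anchors.T @ tgt_anchors           # (d, d) cross-covariance
    u, _, vt = np.linalg.svd(m)
    return u @ vt                             # W such that src @ W aligns to tgt

rng = np.random.default_rng(0)
tgt = rng.normal(size=(100, 32))              # 100 anchor words, 32-d
rot, _ = np.linalg.qr(rng.normal(size=(32, 32)))
src = tgt @ rot.T                             # source space = rotated target
W = procrustes_align(src, tgt)
print(np.allclose(src @ W, tgt))              # True: the rotation is recovered
```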
Variational Representation Learning for Vehicle Re-Identification
Title | Variational Representation Learning for Vehicle Re-Identification |
Authors | Saghir Ahmed Saghir Alfasly, Yongjian Hu, Tiancai Liang, Xiaofeng Jin, Qingli Zhao, Beibei Liu |
Abstract | Vehicle re-identification has attracted increasing attention in recent years. One of the most challenging problems is learning an efficient representation of a vehicle from its multi-viewpoint images. Existing methods tend to derive features with dimensions ranging from thousands to tens of thousands. In this work we propose a deep-learning-based framework that leads to an efficient representation of vehicles. While the dimension of the learned features can be as low as 256, experiments on different datasets show that the Top-1 and Top-5 retrieval accuracies exceed those of multiple state-of-the-art methods. The key to our framework is two-fold. First, variational feature learning is employed to generate variational features, which are more discriminative. Second, a long short-term memory (LSTM) network is used to learn the relationship among different viewpoints of a vehicle. The LSTM also serves as an encoder that downsizes the features. |
Tasks | Representation Learning, Vehicle Re-Identification |
Published | 2019-05-07 |
URL | https://arxiv.org/abs/1905.02343v1 |
PDF | https://arxiv.org/pdf/1905.02343v1.pdf |
PWC | https://paperswithcode.com/paper/variational-representation-learning-for |
Repo | https://github.com/saghiralfasly/VFL-Vehicle-Re-Id |
Framework | tf |
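A sketch of the two ingredients: an LSTM that both relates viewpoints and downsizes features, plus a variational head emitting the 256-d embedding. Layer sizes other than the 256-d output are illustrative guesses:

```python
import torch
import torch.nn as nn

class ViewpointEncoder(nn.Module):
    """Per-view CNN features -> LSTM over viewpoints -> compact 256-d
    variational embedding (mu, logvar), a sketch of the abstract's design."""
    def __init__(self, feat_dim=2048, hidden=512, z_dim=256):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.mu = nn.Linear(hidden, z_dim)
        self.logvar = nn.Linear(hidden, z_dim)

    def forward(self, view_feats):                    # (B, num_views, feat_dim)
        _, (h, _) = self.lstm(view_feats)
        h = h[-1]                                     # final hidden state
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterize
        return z, mu, logvar

z, _, _ = ViewpointEncoder()(torch.randn(2, 4, 2048))  # 2 vehicles, 4 views each
print(z.shape)                                          # torch.Size([2, 256])
```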
SensitiveNets: Learning Agnostic Representations with Application to Face Recognition
Title | SensitiveNets: Learning Agnostic Representations with Application to Face Recognition |
Authors | Aythami Morales, Julian Fierrez, Ruben Vera-Rodriguez |
Abstract | This work proposes a new neural network feature representation that helps leave sensitive information out of the decision-making process of pattern recognition and machine learning algorithms. The aim is to develop a learning method capable of removing certain information from the feature space without a drop in performance on a recognition task based on that feature space. Our work is in part motivated by the new international regulations for personal data protection, which force data controllers to avoid discriminative hazards while managing the sensitive data of users. Our method is based on a generalization of triplet loss learning that introduces a sensitive-information removal process. The method is evaluated on face recognition technologies using state-of-the-art algorithms and publicly available benchmarks. In addition, we present a new annotated dataset with a balanced distribution across genders and ethnic origins. The dataset includes more than 120K images from 24K identities with a variety of poses, image qualities, facial expressions, and illumination conditions. The experiments demonstrate that it is possible to reduce sensitive information such as gender or ethnicity in the feature representation while retaining competitive performance on a face recognition task. |
Tasks | Face Recognition, Representation Learning |
Published | 2019-02-01 |
URL | http://arxiv.org/abs/1902.00334v1 |
PDF | http://arxiv.org/pdf/1902.00334v1.pdf |
PWC | https://paperswithcode.com/paper/sensitivenets-learning-agnostic |
Repo | https://github.com/BiDAlab/DiveFace |
Framework | none |
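One plausible reading of the objective is a triplet loss plus a term rewarding maximum uncertainty of a sensitive-attribute probe; the probe and the weighting below are illustrative assumptions, not the paper's exact loss:

```python
import torch
import torch.nn.functional as F

def sensitive_triplet_loss(anchor, pos, neg, probe_logits, margin=0.2, lam=1.0):
    """Sketch: standard triplet loss for the recognition task, plus a
    removal term that pushes a sensitive-attribute probe toward chance."""
    trip = F.triplet_margin_loss(anchor, pos, neg, margin=margin)
    p = torch.sigmoid(probe_logits)                    # probe's sensitive-attribute belief
    entropy = -(p * torch.log(p + 1e-8)
                + (1 - p) * torch.log(1 - p + 1e-8)).mean()
    return trip - lam * entropy                        # high probe entropy = info removed

a, p_, n = (torch.randn(8, 256, requires_grad=True) for _ in range(3))
probe = torch.randn(8)              # would come from a detector on the embedding
sensitive_triplet_loss(a, p_, n, probe).backward()
```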