January 27, 2020


Paper Group ANR 1303


Blockwise Based Detection of Local Defects


Title	Blockwise Based Detection of Local Defects
Authors	Xiaoyu Xiang, Renee Jessome, Eric Maggard, Yousun Bang, Minki Cho, Jan Allebach
Abstract	Print quality is an important criterion for a printer’s performance. The detection, classification, and assessment of printing defects can reflect the printer’s working status and help to locate mechanical problems inside. To handle all these questions, an efficient algorithm is needed to replace the traditional visual checking method. In this paper, we focus on pages with local defects including gray spots and solid spots. We propose a coarse-to-fine method to detect local defects in a block-wise manner, and aggregate the blockwise attributes to generate the feature vector of the whole test page for a further ranking task. In the detection part, we first select candidate regions by thresholding a single feature. Then more detailed features of candidate blocks are calculated and sent to a decision tree that was previously trained on our training dataset. The final result is given by the decision tree model to control the false alarm rate while maintaining the required miss rate. Our algorithm is shown to be effective in detecting and classifying local defects compared with previous methods.
Tasks
Published	2019-06-06
URL	https://arxiv.org/abs/1906.02374v1
PDF	https://arxiv.org/pdf/1906.02374v1.pdf
PWC	https://paperswithcode.com/paper/blockwise-based-detection-of-local-defects
Repo
Framework
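
The coarse-to-fine detection described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the block size, the choice of "deviation of the block mean from the page median" as the single coarse feature, the fine features, and the hand-written `classify` rule (a stand-in for the trained decision tree) are all hypothetical.

```python
from statistics import mean, median, pstdev

def split_into_blocks(page, block):
    """Yield (row, col, pixels) for each non-overlapping block x block tile."""
    rows, cols = len(page), len(page[0])
    for r in range(0, rows - block + 1, block):
        for c in range(0, cols - block + 1, block):
            pixels = [page[r + i][c + j] for i in range(block) for j in range(block)]
            yield r, c, pixels

def detect_candidates(page, block=2, threshold=20.0):
    """Coarse stage: keep blocks whose mean differs from the page median."""
    reference = median(v for row in page for v in row)
    return [(r, c, pixels)
            for r, c, pixels in split_into_blocks(page, block)
            if abs(mean(pixels) - reference) > threshold]

def describe(pixels):
    """Fine stage: compute more detailed features for a candidate block."""
    return {"mean": mean(pixels), "std": pstdev(pixels), "min": min(pixels)}

def classify(features, solid_max_mean=60.0):
    """Hypothetical stand-in for the trained decision tree."""
    return "solid spot" if features["mean"] < solid_max_mean else "gray spot"

# A 4x4 grayscale page with one dark 2x2 defect in the lower-right corner.
page = [
    [200, 200, 200, 200],
    [200, 200, 200, 200],
    [200, 200,  30,  40],
    [200, 200,  35,  25],
]
results = [(r, c, classify(describe(p))) for r, c, p in detect_candidates(page)]
print(results)
```

Only the blocks that pass the cheap threshold reach the more expensive feature computation, which is the point of the coarse-to-fine split.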

Continuous Adaptation for Interactive Object Segmentation by Learning from Corrections


Title	Continuous Adaptation for Interactive Object Segmentation by Learning from Corrections
Authors	Theodora Kontogianni, Michael Gygli, Jasper Uijlings, Vittorio Ferrari
Abstract	In interactive object segmentation, a user collaborates with a computer vision model to segment an object. Recent works rely on convolutional neural networks to predict the segmentation, taking the image and the corrections made by the user as input. By training on large datasets they offer strong performance, but they keep model parameters fixed at test time. Instead, we treat user corrections as training examples to update our model on-the-fly to the data at hand. This enables it to successfully adapt to the appearance of a particular test image, to distribution shifts in the whole test set, and even to large domain changes, where the imaging modality changes between training and testing. We extensively evaluate our method on 8 diverse datasets and improve over a fixed model on all of them. Our method shows the most dramatic improvements when training and testing domains differ, where it produces segmentation masks of the desired quality from 60-70% less user input. Furthermore, we achieve state-of-the-art results on four standard interactive segmentation datasets: PASCAL VOC12, GrabCut, DAVIS16 and Berkeley.
Tasks	Interactive Segmentation, Semantic Segmentation
Published	2019-11-28
URL	https://arxiv.org/abs/1911.12709v1
PDF	https://arxiv.org/pdf/1911.12709v1.pdf
PWC	https://paperswithcode.com/paper/continuous-adaptation-for-interactive-object
Repo
Framework

Deep Optimization model for Screen Content Image Quality Assessment using Neural Networks


Title	Deep Optimization model for Screen Content Image Quality Assessment using Neural Networks
Authors	Xuhao Jiang, Liquan Shen, Guorui Feng, Liangwei Yu, Ping An
Abstract	In this paper, we propose a novel quadratic optimized model based on the deep convolutional neural network (QODCNN) for full-reference and no-reference screen content image (SCI) quality assessment. Unlike traditional CNN methods that take all image patches as training data and use average quality pooling, our model is optimized in three steps to obtain a more effective model. In the first step, an end-to-end deep CNN is trained to preliminarily predict the image visual quality, and batch normalization (BN) layers and l2 regularization are employed to improve the speed and performance of network fitting. In the second step, the pretrained model is fine-tuned to achieve better performance based on analysis of the raw training data. In the third step, an adaptive weighting method is proposed to fuse local quality, inspired by the perceptual property of the human visual system (HVS) that the HVS is sensitive to image patches containing texture and edge information. The novelty of our algorithm can be summarized as follows: 1) considering the correlation between local quality and the subjective differential mean opinion score (DMOS), the Euclidean distance is utilized to measure the effectiveness of image patches, and the pretrained model is fine-tuned with more effective training data; 2) an adaptive pooling approach is employed to fuse the patch quality of textual and pictorial regions, whose features, extracted only from distorted images, are strongly robust to noise and effective for both FR and NR IQA; 3) considering the characteristics of SCIs, a deep and valid network architecture is designed for both NR and FR visual quality evaluation of SCIs. Experimental results verify that our model outperforms both current no-reference and full-reference image quality assessment methods on the benchmark screen content image quality assessment database (SIQAD).
Tasks	Image Quality Assessment, L2 Regularization
Published	2019-03-02
URL	http://arxiv.org/abs/1903.00705v1
PDF	http://arxiv.org/pdf/1903.00705v1.pdf
PWC	https://paperswithcode.com/paper/deep-optimization-model-for-screen-content
Repo
Framework
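
The third step of the abstract above, adaptive weighting instead of plain averaging of patch scores, can be illustrated with a small sketch. The weighting used here (sum of absolute pixel differences inside each patch, as a crude proxy for texture and edge content) is an assumption chosen for illustration; the abstract does not specify the paper's actual weighting function.

```python
def gradient_energy(patch):
    """Sum of absolute horizontal and vertical pixel differences in a patch."""
    horizontal = sum(abs(row[j + 1] - row[j])
                     for row in patch for j in range(len(row) - 1))
    vertical = sum(abs(patch[i + 1][j] - patch[i][j])
                   for i in range(len(patch) - 1) for j in range(len(patch[0])))
    return horizontal + vertical

def pooled_quality(patches, patch_scores, eps=1e-6):
    """Weighted mean of patch scores; edge-rich patches get larger weights."""
    weights = [gradient_energy(p) + eps for p in patches]
    return sum(w * s for w, s in zip(weights, patch_scores)) / sum(weights)

flat = [[128, 128], [128, 128]]   # no edges, contributes almost nothing
edge = [[0, 255], [0, 255]]       # strong vertical edge, dominates the pool
score = pooled_quality([flat, edge], [0.9, 0.3])
print(round(score, 3))
```

With average pooling the two hypothetical patch scores would give 0.6; the weighted pool stays close to the score of the edge-rich patch.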

Semi-Supervised Exploration in Image Retrieval


Title	Semi-Supervised Exploration in Image Retrieval
Authors	Cheng Chang, Himanshu Rai, Satya Krishna Gorti, Junwei Ma, Chundi Liu, Guangwei Yu, Maksims Volkovs
Abstract	We present our solution to the Landmark Image Retrieval Challenge 2019. This challenge was based on the large Google Landmarks Dataset V2 [9]. The goal was to retrieve all database images containing the same landmark for every provided query image. Our solution is a combination of global and local models to form an initial KNN graph. We then use a novel extension of the recently proposed graph traversal method EGT [1], referred to as semi-supervised EGT, to refine the graph and retrieve better candidates.
Tasks	Image Retrieval
Published	2019-06-12
URL	https://arxiv.org/abs/1906.04944v1
PDF	https://arxiv.org/pdf/1906.04944v1.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-exploration-in-image
Repo
Framework
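
The initial KNN graph mentioned in the abstract above can be sketched as follows. This is only an illustration of building a k-nearest-neighbor graph from global descriptors with cosine similarity; the descriptors are made up, the local-model scores are omitted, and the EGT refinement step is not implemented here.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def knn_graph(descriptors, k):
    """Map each image index to the indices of its k most similar images."""
    graph = {}
    for i, d in enumerate(descriptors):
        sims = [(cosine(d, other), j)
                for j, other in enumerate(descriptors) if j != i]
        sims.sort(reverse=True)
        graph[i] = [j for _, j in sims[:k]]
    return graph

# Four hypothetical 2-D global descriptors forming two obvious pairs.
descriptors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
graph = knn_graph(descriptors, k=1)
print(graph)
```

A graph traversal method would then walk these edges from the query node to collect candidates that are not direct neighbors of the query.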

Deep Learning for Hybrid 5G Services in Mobile Edge Computing Systems: Learn from a Digital Twin


Title	Deep Learning for Hybrid 5G Services in Mobile Edge Computing Systems: Learn from a Digital Twin
Authors	Rui Dong, Changyang She, Wibowo Hardjawana, Yonghui Li, Branka Vucetic
Abstract	In this work, we consider a mobile edge computing system with both ultra-reliable and low-latency communications services and delay tolerant services. We aim to minimize the normalized energy consumption, defined as the energy consumption per bit, by optimizing user association, resource allocation, and offloading probabilities subject to the quality-of-service requirements. The user association is managed by the mobility management entity (MME), while resource allocation and offloading probabilities are determined by each access point (AP). We propose a deep learning (DL) architecture, where a digital twin of the real network environment is used to train the DL algorithm off-line at a central server. From the pre-trained deep neural network (DNN), the MME can obtain the user association scheme in real time. Considering that real networks are not static, the digital twin monitors the variation of real networks and updates the DNN accordingly. For a given user association scheme, we propose an optimization algorithm to find the optimal resource allocation and offloading probabilities at each AP. Simulation results show that our method achieves lower normalized energy consumption with lower computational complexity than an existing method, and approaches the performance of the globally optimal solution.
Tasks
Published	2019-06-30
URL	https://arxiv.org/abs/1907.01523v1
PDF	https://arxiv.org/pdf/1907.01523v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-for-hybrid-5g-services-in
Repo
Framework

Efficient Codebook and Factorization for Second Order Representation Learning


Title	Efficient Codebook and Factorization for Second Order Representation Learning
Authors	Pierre Jacob, David Picard, Aymeric Histace, Edouard Klein
Abstract	Learning rich and compact representations is an open topic in many fields such as object recognition or image retrieval. Deep neural networks have made a major breakthrough during the last few years for these tasks, but their representations are not necessarily as rich as needed nor as compact as expected. To build richer representations, high order statistics have been exploited and have shown excellent performance, but they produce higher dimensional features. While this drawback has been partially addressed with factorization schemes, the original compactness of first order models has never been retrieved, or only at the cost of a strong performance decrease. Our method, by jointly integrating a codebook strategy into the factorization scheme, is able to produce compact representations while keeping the second order performance with few additional parameters. This formulation leads to state-of-the-art results on three image retrieval datasets.
Tasks	Image Retrieval, Object Recognition, Representation Learning
Published	2019-06-05
URL	https://arxiv.org/abs/1906.01972v1
PDF	https://arxiv.org/pdf/1906.01972v1.pdf
PWC	https://paperswithcode.com/paper/efficient-codebook-and-factorization-for-1
Repo
Framework

Learning from Interventions using Hierarchical Policies for Safe Learning


Title	Learning from Interventions using Hierarchical Policies for Safe Learning
Authors	Jing Bi, Vikas Dhiman, Tianyou Xiao, Chenliang Xu
Abstract	Learning from Demonstrations (LfD) via Behavior Cloning (BC) works well on multiple complex tasks. However, a limitation of the typical LfD approach is that it requires expert demonstrations for all scenarios, including those in which the algorithm is already well-trained. The recently proposed Learning from Interventions (LfI) overcomes this limitation by using an expert overseer. The expert overseer only intervenes when it suspects that an unsafe action is about to be taken. Although LfI significantly improves over LfD, the state-of-the-art LfI fails to account for delay caused by the expert’s reaction time and only learns short-term behavior. We address these limitations by 1) interpolating the expert’s interventions back in time, and 2) by splitting the policy into two hierarchical levels, one that generates sub-goals for the future and another that generates actions to reach those desired sub-goals. This sub-goal prediction forces the algorithm to learn long-term behavior while also being robust to the expert’s reaction time. Our experiments show that LfI using sub-goals in a hierarchical policy framework trains faster and achieves better asymptotic performance than typical LfD.
Tasks
Published	2019-12-04
URL	https://arxiv.org/abs/1912.02241v1
PDF	https://arxiv.org/pdf/1912.02241v1.pdf
PWC	https://paperswithcode.com/paper/learning-from-interventions-using
Repo
Framework

Progressive Wasserstein Barycenters of Persistence Diagrams


Title	Progressive Wasserstein Barycenters of Persistence Diagrams
Authors	Jules Vidal, Joseph Budin, Julien Tierny
Abstract	This paper presents an efficient algorithm for the progressive approximation of Wasserstein barycenters of persistence diagrams, with applications to the visual analysis of ensemble data. Given a set of scalar fields, our approach enables the computation of a persistence diagram which is representative of the set, and which visually conveys the number, data ranges and saliences of the main features of interest found in the set. Such representative diagrams are obtained by computing explicitly the discrete Wasserstein barycenter of the set of persistence diagrams, a notoriously computationally intensive task. In particular, we revisit efficient algorithms for Wasserstein distance approximation [12,51] to extend previous work on barycenter estimation [94]. We present a new fast algorithm, which progressively approximates the barycenter by iteratively increasing the computation accuracy as well as the number of persistent features in the output diagram. Such a progressivity drastically improves convergence in practice and allows to design an interruptible algorithm, capable of respecting computation time constraints. This enables the approximation of Wasserstein barycenters within interactive times. We present an application to ensemble clustering where we revisit the k-means algorithm to exploit our barycenters and compute, within execution time constraints, meaningful clusters of ensemble data along with their barycenter diagram. Extensive experiments on synthetic and real-life data sets report that our algorithm converges to barycenters that are qualitatively meaningful with regard to the applications, and quantitatively comparable to previous techniques, while offering an order of magnitude speedup when run until convergence (without time constraint). Our algorithm can be trivially parallelized to provide additional speedups in practice on standard workstations. […]
Tasks
Published	2019-07-10
URL	https://arxiv.org/abs/1907.04565v2
PDF	https://arxiv.org/pdf/1907.04565v2.pdf
PWC	https://paperswithcode.com/paper/progressive-wasserstein-barycenters-of
Repo
Framework
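
The barycenter in the abstract above is defined with respect to the Wasserstein distance between persistence diagrams. The sketch below computes only that underlying distance (squared, with a Euclidean ground cost) for tiny diagrams, by brute force over all matchings; it is not the progressive barycenter algorithm of the paper, and the example diagrams are made up. Each diagram is augmented with the diagonal projections of the other diagram's points so that unmatched features can be sent to the diagonal.

```python
from itertools import permutations

def diagonal_projection(point):
    """Closest point on the diagonal birth == death."""
    mid = (point[0] + point[1]) / 2.0
    return (mid, mid)

def wasserstein2_squared(diag_a, diag_b):
    """Squared 2-Wasserstein distance by exhaustive search (tiny inputs only)."""
    # Entries are (point, is_diagonal); both augmented sets have equal size.
    a = [(p, False) for p in diag_a] + [(diagonal_projection(q), True) for q in diag_b]
    b = [(q, False) for q in diag_b] + [(diagonal_projection(p), True) for p in diag_a]

    def cost(x, y):
        (p, p_diag), (q, q_diag) = x, y
        if p_diag and q_diag:
            return 0.0  # matching diagonal to diagonal is free
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    return min(sum(cost(a[i], b[j]) for i, j in enumerate(perm))
               for perm in permutations(range(len(b))))

# Hypothetical diagrams: one salient feature each, plus a low-persistence
# feature in diag_b that is cheapest to send to the diagonal.
diag_a = [(0.0, 4.0)]
diag_b = [(0.0, 3.0), (1.0, 1.2)]
distance_sq = wasserstein2_squared(diag_a, diag_b)
print(distance_sq)
```

The factorial cost of the exhaustive search is why practical methods rely on efficient assignment solvers or approximations, as the abstract notes.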

End-to-End Resume Parsing and Finding Candidates for a Job Description using BERT


Title	End-to-End Resume Parsing and Finding Candidates for a Job Description using BERT
Authors	Vedant Bhatia, Prateek Rawat, Ajit Kumar, Rajiv Ratn Shah
Abstract	The ever-increasing number of applications to job positions presents a challenge for employers to find suitable candidates manually. We present an end-to-end solution for ranking candidates based on their suitability to a job description. We accomplish this in two stages. First, we build a resume parser which extracts complete information from candidate resumes. This parser is made available to the public in the form of a web application. Second, we use BERT sentence pair classification to perform ranking based on their suitability to the job description. To approximate the job description, we use the description of past job experiences by a candidate as mentioned in his resume. Our dataset comprises resumes in LinkedIn format and general non-LinkedIn formats. We parse the LinkedIn resumes with 100% accuracy and establish a strong baseline of 73% accuracy for candidate suitability.
Tasks
Published	2019-09-30
URL	https://arxiv.org/abs/1910.03089v2
PDF	https://arxiv.org/pdf/1910.03089v2.pdf
PWC	https://paperswithcode.com/paper/end-to-end-resume-parsing-and-finding
Repo
Framework

An Ultra-Efficient Memristor-Based DNN Framework with Structured Weight Pruning and Quantization Using ADMM


Title	An Ultra-Efficient Memristor-Based DNN Framework with Structured Weight Pruning and Quantization Using ADMM
Authors	Geng Yuan, Xiaolong Ma, Caiwen Ding, Sheng Lin, Tianyun Zhang, Zeinab S. Jalali, Yilong Zhao, Li Jiang, Sucheta Soundarajan, Yanzhi Wang
Abstract	The high computation and memory storage of large deep neural network (DNN) models pose intensive challenges to the conventional Von-Neumann architecture, incurring substantial data movements in the memory hierarchy. The memristor crossbar array has emerged as a promising solution to mitigate the challenges and enable low-power acceleration of DNNs. Memristor-based weight pruning and weight quantization have been separately investigated and proven effective in reducing area and power consumption compared to the original DNN model. However, there has been no systematic investigation of memristor-based neuromorphic computing (NC) systems considering both weight pruning and weight quantization. In this paper, we propose a unified and systematic memristor-based framework considering both structured weight pruning and weight quantization by incorporating the alternating direction method of multipliers (ADMM) into DNN training. We consider hardware constraints such as crossbar block pruning, conductance range, and mismatch between weight values and real devices, to achieve high accuracy, low power, and a small area footprint. Our framework consists mainly of three steps, i.e., memristor-based ADMM regularized optimization, masked mapping and retraining. Experimental results show that our proposed framework achieves 29.81X (20.88X) weight compression ratio, with 98.38% (96.96%) and 98.29% (97.47%) power and area reduction on the VGG-16 (ResNet-18) network, with only 0.5% (0.76%) accuracy loss, compared to the original DNN models. We share our models at link http://bit.ly/2Jp5LHJ.
Tasks	Quantization
Published	2019-08-29
URL	https://arxiv.org/abs/1908.11691v1
PDF	https://arxiv.org/pdf/1908.11691v1.pdf
PWC	https://paperswithcode.com/paper/an-ultra-efficient-memristor-based-dnn
Repo
Framework
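
ADMM-based compression generally alternates between ordinary training on the weights and a projection of the weights onto the constraint set. The sketch below shows two such projections for the constraints named in the abstract above, structured (column) pruning and quantization to a fixed set of levels, applied one after the other purely for illustration. The weight matrix, the number of kept columns, and the level set are assumptions, and the memristor-specific constraints and the training loop are not modeled.

```python
def prune_columns(weights, keep):
    """Zero all but the `keep` columns with the largest L2 norm."""
    n_cols = len(weights[0])
    norms = [sum(row[j] ** 2 for row in weights) for j in range(n_cols)]
    kept = set(sorted(range(n_cols), key=lambda j: norms[j], reverse=True)[:keep])
    return [[w if j in kept else 0.0 for j, w in enumerate(row)] for row in weights]

def quantize(weights, levels):
    """Replace each weight with the nearest allowed level."""
    return [[min(levels, key=lambda q: abs(q - w)) for w in row] for row in weights]

# Hypothetical 2x3 layer: the middle column is nearly zero and gets pruned.
W = [[0.9, 0.05, -0.4],
     [-0.7, 0.02, 0.6]]
levels = [-1.0, -0.5, 0.0, 0.5, 1.0]
Z = quantize(prune_columns(W, keep=2), levels)
print(Z)
```

In an ADMM loop, `Z` would serve as the auxiliary variable that the trained weights are pulled toward by a quadratic penalty, with a dual variable accumulating the remaining gap between the two.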

Session-based Complementary Fashion Recommendations


Title	Session-based Complementary Fashion Recommendations
Authors	Jui-Chieh Wu, José Antonio Sánchez Rodríguez, Humberto Jesús Corona Pampín
Abstract	In modern fashion e-commerce platforms, where customers can browse thousands to millions of products, recommender systems are useful tools to navigate and narrow down the vast assortment. In this scenario, complementary recommendations serve the user need to find items that can be worn together. In this paper, we present a personalized, session-based complementary item recommendation algorithm, ZSF-c, tailored for the fashion use case. We propose a sampling strategy adopted to build the training set, which is useful when existing user interaction data cannot be directly used due to poor quality or availability. Our proposed approach shows significant improvements in terms of accuracy compared to CF-c, the collaborative filtering approach that served complementary item recommendations to our customers at the time of the experiments. The results show an offline relative uplift of +8.2% in Orders Recall@5, as well as a significant +3.24% increase in the number of purchased products measured in an online A/B test carried out in a fashion e-commerce platform with 28 million active customers.
Tasks	Recommendation Systems
Published	2019-08-22
URL	https://arxiv.org/abs/1908.08327v1
PDF	https://arxiv.org/pdf/1908.08327v1.pdf
PWC	https://paperswithcode.com/paper/session-based-complementary-fashion
Repo
Framework
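
The offline metric quoted in the abstract above can be made concrete with a short sketch. It assumes that Recall@5 is the fraction of ordered items that appear among the top five recommendations and that relative uplift is the change divided by the baseline; the abstract does not give the exact definition of "Orders Recall@5", and the item names and scores below are invented.

```python
def recall_at_k(recommended, purchased, k=5):
    """Fraction of purchased items found in the top-k recommendations."""
    if not purchased:
        return 0.0
    top_k = set(recommended[:k])
    return sum(1 for item in purchased if item in top_k) / len(purchased)

def relative_uplift(new, baseline):
    """Relative change of a metric with respect to a baseline value."""
    return (new - baseline) / baseline

recommended = ["belt", "scarf", "boots", "hat", "bag", "socks"]
purchased = {"boots", "socks"}          # "socks" is ranked sixth, so it is missed
recall = recall_at_k(recommended, purchased)
print(recall, relative_uplift(0.541, 0.5))
```

A relative uplift of +8.2% therefore means the new recall is 1.082 times the baseline recall, not that recall rose by 8.2 percentage points.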

PAC-GAN: An Effective Pose Augmentation Scheme for Unsupervised Cross-View Person Re-identification


Title	PAC-GAN: An Effective Pose Augmentation Scheme for Unsupervised Cross-View Person Re-identification
Authors	Chengyuan Zhang, Lei Zhu, Shichao Zhang
Abstract	Person re-identification (person Re-Id) aims to retrieve the pedestrian images of the same person captured by disjoint and non-overlapping cameras. Many researchers have recently focused on this hot issue and proposed deep learning based methods to enhance the recognition rate in a supervised or unsupervised manner. However, two limitations cannot be ignored: firstly, compared with other image retrieval benchmarks, the size of existing person Re-Id datasets is far from meeting the requirement, and they cannot provide sufficient pedestrian samples for the training of deep models; secondly, the samples in existing datasets do not have sufficient coverage of human motions or postures to provide more prior knowledge for learning. In this paper, we introduce a novel unsupervised pose augmentation cross-view person Re-Id scheme called PAC-GAN to overcome these limitations. We first present the formal definition of cross-view pose augmentation and then propose the framework of PAC-GAN, a novel conditional generative adversarial network (CGAN) based approach to improve the performance of unsupervised cross-view person Re-Id. Specifically, the pose generation model in PAC-GAN, called CPG-Net, generates a sufficient quantity of pose-rich samples from original image and skeleton samples. The pose augmentation dataset is produced by combining the synthesized pose-rich samples with the original samples, and is fed into the cross-view person Re-Id model named Cross-GAN. Besides, we use a weight-sharing strategy in the CPG-Net to improve the quality of newly generated samples. To the best of our knowledge, we are the first to try to enhance unsupervised cross-view person Re-Id by pose augmentation, and the results of extensive experiments show that the proposed scheme can compete with the state of the art.
Tasks	Cross-Modal Person Re-Identification, Image Retrieval, Person Re-Identification
Published	2019-06-05
URL	https://arxiv.org/abs/1906.01792v1
PDF	https://arxiv.org/pdf/1906.01792v1.pdf
PWC	https://paperswithcode.com/paper/pac-gan-an-effective-pose-augmentation-scheme
Repo
Framework

Learning with Batch-wise Optimal Transport Loss for 3D Shape Recognition


Title	Learning with Batch-wise Optimal Transport Loss for 3D Shape Recognition
Authors	Lin Xu, Han Sun, Yuai Liu
Abstract	Deep metric learning is essential for visual recognition. The widely used pair-wise (or triplet) based loss objectives cannot make full use of semantical information in training samples or give enough attention to those hard samples during optimization. Thus, they often suffer from a slow convergence rate and inferior performance. In this paper, we show how to learn an importance-driven distance metric via optimal transport programming from batches of samples. It can automatically emphasize hard examples and lead to significant improvements in convergence. We propose a new batch-wise optimal transport loss and combine it in an end-to-end deep metric learning manner. We use it to learn the distance metric and deep feature representation jointly for recognition. Empirical results on visual retrieval and classification tasks with six benchmark datasets, i.e., MNIST, CIFAR10, SHREC13, SHREC14, ModelNet10, and ModelNet40, demonstrate the superiority of the proposed method. It can accelerate the convergence rate significantly while achieving a state-of-the-art recognition performance. For example, in 3D shape recognition experiments, we show that our method can achieve better recognition performance within only 5 epochs than what can be obtained by mainstream 3D shape recognition approaches after 200 epochs.
Tasks	3D Shape Recognition, Metric Learning
Published	2019-03-21
URL	http://arxiv.org/abs/1903.08923v1
PDF	http://arxiv.org/pdf/1903.08923v1.pdf
PWC	https://paperswithcode.com/paper/learning-with-batch-wise-optimal-transport
Repo
Framework
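
The abstract above builds its loss on optimal transport between samples in a batch. As a hedged illustration of the underlying computation, the sketch below solves a small entropy-regularized optimal transport problem with Sinkhorn iterations, a standard technique that is named here as a stand-in; the abstract does not say which solver or cost the paper uses, and the cost matrix and marginals are made up.

```python
import math

def sinkhorn(cost, a, b, reg=0.1, iters=200):
    """Return a transport plan whose row sums approach a and column sums b."""
    n, m = len(a), len(b)
    # Gibbs kernel: cheap pairs get large entries, expensive pairs small ones.
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Two samples per side; matching like with like costs 0, crossing costs 1.
cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn(cost, a=[0.5, 0.5], b=[0.5, 0.5])
transport_cost = sum(plan[i][j] * cost[i][j] for i in range(2) for j in range(2))
print(plan, transport_cost)
```

The entries of the plan act as pairwise weights over the batch, which is the kind of quantity a batch-wise loss can use in place of uniformly weighted pairs or triplets.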

Hyper-parameter Tuning under a Budget Constraint


Title	Hyper-parameter Tuning under a Budget Constraint
Authors	Zhiyun Lu, Chao-Kai Chiang, Fei Sha
Abstract	We study a budgeted hyper-parameter tuning problem, where we optimize the tuning result under a hard resource constraint. We propose to solve it as a sequential decision making problem, such that we can use the partial training progress of configurations to dynamically allocate the remaining budget. Our algorithm combines a Bayesian belief model, which estimates the future performance of configurations, with an action-value function, which balances the exploration-exploitation tradeoff, to optimize the final output. It automatically adapts the tuning behaviors to different constraints, which is useful in practice. Experimental results demonstrate superior performance over existing algorithms, including the state-of-the-art one, on real-world tuning tasks across a range of different budgets.
Tasks	Decision Making
Published	2019-02-01
URL	http://arxiv.org/abs/1902.00532v1
PDF	http://arxiv.org/pdf/1902.00532v1.pdf
PWC	https://paperswithcode.com/paper/hyper-parameter-tuning-under-a-budget
Repo
Framework

Towards Understanding Residual and Dilated Dense Neural Networks via Convolutional Sparse Coding


Title	Towards Understanding Residual and Dilated Dense Neural Networks via Convolutional Sparse Coding
Authors	Zhiyang Zhang, Shihua Zhang
Abstract	Convolutional neural network (CNN) and its variants have led to many state-of-the-art results in various fields. However, a clear theoretical understanding of them is still lacking. Recently, multi-layer convolutional sparse coding (ML-CSC) has been proposed and proved to equal such simply stacked networks (plain networks). Here, we argue that three factors in each of its layers, namely the initialization, the dictionary design and the number of iterations, greatly affect the performance of ML-CSC. Inspired by these considerations, we propose two novel multi-layer models: the residual convolutional sparse coding model (Res-CSC) and the mixed-scale dense convolutional sparse coding model (MSD-CSC), which are closely related to the residual neural network (ResNet) and the mixed-scale (dilated) dense neural network (MSDNet), respectively. Mathematically, we derive the shortcut connection in ResNet as a special case of a new forward propagation rule on ML-CSC. We find a theoretical interpretation of the dilated convolution and dense connection in MSDNet by analyzing MSD-CSC, which gives a clear mathematical understanding of them. We implement the iterative soft thresholding algorithm (ISTA) and its fast version to solve Res-CSC and MSD-CSC, which can employ the unfolding operation for further improvements. Finally, extensive numerical experiments and comparison with competing methods demonstrate their effectiveness using three typical datasets.
Tasks
Published	2019-12-05
URL	https://arxiv.org/abs/1912.02605v2
PDF	https://arxiv.org/pdf/1912.02605v2.pdf
PWC	https://paperswithcode.com/paper/towards-understanding-residual-and-dilated
Repo
Framework
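
The iterative soft thresholding algorithm (ISTA) named in the abstract above can be sketched for a single-layer sparse coding problem, minimizing 0.5 * ||Dx - y||^2 + lam * ||x||_1. This is a minimal sketch assuming a small dense dictionary rather than a convolutional one; the multi-layer Res-CSC and MSD-CSC models and the fast variant are not implemented, and the dictionary and signal are made up.

```python
def soft_threshold(x, t):
    """Shrink each entry toward zero by t, zeroing entries smaller than t."""
    return [max(abs(v) - t, 0.0) * (1 if v > 0 else -1) for v in x]

def matvec(A, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def ista(D, y, lam, step, iters=100):
    """ISTA: gradient step on the data term, then soft thresholding."""
    x = [0.0] * len(D[0])
    Dt = transpose(D)
    for _ in range(iters):
        residual = [p - q for p, q in zip(matvec(D, x), y)]
        grad = matvec(Dt, residual)
        x = soft_threshold([xi - step * g for xi, g in zip(x, grad)], step * lam)
    return x

# With an identity dictionary the solution is just soft_threshold(y, lam).
D = [[1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.2]
x = ista(D, y, lam=0.5, step=1.0)
print(x)
```

The step size must not exceed the reciprocal of the largest eigenvalue of DᵀD for the iteration to converge; for the identity dictionary used here that bound is 1. Unfolding, as mentioned in the abstract, treats a fixed number of these iterations as network layers.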