January 30, 2020


Paper Group ANR 280


A Benchmark on Tricks for Large-scale Image Retrieval


Title	A Benchmark on Tricks for Large-scale Image Retrieval
Authors	ByungSoo Ko, Minchul Shin, Geonmo Gu, HeeJae Jun, Tae Kwan Lee, Youngjoon Kim
Abstract	Many studies have been performed on metric learning, which has become a key ingredient in top-performing methods of instance-level image retrieval. Meanwhile, less attention has been paid to pre-processing and post-processing tricks that can significantly boost performance. Furthermore, we found that most previous studies used small-scale datasets to simplify processing. Because the behavior of a feature representation in a deep learning model depends on both domain and data, it is important to understand how models behave in large-scale environments when a proper combination of retrieval tricks is used. In this paper, we extensively analyze the effect of well-known pre-processing and post-processing tricks, and their combination, for large-scale image retrieval. We found that proper use of these tricks can significantly improve model performance without necessitating complex architecture or introducing loss, as confirmed by achieving a competitive result on the Google Landmark Retrieval Challenge 2019.
Tasks	Image Retrieval, Metric Learning
Published	2019-07-27
URL	https://arxiv.org/abs/1907.11854v1
PDF	https://arxiv.org/pdf/1907.11854v1.pdf
PWC	https://paperswithcode.com/paper/a-benchmark-on-tricks-for-large-scale-image
Repo
Framework
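
The abstract above does not name the individual tricks it benchmarks. As an illustration of what a retrieval post-processing step can look like, the following minimal sketch applies average query expansion on L2-normalized descriptors, a widely used post-processing technique. The function names and the `top_k` parameter are assumptions made for this example, not taken from the paper.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products equal cosine similarity."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)


def average_query_expansion(query, database, top_k=3):
    """Average the query with its top_k nearest database descriptors.

    Hypothetical helper: the expanded query is re-normalized and can be
    used for a second retrieval pass.
    """
    q = l2_normalize(query[None, :])[0]
    db = l2_normalize(database)
    sims = db @ q                           # cosine similarity to every item
    nearest = np.argsort(-sims)[:top_k]     # indices of the best matches
    expanded = np.vstack([q[None, :], db[nearest]]).mean(axis=0)
    return l2_normalize(expanded[None, :])[0]


def rank(query, database):
    """Return database indices sorted from most to least similar."""
    q = l2_normalize(query[None, :])[0]
    return np.argsort(-(l2_normalize(database) @ q))
```

A second call to `rank` with the expanded query gives the re-ranked list; whether this helps in practice depends on how clean the first-pass neighbors are.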

Multilayer Collaborative Low-Rank Coding Network for Robust Deep Subspace Discovery


Title	Multilayer Collaborative Low-Rank Coding Network for Robust Deep Subspace Discovery
Authors	Xianzhen Li, Zhao Zhang, Yang Wang, Guangcan Liu, Shuicheng Yan, Meng Wang
Abstract	For subspace recovery, most existing low-rank representation (LRR) models perform in the original space in single-layer mode. As such, the deep hierarchical information cannot be learned, which may result in inaccurate recoveries for complex real data. In this paper, we explore the deep multi-subspace recovery problem by designing a multilayer architecture for latent LRR. Technically, we propose a new Multilayer Collaborative Low-Rank Representation Network model termed DeepLRR to discover deep features and deep subspaces. In each layer (>2), DeepLRR bilinearly reconstructs the data matrix by the collaborative representation with low-rank coefficients and projection matrices in the previous layer. The bilinear low-rank reconstruction of the previous layer is directly fed into the next layer as the input and low-rank dictionary for representation learning, and is further decomposed into a deep principal feature part, a deep salient feature part and a deep sparse error. As such, the coherence issue can also be resolved due to the low-rank dictionary, and the robustness against noise can also be enhanced in the feature subspace. To recover the sparse errors in layers accurately, a dynamic growing strategy is used, as the noise level becomes smaller as the number of layers increases. Besides, a neighborhood reconstruction error is also included to encode the locality of deep salient features by deep coefficients adaptively in each layer. Extensive results on public databases show that our DeepLRR outperforms other related models for subspace discovery and clustering.
Tasks	Representation Learning
Published	2019-12-13
URL	https://arxiv.org/abs/1912.06450v3
PDF	https://arxiv.org/pdf/1912.06450v3.pdf
PWC	https://paperswithcode.com/paper/multilayer-collaborative-low-rank-coding
Repo
Framework
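
Low-rank representation models are commonly trained with a nuclear-norm penalty, whose proximal step is singular value thresholding. The abstract does not state which solver DeepLRR uses, so the sketch below shows only this generic building block under that assumption; it is not the authors' algorithm, and the function name is hypothetical.

```python
import numpy as np


def singular_value_threshold(matrix, tau):
    """Proximal operator of tau * nuclear norm.

    Shrinks every singular value by tau and clips at zero, which lowers
    the rank of the result whenever small singular values are removed.
    """
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return (u * s_shrunk) @ vt   # same as u @ diag(s_shrunk) @ vt
```

Applied to a diagonal matrix with entries 3, 1 and 0.2 and `tau=0.5`, the smallest singular value is zeroed out and the others shrink by 0.5, so the rank drops from 3 to 2.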

What Question Answering can Learn from Trivia Nerds


Title	What Question Answering can Learn from Trivia Nerds
Authors	Jordan Boyd-Graber
Abstract	In addition to the traditional task of getting machines to answer questions, a major research question in question answering is to create interesting, challenging questions that can help systems learn how to answer questions and also reveal which systems are the best at answering questions. We argue that creating a question answering dataset (and the ubiquitous leaderboard that goes with it) closely resembles running a trivia tournament: you write questions, have agents (either humans or machines) answer the questions, and declare a winner. However, the research community has ignored the hard-learned lessons from decades of the trivia community creating vibrant, fair, and effective question answering competitions. After detailing problems with existing QA datasets, we outline the key lessons (removing ambiguity, discriminating skill, and adjudicating disputes) that can transfer to QA research and how they might be implemented for the QA community.
Tasks	Question Answering
Published	2019-10-31
URL	https://arxiv.org/abs/1910.14464v2
PDF	https://arxiv.org/pdf/1910.14464v2.pdf
PWC	https://paperswithcode.com/paper/what-question-answering-can-learn-from-trivia
Repo
Framework

Support and Invertibility in Domain-Invariant Representations


Title	Support and Invertibility in Domain-Invariant Representations
Authors	Fredrik D. Johansson, David Sontag, Rajesh Ranganath
Abstract	Learning domain-invariant representations has become a popular approach to unsupervised domain adaptation and is often justified by invoking a particular suite of theoretical results. We argue that there are two significant flaws in such arguments. First, the results in question hold only for a fixed representation and do not account for information lost in non-invertible transformations. Second, domain invariance is often a far too strict requirement and does not always lead to consistent estimation, even under strong and favorable assumptions. In this work, we give generalization bounds for unsupervised domain adaptation that hold for any representation function by acknowledging the cost of non-invertibility. In addition, we show that penalizing distance between densities is often wasteful and propose a bound based on measuring the extent to which the support of the source domain covers the target domain. We perform experiments on well-known benchmarks that illustrate the shortcomings of current standard practice.
Tasks	Domain Adaptation, Unsupervised Domain Adaptation
Published	2019-03-08
URL	https://arxiv.org/abs/1903.03448v4
PDF	https://arxiv.org/pdf/1903.03448v4.pdf
PWC	https://paperswithcode.com/paper/support-and-invertibility-in-domain-invariant
Repo
Framework

Multi-task Sentence Encoding Model for Semantic Retrieval in Question Answering Systems


Title	Multi-task Sentence Encoding Model for Semantic Retrieval in Question Answering Systems
Authors	Qiang Huang, Jianhui Bu, Weijian Xie, Shengwen Yang, Weijia Wu, Liping Liu
Abstract	Question Answering (QA) systems are used to provide proper responses to users’ questions automatically. Sentence matching is an essential task in the QA systems and is usually reformulated as a Paraphrase Identification (PI) problem. Given a question, the aim of the task is to find the most similar question from a QA knowledge base. In this paper, we propose a Multi-task Sentence Encoding Model (MSEM) for the PI problem, wherein a connected graph is employed to depict the relation between sentences, and a multi-task learning model is applied to address both the sentence matching and sentence intent classification problem. In addition, we implement a general semantic retrieval framework that combines our proposed model and the Approximate Nearest Neighbor (ANN) technology, which enables us to find the most similar question from all available candidates very quickly during online serving. The experiments show the superiority of our proposed method as compared with the existing sentence matching models.
Tasks	Intent Classification, Multi-Task Learning, Paraphrase Identification, Question Answering
Published	2019-11-18
URL	https://arxiv.org/abs/1911.07405v1
PDF	https://arxiv.org/pdf/1911.07405v1.pdf
PWC	https://paperswithcode.com/paper/multi-task-sentence-encoding-model-for
Repo
Framework

Hybrid LSTM and Encoder-Decoder Architecture for Detection of Image Forgeries


Title	Hybrid LSTM and Encoder-Decoder Architecture for Detection of Image Forgeries
Authors	Jawadul H. Bappy, Cody Simons, Lakshmanan Nataraj, B. S. Manjunath, Amit K. Roy-Chowdhury
Abstract	With advanced image journaling tools, one can easily alter the semantic meaning of an image by exploiting certain manipulation techniques such as copy-clone, object splicing, and removal, which mislead the viewers. In contrast, the identification of these manipulations becomes a very challenging task as manipulated regions are not visually apparent. This paper proposes a high-confidence manipulation localization architecture which utilizes resampling features, Long Short-Term Memory (LSTM) cells, and an encoder-decoder network to segment out manipulated regions from non-manipulated ones. Resampling features are used to capture artifacts like JPEG quality loss, upsampling, downsampling, rotation, and shearing. The proposed network exploits larger receptive fields (spatial maps) and frequency domain correlation to analyze the discriminative characteristics between manipulated and non-manipulated regions by incorporating an encoder and an LSTM network. Finally, a decoder network learns the mapping from low-resolution feature maps to pixel-wise predictions for image tamper localization. With the predicted mask provided by the final layer (softmax) of the proposed architecture, end-to-end training is performed to learn the network parameters through back-propagation using ground-truth masks. Furthermore, a large image splicing dataset is introduced to guide the training process. The proposed method is capable of localizing image manipulations at pixel level with high precision, which is demonstrated through rigorous experimentation on three diverse datasets.
Tasks
Published	2019-03-06
URL	http://arxiv.org/abs/1903.02495v1
PDF	http://arxiv.org/pdf/1903.02495v1.pdf
PWC	https://paperswithcode.com/paper/hybrid-lstm-and-encoder-decoder-architecture
Repo
Framework

Semi-supervised Feature-Level Attribute Manipulation for Fashion Image Retrieval


Title	Semi-supervised Feature-Level Attribute Manipulation for Fashion Image Retrieval
Authors	Minchul Shin, Sanghyuk Park, Taeksoo Kim
Abstract	With a growing demand for search by image, many works have studied the task of fashion instance-level image retrieval (FIR). Furthermore, recent works introduce the concept of fashion attribute manipulation (FAM), which manipulates a specific attribute (e.g., color) of a fashion item while maintaining the rest of the attributes (e.g., shape and pattern). In this way, users can search not only “the same” items but also “similar” items with the desired attributes. FAM is a challenging task in that the attributes are hard to define, and the unique characteristics of a query are hard to preserve. Although both FIR and FAM are important in real-life applications, most of the previous studies have focused on only one of these problems. In this study, we aim to achieve competitive performance on both FIR and FAM. To do so, we propose a novel method that converts a query into a representation with the desired attributes. We introduce a new idea of attribute manipulation at the feature level, by matching the distribution of manipulated features with real features. In this fashion, the attribute manipulation can be done independently from learning a representation from the image. By introducing the feature-level attribute manipulation, the previous methods for FIR can perform attribute manipulation without sacrificing their retrieval performance.
Tasks	Image Retrieval
Published	2019-07-11
URL	https://arxiv.org/abs/1907.05007v1
PDF	https://arxiv.org/pdf/1907.05007v1.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-feature-level-attribute
Repo
Framework

Global-Local Metamodel Assisted Two-Stage Optimization via Simulation


Title	Global-Local Metamodel Assisted Two-Stage Optimization via Simulation
Authors	Wei Xie, Yuan Yi, Hua Zheng
Abstract	To integrate strategic, tactical and operational decisions, two-stage optimization has been widely used to guide dynamic decision making. In this paper, we study two-stage stochastic programming for complex systems with unknown response estimated by simulation. We introduce the global-local metamodel assisted two-stage optimization via simulation that can efficiently employ the simulation resource to iteratively solve for the optimal first- and second-stage decisions. Specifically, at each visited first-stage decision, we develop a local metamodel to simultaneously solve a set of scenario-based second-stage optimization problems, which also allows us to estimate the optimality gap. Then, we construct a global metamodel accounting for the errors induced by: (1) using a finite number of scenarios to approximate the expected future cost occurring in the planning horizon, (2) second-stage optimality gap, and (3) finite visited first-stage decisions. Assisted by the global-local metamodel, we propose a new simulation optimization approach that can efficiently and iteratively search for the optimal first- and second-stage decisions. Our framework can guarantee convergence to the optimal solution for the discrete two-stage optimization with unknown objective, and the empirical study indicates that it achieves substantial efficiency and accuracy.
Tasks	Decision Making
Published	2019-10-13
URL	https://arxiv.org/abs/1910.05863v1
PDF	https://arxiv.org/pdf/1910.05863v1.pdf
PWC	https://paperswithcode.com/paper/global-local-metamodel-assisted-two-stage
Repo
Framework

Generative One-Shot Face Recognition


Title	Generative One-Shot Face Recognition
Authors	Zhengming Ding, Yandong Guo, Lei Zhang, Yun Fu
Abstract	One-shot face recognition measures the ability to identify persons after seeing them only once, and is a hallmark of human visual intelligence. It is challenging for conventional machine learning approaches to mimic this ability, since limited data can hardly represent the data variance effectively. The goal of one-shot face recognition is to learn a large-scale face recognizer that is capable of overcoming the data imbalance challenge. In this paper, we propose a novel generative adversarial one-shot face recognizer, attempting to synthesize meaningful data for one-shot classes by adapting the data variances from other normal classes. Specifically, we aim to build a more effective general face classifier for both normal persons and one-shot persons. Technically, we design a new loss function by formulating a knowledge transfer generator and a general classifier into a unified framework. Such a two-player minimax optimization can guide the generation of more effective data, which effectively promotes the underrepresented classes in the learned model and leads to a remarkable improvement in face recognition performance. We evaluate our proposed model on the MS-Celeb-1M one-shot learning benchmark task, where we could recognize 94.98% of the test images at the precision of 99% for the one-shot classes, keeping an overall Top1 accuracy at 99.80% for the normal classes. To the best of our knowledge, this is the best performance among all the published methods using this benchmark task with the same setup, including all the participants in the recent MS-Celeb-1M challenge at ICCV 2017 (http://www.msceleb.org/challenge2/2017).
Tasks	Face Recognition, One-Shot Learning, Transfer Learning
Published	2019-09-28
URL	https://arxiv.org/abs/1910.04860v1
PDF	https://arxiv.org/pdf/1910.04860v1.pdf
PWC	https://paperswithcode.com/paper/generative-one-shot-face-recognition
Repo
Framework

Learning from Indirect Observations


Title	Learning from Indirect Observations
Authors	Yivan Zhang, Nontawat Charoenphakdee, Masashi Sugiyama
Abstract	Weakly-supervised learning is a paradigm for alleviating the scarcity of labeled data by leveraging lower-quality but larger-scale supervision signals. While existing work mainly focuses on utilizing a certain type of weak supervision, we present a probabilistic framework, learning from indirect observations, for learning from a wide range of weak supervision in real-world problems, e.g., noisy labels, complementary labels and coarse-grained labels. We propose a general method based on the maximum likelihood principle, which has desirable theoretical properties and can be straightforwardly implemented for deep neural networks. Concretely, a discriminative model for the true target is used for modeling the indirect observation, which is a random variable entirely depending on the true target stochastically or deterministically. Then, maximizing the likelihood given indirect observations leads to an estimator of the true target implicitly. Comprehensive experiments for two novel problem settings, learning from multiclass label proportions and learning from coarse-grained labels, illustrate the practical usefulness of our method and demonstrate how to integrate various sources of weak supervision.
Tasks
Published	2019-10-10
URL	https://arxiv.org/abs/1910.04394v1
PDF	https://arxiv.org/pdf/1910.04394v1.pdf
PWC	https://paperswithcode.com/paper/learning-from-indirect-observations
Repo
Framework
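
The maximum likelihood idea in the abstract above can be sketched as follows: a classifier outputs p(y | x) for the true target y, and the likelihood of an indirect observation z is obtained by marginalizing, p(z | x) = sum over y of p(z | y) p(y | x). This minimal sketch assumes the conditional p(z | y) is known and given as a matrix; that assumption, the function names, and the coarse-label example are illustrative and not taken from the paper.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)


def indirect_nll(logits, observed, transition):
    """Mean negative log-likelihood of indirect observations.

    logits:     (n, num_targets) scores of the model for the true target y
    observed:   (n,) integer indirect observations z
    transition: (num_targets, num_observations), transition[y, z] = p(z | y),
                assumed known in this sketch
    """
    p_y = softmax(logits)        # p(y | x)
    p_z = p_y @ transition       # p(z | x) by marginalizing over y
    picked = p_z[np.arange(len(observed)), observed]
    return -np.mean(np.log(picked))


# Coarse-grained labels: four fine classes deterministically grouped into two.
COARSE = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
```

With an identity `transition` the loss reduces to ordinary cross-entropy, which is one way to sanity-check an implementation.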

Learning walk and trot from the same objective using different types of exploration


Title	Learning walk and trot from the same objective using different types of exploration
Authors	Zinan Liu, Kai Ploeger, Svenja Stark, Elmar Rueckert, Jan Peters
Abstract	In quadruped gait learning, policy search methods that scale to high-dimensional continuous action spaces are commonly used. In most approaches, it is necessary to introduce prior knowledge on the gaits to limit the highly non-convex search space of the policies. In this work, we propose a new approach to encode the symmetry properties of the desired gaits in the initial covariance of the Gaussian search distribution, allowing for strategic exploration. Using episode-based likelihood ratio policy gradient and relative entropy policy search, we learned the gaits walk and trot on a simulated quadruped. Comparing these gaits to random gaits learned with an initial diagonal covariance matrix, we show that the performance can be significantly enhanced.
Tasks
Published	2019-04-28
URL	http://arxiv.org/abs/1904.12336v1
PDF	http://arxiv.org/pdf/1904.12336v1.pdf
PWC	https://paperswithcode.com/paper/learning-walk-and-trot-from-the-same
Repo
Framework
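
To make the idea of encoding gait symmetry in the initial covariance concrete, here is a hypothetical sketch with one policy parameter per leg, ordered left-front, right-front, left-hind, right-hind. Legs that should move together are given a positive prior correlation, so samples from the Gaussian search distribution explore coordinated motions first. The parameterization, leg ordering, and correlation value are assumptions for illustration; the abstract does not specify them.

```python
import numpy as np


def symmetric_gait_covariance(pairs, num_params=4, variance=1.0, correlation=0.9):
    """Build an initial covariance that couples the given parameter pairs.

    The pairs are assumed to be disjoint and |correlation| < 1, which keeps
    the matrix positive definite.
    """
    cov = variance * np.eye(num_params)
    for i, j in pairs:
        cov[i, j] = cov[j, i] = correlation * variance
    return cov


# In a trot, diagonal legs move together: (left-front, right-hind) and
# (right-front, left-hind) under the ordering assumed above.
TROT_PAIRS = [(0, 3), (1, 2)]

# Draw a few exploration samples around a zero mean.
rng = np.random.default_rng(0)
trot_samples = rng.multivariate_normal(
    np.zeros(4), symmetric_gait_covariance(TROT_PAIRS), size=5
)
```

A diagonal covariance, by contrast, perturbs each leg independently, which corresponds to the random-gait baseline mentioned in the abstract.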

Efficient Task-Specific Data Valuation for Nearest Neighbor Algorithms


Title	Efficient Task-Specific Data Valuation for Nearest Neighbor Algorithms
Authors	Ruoxi Jia, David Dao, Boxin Wang, Frances Ann Hubis, Nezihe Merve Gurel, Bo Li, Ce Zhang, Costas J. Spanos, Dawn Song
Abstract	Given a data set $\mathcal{D}$ containing millions of data points and a data consumer who is willing to pay \$X to train a machine learning (ML) model over $\mathcal{D}$, how should we distribute this \$X to each data point to reflect its “value”? In this paper, we define the “relative value of data” via the Shapley value, as it uniquely possesses properties with appealing real-world interpretations, such as fairness, rationality and decentralizability. For general, bounded utility functions, the Shapley value is known to be challenging to compute: to get Shapley values for all $N$ data points, it requires $O(2^N)$ model evaluations for exact computation and $O(N\log N)$ for $(\epsilon, \delta)$-approximation. In this paper, we focus on one popular family of ML models relying on $K$-nearest neighbors ($K$NN). The most surprising result is that for unweighted $K$NN classifiers and regressors, the Shapley value of all $N$ data points can be computed, exactly, in $O(N\log N)$ time, an exponential improvement on computational complexity! Moreover, for $(\epsilon, \delta)$-approximation, we are able to develop an algorithm based on Locality Sensitive Hashing (LSH) with only sublinear complexity $O(N^{h(\epsilon,K)}\log N)$ when $\epsilon$ is not too small and $K$ is not too large. We empirically evaluate our algorithms on up to $10$ million data points and even our exact algorithm is up to three orders of magnitude faster than the baseline approximation algorithm. The LSH-based approximation algorithm can accelerate the value calculation process even further. We then extend our algorithms to other scenarios such as (1) weighted $K$NN classifiers, (2) different data points are clustered by different data curators, and (3) there are data analysts providing computation who also require proper valuation.
Tasks
Published	2019-08-22
URL	https://arxiv.org/abs/1908.08619v4
PDF	https://arxiv.org/pdf/1908.08619v4.pdf
PWC	https://paperswithcode.com/paper/efficient-task-specific-data-valuation-for
Repo
Framework
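
The $O(N\log N)$ result for unweighted $K$NN classifiers can be illustrated with a short sketch for a single test point. It assumes the utility of a training subset is the fraction of its $K$ nearest neighbors (with respect to the test point) whose label matches the test label, and it uses the sort-then-recurse form of the exact computation: the farthest point gets value 1[match]/N, and each closer point is obtained from its successor in the sorted order. The function name and argument layout are assumptions for this example.

```python
import numpy as np


def knn_shapley(train_x, train_y, test_x, test_y, k):
    """Exact Shapley values of training points for one test point.

    Assumes an unweighted KNN classifier with utility
    (1/k) * number of matching labels among the k nearest neighbors.
    Sorting dominates the cost, giving O(N log N) per test point.
    """
    n = len(train_y)
    dist = np.linalg.norm(train_x - test_x, axis=1)
    order = np.argsort(dist, kind="stable")          # nearest first
    match = (train_y[order] == test_y).astype(float)

    s = np.zeros(n)
    s[n - 1] = match[n - 1] / n                      # farthest point
    for i in range(n - 2, -1, -1):
        rank = i + 1                                 # 1-based rank of point i
        s[i] = s[i + 1] + (match[i] - match[i + 1]) / k * min(k, rank) / rank

    values = np.zeros(n)
    values[order] = s                                # back to original order
    return values
```

Because Shapley values are efficient, the returned values sum to the utility of the full training set, which gives a cheap consistency check. For several test points, the per-point values would typically be averaged.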

A Causal Bayesian Networks Viewpoint on Fairness


Title	A Causal Bayesian Networks Viewpoint on Fairness
Authors	Silvia Chiappa, William S. Isaac
Abstract	We offer a graphical interpretation of unfairness in a dataset as the presence of an unfair causal path in the causal Bayesian network representing the data-generation mechanism. We use this viewpoint to revisit the recent debate surrounding the COMPAS pretrial risk assessment tool and, more generally, to point out that fairness evaluation on a model requires careful considerations on the patterns of unfairness underlying the training data. We show that causal Bayesian networks provide us with a powerful tool to measure unfairness in a dataset and to design fair models in complex unfairness scenarios.
Tasks
Published	2019-07-15
URL	https://arxiv.org/abs/1907.06430v1
PDF	https://arxiv.org/pdf/1907.06430v1.pdf
PWC	https://paperswithcode.com/paper/a-causal-bayesian-networks-viewpoint-on
Repo
Framework

Defective samples simulation through Neural Style Transfer for automatic surface defect segment


Title	Defective samples simulation through Neural Style Transfer for automatic surface defect segment
Authors	Taoran Wei, Danhua Cao, Xingru Jiang, Caiyun Zheng, Lizhe Liu
Abstract	Owing to the lack of defect samples in industrial product quality inspection, trained segmentation models tend to overfit when applied online. To address this problem, we propose a defect sample simulation algorithm based on neural style transfer. The simulation algorithm requires only a small number of defect samples for training, and can efficiently generate simulation samples for the next-step segmentation task. In our work, we introduce a masked histogram matching module to maintain color consistency of the generated area and the true defect. To preserve the texture consistency with the surrounding pixels, we take the fast style transfer algorithm to blend the generated area into the background. At the same time, we also use the histogram loss to further improve the quality of the generated image. Besides, we propose a novel structure of segment net to make it more suitable for the defect segmentation task. We train the segment net with the real defect samples and the generated simulation samples separately on the button datasets. The results show that the F1 score of the model trained with only the generated simulation samples reaches 0.80, which is better than the real sample result.
Tasks	Style Transfer
Published	2019-10-08
URL	https://arxiv.org/abs/1910.03334v1
PDF	https://arxiv.org/pdf/1910.03334v1.pdf
PWC	https://paperswithcode.com/paper/defective-samples-simulation-through-neural
Repo
Framework
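
The abstract mentions a masked histogram matching module without giving its details. A generic version of the idea, restricted to a single-channel image, can be sketched as below: only the pixels inside a mask are remapped so that their intensity distribution follows that of the masked reference pixels. This is an assumed, simplified formulation (quantile matching with linear interpolation), not the authors' module; for color images it could be applied per channel.

```python
import numpy as np


def masked_histogram_match(source, source_mask, reference, reference_mask):
    """Match the intensity distribution of masked source pixels to the
    distribution of masked reference pixels; other pixels are unchanged.

    source, reference: 2-D arrays; source_mask, reference_mask: boolean
    arrays of the same shapes.
    """
    out = source.astype(float).copy()
    src_vals = source[source_mask].astype(float)
    ref_vals = np.sort(reference[reference_mask].astype(float))

    # Quantile (empirical CDF position) of every masked source pixel.
    ranks = np.argsort(np.argsort(src_vals, kind="stable"), kind="stable")
    quantiles = (ranks + 0.5) / len(src_vals)

    # Read off the reference intensity at the same quantile.
    ref_quantiles = (np.arange(len(ref_vals)) + 0.5) / len(ref_vals)
    out[source_mask] = np.interp(quantiles, ref_quantiles, ref_vals)
    return out
```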

Linearly Constrained Smoothing Group Sparsity Solvers in Off-grid Model


Title	Linearly Constrained Smoothing Group Sparsity Solvers in Off-grid Model
Authors	Cheng-Yu Hung, Mostafa Kaveh
Abstract	In compressed sensing, the sensing matrix is assumed perfectly known. However, there exists perturbation in the sensing matrix in reality due to sensor offsets or noise disturbance. Directions-of-arrival (DoA) estimation with off-grid effect satisfies this situation, and can be formulated into a (non)convex optimization problem with linear inequality constraints, which can be solved by the interior point method (using the CVX tools), but at a large computational cost. In this work, in order to design efficient algorithms, we consider various alternative formulations, such as unconstrained formulation, primal-dual formulation, or conic formulation to develop group-sparsity promoted solvers. First, the consensus alternating direction method of multipliers (C-ADMM) is applied. Then, iterative algorithms for the BPDN formulation are proposed by combining the Nesterov smoothing technique with the accelerated proximal gradient method, and the convergence analysis of the method is conducted as well. We also developed a variant of the EGT (Excessive Gap Technique)-based primal-dual method to systematically reduce the smoothing parameter sequentially. Finally, we propose algorithms for the quadratically constrained L2-L1 mixed norm minimization problem by using the smoothed dual conic optimization (SDCO) and continuation technique. The accuracy and convergence of all the proposed methods are demonstrated in the numerical simulations.
Tasks
Published	2019-03-17
URL	https://arxiv.org/abs/1903.07164v2
PDF	https://arxiv.org/pdf/1903.07164v2.pdf
PWC	https://paperswithcode.com/paper/linearly-constrained-smoothing-group-sparsity
Repo
Framework
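
Group-sparsity solvers built on proximal gradient steps typically rely on the proximal operator of the L2-L1 mixed norm, often called group soft-thresholding. The sketch below shows that operator for a matrix whose rows are the groups. It is a generic building block offered under that assumption, not the specific solvers proposed in the paper, and the function name is hypothetical.

```python
import numpy as np


def group_soft_threshold(x, tau):
    """Proximal operator of tau * sum_i ||x[i, :]||_2.

    Each row is shrunk toward zero by tau in Euclidean norm; rows whose
    norm is at most tau are set exactly to zero, which yields group sparsity.
    """
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return scale * x
```

For example, with `tau=1` a row `[3, 4]` (norm 5) is scaled by 0.8, while a row `[0.3, 0.4]` (norm 0.5) is zeroed out entirely.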