Paper Group ANR 280
A Benchmark on Tricks for Large-scale Image Retrieval. Multilayer Collaborative Low-Rank Coding Network for Robust Deep Subspace Discovery. What Question Answering can Learn from Trivia Nerds. Support and Invertibility in Domain-Invariant Representations. Multi-task Sentence Encoding Model for Semantic Retrieval in Question Answering Systems. Hybri …
A Benchmark on Tricks for Large-scale Image Retrieval
Title | A Benchmark on Tricks for Large-scale Image Retrieval |
Authors | ByungSoo Ko, Minchul Shin, Geonmo Gu, HeeJae Jun, Tae Kwan Lee, Youngjoon Kim |
Abstract | Many studies have been performed on metric learning, which has become a key ingredient in top-performing methods of instance-level image retrieval. Meanwhile, less attention has been paid to pre-processing and post-processing tricks that can significantly boost performance. Furthermore, we found that most previous studies used small-scale datasets to simplify processing. Because the behavior of a feature representation in a deep learning model depends on both domain and data, it is important to understand how models behave in large-scale environments when a proper combination of retrieval tricks is used. In this paper, we extensively analyze the effect of well-known pre-processing and post-processing tricks, and their combination, for large-scale image retrieval. We found that proper use of these tricks can significantly improve model performance without necessitating complex architecture or introducing loss, as confirmed by achieving a competitive result on the Google Landmark Retrieval Challenge 2019. |
Tasks | Image Retrieval, Metric Learning |
Published | 2019-07-27 |
URL | https://arxiv.org/abs/1907.11854v1 |
https://arxiv.org/pdf/1907.11854v1.pdf | |
PWC | https://paperswithcode.com/paper/a-benchmark-on-tricks-for-large-scale-image |
Repo | |
Framework | |
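The abstract above does not name the individual tricks it benchmarks. As one illustration of a post-processing trick commonly used in image retrieval, the sketch below shows average query expansion: the query descriptor is averaged with its nearest database descriptors and re-normalized before ranking again. The function names, the toy descriptors, and the choice of `top_k` are all assumptions made for this example and are not taken from the paper.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def cosine_rank(query, database):
    """Return database indices sorted by descending cosine similarity to the query."""
    q = l2_normalize(query)
    scores = []
    for idx, item in enumerate(database):
        d = l2_normalize(item)
        scores.append((sum(a * b for a, b in zip(q, d)), idx))
    # Sort by score, highest first; ties are broken by the lower index.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scores]

def average_query_expansion(query, database, top_k=2):
    """Average the query with its top_k nearest neighbours, then re-normalize."""
    neighbours = cosine_rank(query, database)[:top_k]
    vectors = [l2_normalize(query)] + [l2_normalize(database[i]) for i in neighbours]
    dim = len(query)
    mean = [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
    return l2_normalize(mean)

# Toy 2-D descriptors (hypothetical values, not from the paper).
database = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.6, 0.8]]
query = [1.0, 0.2]
expanded = average_query_expansion(query, database, top_k=2)
ranking = cosine_rank(expanded, database)
```

In a real system the descriptors would come from a trained embedding model and the search would use an index rather than a full scan; the sketch only shows where the expansion step sits in the pipeline.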
Multilayer Collaborative Low-Rank Coding Network for Robust Deep Subspace Discovery
Title | Multilayer Collaborative Low-Rank Coding Network for Robust Deep Subspace Discovery |
Authors | Xianzhen Li, Zhao Zhang, Yang Wang, Guangcan Liu, Shuicheng Yan, Meng Wang |
Abstract | For subspace recovery, most existing low-rank representation (LRR) models perform in the original space in single-layer mode. As such, the deep hierarchical information cannot be learned, which may result in inaccurate recoveries for complex real data. In this paper, we explore the deep multi-subspace recovery problem by designing a multilayer architecture for latent LRR. Technically, we propose a new Multilayer Collaborative Low-Rank Representation Network model termed DeepLRR to discover deep features and deep subspaces. In each layer (>2), DeepLRR bilinearly reconstructs the data matrix by the collaborative representation with low-rank coefficients and projection matrices in the previous layer. The bilinear low-rank reconstruction of the previous layer is directly fed into the next layer as the input and low-rank dictionary for representation learning, and is further decomposed into a deep principal feature part, a deep salient feature part and a deep sparse error. As such, the coherence issue can also be resolved due to the low-rank dictionary, and the robustness against noise can also be enhanced in the feature subspace. To recover the sparse errors in layers accurately, a dynamic growing strategy is used, as the noise level will become smaller as the number of layers increases. Besides, a neighborhood reconstruction error is also included to encode the locality of deep salient features by deep coefficients adaptively in each layer. Extensive results on public databases show that our DeepLRR outperforms other related models for subspace discovery and clustering. |
Tasks | Representation Learning |
Published | 2019-12-13 |
URL | https://arxiv.org/abs/1912.06450v3 |
https://arxiv.org/pdf/1912.06450v3.pdf | |
PWC | https://paperswithcode.com/paper/multilayer-collaborative-low-rank-coding |
Repo | |
Framework | |
What Question Answering can Learn from Trivia Nerds
Title | What Question Answering can Learn from Trivia Nerds |
Authors | Jordan Boyd-Graber |
Abstract | In addition to the traditional task of getting machines to answer questions, a major research question in question answering is to create interesting, challenging questions that can help systems learn how to answer questions and also reveal which systems are the best at answering questions. We argue that creating a question answering dataset, and the ubiquitous leaderboard that goes with it, closely resembles running a trivia tournament: you write questions, have agents (either humans or machines) answer the questions, and declare a winner. However, the research community has ignored the hard-learned lessons from decades of the trivia community creating vibrant, fair, and effective question answering competitions. After detailing problems with existing QA datasets, we outline the key lessons (removing ambiguity, discriminating skill, and adjudicating disputes) that can transfer to QA research and how they might be implemented for the QA community. |
Tasks | Question Answering |
Published | 2019-10-31 |
URL | https://arxiv.org/abs/1910.14464v2 |
https://arxiv.org/pdf/1910.14464v2.pdf | |
PWC | https://paperswithcode.com/paper/what-question-answering-can-learn-from-trivia |
Repo | |
Framework | |
Support and Invertibility in Domain-Invariant Representations
Title | Support and Invertibility in Domain-Invariant Representations |
Authors | Fredrik D. Johansson, David Sontag, Rajesh Ranganath |
Abstract | Learning domain-invariant representations has become a popular approach to unsupervised domain adaptation and is often justified by invoking a particular suite of theoretical results. We argue that there are two significant flaws in such arguments. First, the results in question hold only for a fixed representation and do not account for information lost in non-invertible transformations. Second, domain invariance is often a far too strict requirement and does not always lead to consistent estimation, even under strong and favorable assumptions. In this work, we give generalization bounds for unsupervised domain adaptation that hold for any representation function by acknowledging the cost of non-invertibility. In addition, we show that penalizing distance between densities is often wasteful and propose a bound based on measuring the extent to which the support of the source domain covers the target domain. We perform experiments on well-known benchmarks that illustrate the shortcomings of current standard practice. |
Tasks | Domain Adaptation, Unsupervised Domain Adaptation |
Published | 2019-03-08 |
URL | https://arxiv.org/abs/1903.03448v4 |
https://arxiv.org/pdf/1903.03448v4.pdf | |
PWC | https://paperswithcode.com/paper/support-and-invertibility-in-domain-invariant |
Repo | |
Framework | |
Multi-task Sentence Encoding Model for Semantic Retrieval in Question Answering Systems
Title | Multi-task Sentence Encoding Model for Semantic Retrieval in Question Answering Systems |
Authors | Qiang Huang, Jianhui Bu, Weijian Xie, Shengwen Yang, Weijia Wu, Liping Liu |
Abstract | Question Answering (QA) systems are used to provide proper responses to users’ questions automatically. Sentence matching is an essential task in QA systems and is usually reformulated as a Paraphrase Identification (PI) problem. Given a question, the aim of the task is to find the most similar question from a QA knowledge base. In this paper, we propose a Multi-task Sentence Encoding Model (MSEM) for the PI problem, wherein a connected graph is employed to depict the relation between sentences, and a multi-task learning model is applied to address both the sentence matching and sentence intent classification problems. In addition, we implement a general semantic retrieval framework that combines our proposed model and the Approximate Nearest Neighbor (ANN) technology, which enables us to find the most similar question from all available candidates very quickly during online serving. The experiments show the superiority of our proposed method as compared with the existing sentence matching models. |
Tasks | Intent Classification, Multi-Task Learning, Paraphrase Identification, Question Answering |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.07405v1 |
https://arxiv.org/pdf/1911.07405v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-task-sentence-encoding-model-for |
Repo | |
Framework | |
Hybrid LSTM and Encoder-Decoder Architecture for Detection of Image Forgeries
Title | Hybrid LSTM and Encoder-Decoder Architecture for Detection of Image Forgeries |
Authors | Jawadul H. Bappy, Cody Simons, Lakshmanan Nataraj, B. S. Manjunath, Amit K. Roy-Chowdhury |
Abstract | With advanced image journaling tools, one can easily alter the semantic meaning of an image by exploiting certain manipulation techniques such as copy-clone, object splicing, and removal, which mislead the viewers. In contrast, the identification of these manipulations becomes a very challenging task as manipulated regions are not visually apparent. This paper proposes a high-confidence manipulation localization architecture which utilizes resampling features, Long Short-Term Memory (LSTM) cells, and an encoder-decoder network to segment out manipulated regions from non-manipulated ones. Resampling features are used to capture artifacts like JPEG quality loss, upsampling, downsampling, rotation, and shearing. The proposed network exploits larger receptive fields (spatial maps) and frequency domain correlation to analyze the discriminative characteristics between manipulated and non-manipulated regions by incorporating an encoder and an LSTM network. Finally, the decoder network learns the mapping from low-resolution feature maps to pixel-wise predictions for image tamper localization. With the predicted mask provided by the final layer (softmax) of the proposed architecture, end-to-end training is performed to learn the network parameters through back-propagation using ground-truth masks. Furthermore, a large image splicing dataset is introduced to guide the training process. The proposed method is capable of localizing image manipulations at pixel level with high precision, which is demonstrated through rigorous experimentation on three diverse datasets. |
Tasks | |
Published | 2019-03-06 |
URL | http://arxiv.org/abs/1903.02495v1 |
http://arxiv.org/pdf/1903.02495v1.pdf | |
PWC | https://paperswithcode.com/paper/hybrid-lstm-and-encoder-decoder-architecture |
Repo | |
Framework | |
Semi-supervised Feature-Level Attribute Manipulation for Fashion Image Retrieval
Title | Semi-supervised Feature-Level Attribute Manipulation for Fashion Image Retrieval |
Authors | Minchul Shin, Sanghyuk Park, Taeksoo Kim |
Abstract | With a growing demand for search by image, many works have studied the task of fashion instance-level image retrieval (FIR). Furthermore, recent works introduce the concept of fashion attribute manipulation (FAM), which manipulates a specific attribute (e.g., color) of a fashion item while maintaining the rest of the attributes (e.g., shape and pattern). In this way, users can search not only “the same” items but also “similar” items with the desired attributes. FAM is a challenging task in that the attributes are hard to define, and the unique characteristics of a query are hard to preserve. Although both FIR and FAM are important in real-life applications, most of the previous studies have focused on only one of these problems. In this study, we aim to achieve competitive performance on both FIR and FAM. To do so, we propose a novel method that converts a query into a representation with the desired attributes. We introduce a new idea of attribute manipulation at the feature level, by matching the distribution of manipulated features with real features. In this fashion, the attribute manipulation can be done independently from learning a representation from the image. By introducing the feature-level attribute manipulation, the previous methods for FIR can perform attribute manipulation without sacrificing their retrieval performance. |
Tasks | Image Retrieval |
Published | 2019-07-11 |
URL | https://arxiv.org/abs/1907.05007v1 |
https://arxiv.org/pdf/1907.05007v1.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-feature-level-attribute |
Repo | |
Framework | |
Global-Local Metamodel Assisted Two-Stage Optimization via Simulation
Title | Global-Local Metamodel Assisted Two-Stage Optimization via Simulation |
Authors | Wei Xie, Yuan Yi, Hua Zheng |
Abstract | To integrate strategic, tactical and operational decisions, two-stage optimization has been widely used to guide dynamic decision making. In this paper, we study two-stage stochastic programming for complex systems with unknown response estimated by simulation. We introduce the global-local metamodel assisted two-stage optimization via simulation that can efficiently employ the simulation resource to iteratively solve for the optimal first- and second-stage decisions. Specifically, at each visited first-stage decision, we develop a local metamodel to simultaneously solve a set of scenario-based second-stage optimization problems, which also allows us to estimate the optimality gap. Then, we construct a global metamodel accounting for the errors induced by: (1) using a finite number of scenarios to approximate the expected future cost occurring in the planning horizon, (2) the second-stage optimality gap, and (3) the finite number of visited first-stage decisions. Assisted by the global-local metamodel, we propose a new simulation optimization approach that can efficiently and iteratively search for the optimal first- and second-stage decisions. Our framework can guarantee convergence to the optimal solution for the discrete two-stage optimization with unknown objective, and the empirical study indicates that it achieves substantial efficiency and accuracy. |
Tasks | Decision Making |
Published | 2019-10-13 |
URL | https://arxiv.org/abs/1910.05863v1 |
https://arxiv.org/pdf/1910.05863v1.pdf | |
PWC | https://paperswithcode.com/paper/global-local-metamodel-assisted-two-stage |
Repo | |
Framework | |
Generative One-Shot Face Recognition
Title | Generative One-Shot Face Recognition |
Authors | Zhengming Ding, Yandong Guo, Lei Zhang, Yun Fu |
Abstract | One-shot face recognition measures the ability to identify persons after seeing them only once, and is a hallmark of human visual intelligence. It is challenging for conventional machine learning approaches to mimic this ability, since limited data can hardly represent the data variance effectively. The goal of one-shot face recognition is to learn a large-scale face recognizer that is capable of fighting off the data imbalance challenge. In this paper, we propose a novel generative adversarial one-shot face recognizer, attempting to synthesize meaningful data for one-shot classes by adapting the data variances from other normal classes. Specifically, we aim to build a more effective general face classifier for both normal persons and one-shot persons. Technically, we design a new loss function by formulating a knowledge transfer generator and a general classifier into a unified framework. Such a two-player minimax optimization can guide the generation of more effective data, which effectively promotes the underrepresented classes in the learned model and leads to a remarkable improvement in face recognition performance. We evaluate our proposed model on the MS-Celeb-1M one-shot learning benchmark task, where we could recognize 94.98% of the test images at the precision of 99% for the one-shot classes, keeping an overall Top1 accuracy at 99.80% for the normal classes. To the best of our knowledge, this is the best performance among all the published methods using this benchmark task with the same setup, including all the participants in the recent MS-Celeb-1M challenge at ICCV 2017 (http://www.msceleb.org/challenge2/2017). |
Tasks | Face Recognition, One-Shot Learning, Transfer Learning |
Published | 2019-09-28 |
URL | https://arxiv.org/abs/1910.04860v1 |
https://arxiv.org/pdf/1910.04860v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-one-shot-face-recognition |
Repo | |
Framework | |
Learning from Indirect Observations
Title | Learning from Indirect Observations |
Authors | Yivan Zhang, Nontawat Charoenphakdee, Masashi Sugiyama |
Abstract | Weakly-supervised learning is a paradigm for alleviating the scarcity of labeled data by leveraging lower-quality but larger-scale supervision signals. While existing work mainly focuses on utilizing a certain type of weak supervision, we present a probabilistic framework, learning from indirect observations, for learning from a wide range of weak supervision in real-world problems, e.g., noisy labels, complementary labels and coarse-grained labels. We propose a general method based on the maximum likelihood principle, which has desirable theoretical properties and can be straightforwardly implemented for deep neural networks. Concretely, a discriminative model for the true target is used for modeling the indirect observation, which is a random variable entirely depending on the true target stochastically or deterministically. Then, maximizing the likelihood given indirect observations leads to an estimator of the true target implicitly. Comprehensive experiments for two novel problem settings, learning from multiclass label proportions and learning from coarse-grained labels, illustrate the practical usefulness of our method and demonstrate how to integrate various sources of weak supervision. |
Tasks | |
Published | 2019-10-10 |
URL | https://arxiv.org/abs/1910.04394v1 |
https://arxiv.org/pdf/1910.04394v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-from-indirect-observations |
Repo | |
Framework | |
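One way to read the maximum likelihood idea in the abstract above: if the indirect observation z depends only on the true target y, its likelihood under a model p(y | x) is the marginal p(z | x) = Σ_y p(z | y) p(y | x). The sketch below evaluates that marginal for coarse-grained labels, where each class maps deterministically to a coarse group. The function names, the class-to-group mapping, and the probabilities are hypothetical illustrations, not the paper's implementation.

```python
import math

def indirect_likelihood(class_probs, obs_given_class, observation):
    """p(z | x) = sum over y of p(z | y) * p(y | x), for one observed z."""
    return sum(
        obs_given_class[y].get(observation, 0.0) * p_y
        for y, p_y in enumerate(class_probs)
    )

def negative_log_likelihood(batch_probs, obs_given_class, observations):
    """Average negative log-likelihood of the indirect observations."""
    total = 0.0
    for class_probs, z in zip(batch_probs, observations):
        total -= math.log(indirect_likelihood(class_probs, obs_given_class, z))
    return total / len(observations)

# Coarse-grained labels (assumed mapping): classes 0 and 1 map to "A", class 2 to "B".
coarse = [{"A": 1.0}, {"A": 1.0}, {"B": 1.0}]
# Hypothetical model outputs p(y | x) for two inputs.
batch_probs = [[0.2, 0.3, 0.5], [0.1, 0.1, 0.8]]
loss = negative_log_likelihood(batch_probs, coarse, ["A", "B"])
```

Noisy labels fit the same shape by replacing the deterministic mapping with a row-stochastic table p(z | y); minimizing this loss over the model parameters is then the maximum likelihood step.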
Learning walk and trot from the same objective using different types of exploration
Title | Learning walk and trot from the same objective using different types of exploration |
Authors | Zinan Liu, Kai Ploeger, Svenja Stark, Elmar Rueckert, Jan Peters |
Abstract | In quadruped gait learning, policy search methods that scale to high-dimensional continuous action spaces are commonly used. In most approaches, it is necessary to introduce prior knowledge on the gaits to limit the highly non-convex search space of the policies. In this work, we propose a new approach to encode the symmetry properties of the desired gaits in the initial covariance of the Gaussian search distribution, allowing for strategic exploration. Using episode-based likelihood ratio policy gradient and relative entropy policy search, we learned the gaits walk and trot on a simulated quadruped. Comparing these gaits to random gaits learned with an initialized diagonal covariance matrix, we show that the performance can be significantly enhanced. |
Tasks | |
Published | 2019-04-28 |
URL | http://arxiv.org/abs/1904.12336v1 |
http://arxiv.org/pdf/1904.12336v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-walk-and-trot-from-the-same |
Repo | |
Framework | |
Efficient Task-Specific Data Valuation for Nearest Neighbor Algorithms
Title | Efficient Task-Specific Data Valuation for Nearest Neighbor Algorithms |
Authors | Ruoxi Jia, David Dao, Boxin Wang, Frances Ann Hubis, Nezihe Merve Gurel, Bo Li, Ce Zhang, Costas J. Spanos, Dawn Song |
Abstract | Given a data set $\mathcal{D}$ containing millions of data points and a data consumer who is willing to pay \$$X$ to train a machine learning (ML) model over $\mathcal{D}$, how should we distribute this \$$X$ to each data point to reflect its “value”? In this paper, we define the “relative value of data” via the Shapley value, as it uniquely possesses properties with appealing real-world interpretations, such as fairness, rationality and decentralizability. For general, bounded utility functions, the Shapley value is known to be challenging to compute: to get Shapley values for all $N$ data points, it requires $O(2^N)$ model evaluations for exact computation and $O(N\log N)$ for $(\epsilon, \delta)$-approximation. In this paper, we focus on one popular family of ML models relying on $K$-nearest neighbors ($K$NN). The most surprising result is that for unweighted $K$NN classifiers and regressors, the Shapley value of all $N$ data points can be computed, exactly, in $O(N\log N)$ time – an exponential improvement on computational complexity! Moreover, for $(\epsilon, \delta)$-approximation, we are able to develop an algorithm based on Locality Sensitive Hashing (LSH) with only sublinear complexity $O(N^{h(\epsilon,K)}\log N)$ when $\epsilon$ is not too small and $K$ is not too large. We empirically evaluate our algorithms on up to $10$ million data points and even our exact algorithm is up to three orders of magnitude faster than the baseline approximation algorithm. The LSH-based approximation algorithm can accelerate the value calculation process even further. We then extend our algorithms to other scenarios such as (1) weighted $K$NN classifiers, (2) different data points are clustered by different data curators, and (3) there are data analysts providing computation who also require proper valuation. |
Tasks | |
Published | 2019-08-22 |
URL | https://arxiv.org/abs/1908.08619v4 |
https://arxiv.org/pdf/1908.08619v4.pdf | |
PWC | https://paperswithcode.com/paper/efficient-task-specific-data-valuation-for |
Repo | |
Framework | |
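To make the exponential baseline mentioned in the abstract above concrete, the sketch below computes exact Shapley values by enumerating every subset of a tiny training set. This is the brute-force definition, not the paper's O(N log N) algorithm. The utility (the fraction of the K nearest selected points whose label matches a single test label) is an assumed stand-in for an unweighted KNN classifier's utility, and the 1-D data points are hypothetical.

```python
from itertools import combinations
from math import factorial

def knn_utility(subset, train, test_point, k):
    """Assumed utility: matching labels among the k nearest subset points, divided by k."""
    if not subset:
        return 0.0
    x_test, y_test = test_point
    nearest = sorted(subset, key=lambda i: (abs(train[i][0] - x_test), i))[:k]
    return sum(1 for i in nearest if train[i][1] == y_test) / k

def exact_shapley(train, test_point, k):
    """Shapley value of each training point by enumerating all subsets (exponential cost)."""
    n = len(train)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                gain = (knn_utility(subset + (i,), train, test_point, k)
                        - knn_utility(subset, train, test_point, k))
                values[i] += weight * gain
    return values

# Hypothetical 1-D training set of (feature, label) pairs and one test point.
train = [(0.1, 1), (0.4, 0), (0.5, 1), (0.9, 0)]
test_point = (0.0, 1)
values = exact_shapley(train, test_point, k=2)
full_utility = knn_utility(tuple(range(len(train))), train, test_point, 2)
```

The values sum to the utility of the full training set (the efficiency property), and a nearby point with the wrong label receives a negative value because it can displace a correctly labeled neighbour.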
A Causal Bayesian Networks Viewpoint on Fairness
Title | A Causal Bayesian Networks Viewpoint on Fairness |
Authors | Silvia Chiappa, William S. Isaac |
Abstract | We offer a graphical interpretation of unfairness in a dataset as the presence of an unfair causal path in the causal Bayesian network representing the data-generation mechanism. We use this viewpoint to revisit the recent debate surrounding the COMPAS pretrial risk assessment tool and, more generally, to point out that fairness evaluation on a model requires careful considerations on the patterns of unfairness underlying the training data. We show that causal Bayesian networks provide us with a powerful tool to measure unfairness in a dataset and to design fair models in complex unfairness scenarios. |
Tasks | |
Published | 2019-07-15 |
URL | https://arxiv.org/abs/1907.06430v1 |
https://arxiv.org/pdf/1907.06430v1.pdf | |
PWC | https://paperswithcode.com/paper/a-causal-bayesian-networks-viewpoint-on |
Repo | |
Framework | |
Defective samples simulation through Neural Style Transfer for automatic surface defect segment
Title | Defective samples simulation through Neural Style Transfer for automatic surface defect segment |
Authors | Taoran Wei, Danhua Cao, Xingru Jiang, Caiyun Zheng, Lizhe Liu |
Abstract | Owing to the lack of defect samples in industrial product quality inspection, a trained segmentation model tends to overfit when applied online. To address this problem, we propose a defect sample simulation algorithm based on neural style transfer. The simulation algorithm requires only a small number of defect samples for training, and can efficiently generate simulation samples for the next-step segmentation task. In our work, we introduce a masked histogram matching module to maintain color consistency between the generated area and the true defect. To preserve the texture consistency with the surrounding pixels, we use the fast style transfer algorithm to blend the generated area into the background. At the same time, we also use the histogram loss to further improve the quality of the generated image. Besides, we propose a novel structure of segment net to make it more suitable for the defect segmentation task. We train the segment net with the real defect samples and the generated simulation samples separately on the button datasets. The results show that the F1 score of the model trained with only the generated simulation samples reaches 0.80, which is better than the real sample result. |
Tasks | Style Transfer |
Published | 2019-10-08 |
URL | https://arxiv.org/abs/1910.03334v1 |
https://arxiv.org/pdf/1910.03334v1.pdf | |
PWC | https://paperswithcode.com/paper/defective-samples-simulation-through-neural |
Repo | |
Framework | |
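The abstract above does not describe how its masked histogram matching module is built. As a generic illustration of the idea, the sketch below remaps only the masked pixels of a flat grayscale image so that their value distribution follows a set of reference defect pixels, using rank-based quantile matching. The function name, the flat-list image layout, and the pixel values are assumptions made for this example.

```python
def masked_histogram_match(source, mask, reference):
    """Remap masked source pixels so their value distribution follows the reference."""
    masked_idx = [i for i, m in enumerate(mask) if m]
    if not masked_idx or not reference:
        return list(source)
    ref_sorted = sorted(reference)
    # Order masked pixels by intensity; ties are broken by position for determinism.
    order = sorted(masked_idx, key=lambda i: (source[i], i))
    out = list(source)
    n = len(order)
    for rank, i in enumerate(order):
        # Quantile of this pixel among the masked pixels, in [0, 1].
        q = rank / (n - 1) if n > 1 else 0.5
        # Take the nearest reference sample at the same quantile.
        out[i] = ref_sorted[round(q * (len(ref_sorted) - 1))]
    return out

# Hypothetical flat grayscale image, binary mask, and reference defect pixels.
source = [10, 200, 50, 120, 90, 30]
mask = [0, 1, 1, 1, 0, 1]
reference = [100, 140, 160, 180, 220]
result = masked_histogram_match(source, mask, reference)
```

Pixels outside the mask are left untouched, and the brightness ordering inside the mask is preserved. For color images the same mapping could be applied per channel.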
Linearly Constrained Smoothing Group Sparsity Solvers in Off-grid Model
Title | Linearly Constrained Smoothing Group Sparsity Solvers in Off-grid Model |
Authors | Cheng-Yu Hung, Mostafa Kaveh |
Abstract | In compressed sensing, the sensing matrix is assumed perfectly known. However, there exists perturbation in the sensing matrix in reality due to sensor offsets or noise disturbance. Directions-of-arrival (DoA) estimation with off-grid effect satisfies this situation, and can be formulated into a (non)convex optimization problem with linear inequality constraints, which can be solved by the interior point method (using the CVX tools), but at a large computational cost. In this work, in order to design efficient algorithms, we consider various alternative formulations, such as unconstrained formulation, primal-dual formulation, or conic formulation to develop group-sparsity promoted solvers. First, the consensus alternating direction method of multipliers (C-ADMM) is applied. Then, iterative algorithms for the BPDN formulation are proposed by combining the Nesterov smoothing technique with the accelerated proximal gradient method, and the convergence analysis of the method is conducted as well. We also develop a variant of the EGT (Excessive Gap Technique)-based primal-dual method to systematically reduce the smoothing parameter sequentially. Finally, we propose algorithms for the quadratically constrained L2-L1 mixed norm minimization problem by using the smoothed dual conic optimization (SDCO) and continuation technique. The accuracy and convergence of all the proposed methods are demonstrated in the numerical simulations. |
Tasks | |
Published | 2019-03-17 |
URL | https://arxiv.org/abs/1903.07164v2 |
https://arxiv.org/pdf/1903.07164v2.pdf | |
PWC | https://paperswithcode.com/paper/linearly-constrained-smoothing-group-sparsity |
Repo | |
Framework | |
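Proximal gradient methods for group-sparse problems such as the L2-L1 mixed norm mentioned in the abstract above typically rely on one building block: the proximal operator of the penalty, often called group soft-thresholding. The sketch below shows that standard operator only. It is not the paper's solver, and the function name, group layout, and numbers are assumptions made for illustration.

```python
import math

def group_soft_threshold(x, groups, tau):
    """Proximal operator of tau * sum over groups of ||x_g||_2 (the L2-L1 mixed norm)."""
    out = list(x)
    for group in groups:
        norm = math.sqrt(sum(x[i] ** 2 for i in group))
        # Shrink the whole group toward zero; groups with norm <= tau vanish entirely.
        scale = max(0.0, 1.0 - tau / norm) if norm > 0 else 0.0
        for i in group:
            out[i] = scale * x[i]
    return out

# Hypothetical coefficient vector split into two groups of two entries each.
x = [3.0, 4.0, 0.3, -0.4]
groups = [[0, 1], [2, 3]]
shrunk = group_soft_threshold(x, groups, tau=1.0)
```

The first group has norm 5 and is scaled by 0.8, while the second group has norm 0.5, which is below the threshold, so it is set to zero as a block. That block-wise zeroing is what promotes group sparsity.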