Paper Group NAWR 10
A Condition Number for Joint Optimization of Cycle-Consistent Networks
Title | A Condition Number for Joint Optimization of Cycle-Consistent Networks |
Authors | Leonidas J. Guibas, Qixing Huang, Zhenxiao Liang |
Abstract | A recent trend in optimizing maps, such as dense correspondences between objects or neural networks between pairs of domains, is to optimize them jointly. In this context, there is a natural cycle-consistency constraint, which regularizes composite maps associated with cycles, i.e., they are forced to be identity maps. However, as there is an exponential number of cycles in a graph, how to sample a subset of cycles becomes critical for efficient and effective enforcement of the cycle-consistency constraint. This paper presents an algorithm that selects a subset of weighted cycles to minimize a condition number of the induced joint optimization problem. Experimental results on benchmark datasets demonstrate the effectiveness of our approach for optimizing dense correspondences between 3D shapes and neural networks for predicting dense image flows. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/8386-a-condition-number-for-joint-optimization-of-cycle-consistent-networks |
PDF | http://papers.nips.cc/paper/8386-a-condition-number-for-joint-optimization-of-cycle-consistent-networks.pdf |
PWC | https://paperswithcode.com/paper/a-condition-number-for-joint-optimization-of |
Repo | https://github.com/huangqx/NeurIPS19_Cycle |
Framework | none |
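The constraint being enforced is easy to state concretely. Below is a minimal numpy sketch of the cycle-consistency residual on a single 3-cycle, with toy linear maps standing in for the networks; everything here (maps, sizes, noise level) is an illustrative assumption, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16
# Toy linear maps on a 3-cycle of domains A -> B -> C -> A.
F_ab = np.linalg.qr(rng.standard_normal((d, d)))[0]
F_bc = np.linalg.qr(rng.standard_normal((d, d)))[0]
F_ca = (F_bc @ F_ab).T + 0.01 * rng.standard_normal((d, d))  # near-inverse

def cycle_residual(maps):
    """Frobenius gap between the composite map along a cycle and identity."""
    comp = np.eye(d)
    for F in maps:          # applies F_ab first, then F_bc, then F_ca
        comp = F @ comp
    return np.linalg.norm(comp - np.eye(d))

print(cycle_residual([F_ab, F_bc, F_ca]))  # small but nonzero
```

The joint objective sums such residuals over a weighted set of sampled cycles; the paper's contribution is choosing the cycles and weights so that a condition number of the induced optimization problem is minimized.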
On Fenchel Mini-Max Learning
Title | On Fenchel Mini-Max Learning |
Authors | Chenyang Tao, Liqun Chen, Shuyang Dai, Junya Chen, Ke Bai, Dong Wang, Jianfeng Feng, Wenlian Lu, Georgiy Bobashev, Lawrence Carin |
Abstract | Inference, estimation, sampling and likelihood evaluation are four primary goals of probabilistic modeling. Practical considerations often force modeling approaches to make compromises between these objectives. We present a novel probabilistic learning framework, called Fenchel Mini-Max Learning (FML), that accommodates all four desiderata in a flexible and scalable manner. Our derivation is rooted in classical maximum likelihood estimation, and it overcomes a longstanding challenge that prevents unbiased estimation of unnormalized statistical models. By reformulating MLE as a mini-max game, FML enjoys an unbiased training objective that (i) does not explicitly involve the intractable normalizing constant and (ii) is directly amenable to stochastic gradient descent optimization. To demonstrate the utility of the proposed approach, we consider learning unnormalized statistical models, nonparametric density estimation and training generative models, with encouraging empirical results presented. |
Tasks | Density Estimation |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9230-on-fenchel-mini-max-learning |
PDF | http://papers.nips.cc/paper/9230-on-fenchel-mini-max-learning.pdf |
PWC | https://paperswithcode.com/paper/on-fenchel-mini-max-learning |
Repo | https://github.com/chenyang-tao/FML |
Framework | none |
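The key trick is a variational rewrite of the log-normalizer. Here is a self-contained numerical check of the scalar identity that, in spirit, underlies FML, log u = min over v > 0 of (u·v − log v − 1), which lets an intractable constant enter the objective linearly (a toy illustration, not the paper's training code):

```python
import numpy as np

def log_via_fenchel(u, steps=2000, lr=0.05):
    # Minimize h(t) = u*exp(t) - t - 1 over t, where v = exp(t) keeps v > 0.
    # At the optimum v* = 1/u, and h(t*) = log(u).
    t = 0.0
    for _ in range(steps):
        t -= lr * (u * np.exp(t) - 1.0)   # h'(t)
    return u * np.exp(t) - t - 1.0

print(log_via_fenchel(5.0), np.log(5.0))  # both ~= 1.609
```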
Reinforced Training Data Selection for Domain Adaptation
Title | Reinforced Training Data Selection for Domain Adaptation |
Authors | Miaofeng Liu, Yan Song, Hongbin Zou, Tong Zhang |
Abstract | Supervised models suffer from domain shift, where distribution mismatch between domains greatly affects model performance. Training data selection (TDS) has proven to be a promising solution for domain adaptation by leveraging appropriate data. However, conventional TDS methods normally require a predefined threshold, which is neither easy to set nor transferable across tasks, and models are trained separately from the TDS process. To make TDS self-adaptive to data and task, and to combine it with model training, in this paper we propose a reinforcement learning (RL) framework that synchronously searches for training instances relevant to the target domain and learns better representations for them. A selection distribution generator (SDG) is designed to perform the selection and is updated according to rewards computed from the selected data; a predictor is included in the framework to ensure that a task-specific model can be trained on the selected data, and it provides feedback to the rewards. Experimental results on part-of-speech tagging, dependency parsing, and sentiment analysis, as well as ablation studies, illustrate that the proposed framework is not only effective for data selection and representation, but also generalizes to accommodate different NLP tasks. |
Tasks | Dependency Parsing, Domain Adaptation, Part-Of-Speech Tagging, Sentiment Analysis |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1189/ |
PWC | https://paperswithcode.com/paper/reinforced-training-data-selection-for-domain |
Repo | https://github.com/timerstime/SDG4DA |
Framework | tf |
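A toy numpy sketch of the selection loop described above: a Bernoulli selection policy (a stand-in for the SDG) is updated by REINFORCE, with the reward given by a stand-in predictor's dev-set accuracy. Data, predictor, and all sizes are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: a source pool X/y and a small target-domain dev set Xd/yd.
n, d = 200, 5
X = rng.standard_normal((n, d)); y = (X[:, 0] > 0).astype(int)
Xd = rng.standard_normal((40, d)); yd = (Xd[:, 0] > 0).astype(int)

theta = np.zeros(d)      # selection policy: p(select x) = sigmoid(x @ theta)

def dev_accuracy(mask):
    # Stand-in predictor: nearest class centroid, fit on the selected subset.
    if mask[y == 0].sum() == 0 or mask[y == 1].sum() == 0:
        return 0.0
    c0 = X[mask & (y == 0)].mean(axis=0)
    c1 = X[mask & (y == 1)].mean(axis=0)
    pred = np.linalg.norm(Xd - c1, axis=1) < np.linalg.norm(Xd - c0, axis=1)
    return float((pred.astype(int) == yd).mean())

baseline = 0.0
for step in range(200):
    p = 1.0 / (1.0 + np.exp(-X @ theta))     # per-instance selection probs
    mask = rng.random(n) < p                 # sample a training subset
    r = dev_accuracy(mask)                   # reward from the predictor
    baseline = 0.9 * baseline + 0.1 * r      # variance-reducing baseline
    grad = ((mask - p)[:, None] * X).sum(axis=0)   # REINFORCE log-prob grad
    theta += 0.01 * (r - baseline) * grad
```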
Patch-Based Discriminative Feature Learning for Unsupervised Person Re-Identification
Title | Patch-Based Discriminative Feature Learning for Unsupervised Person Re-Identification |
Authors | Qize Yang, Hong-Xing Yu, Ancong Wu, Wei-Shi Zheng |
Abstract | While discriminative local features have been shown to be effective for person re-identification, learning them has required fully pairwise-labelled data, which is expensive to obtain. In this work, we overcome this problem by proposing a patch-based unsupervised learning framework that learns discriminative features from patches instead of whole images. The patch-based learning leverages similarity between patches to learn a discriminative model. Specifically, we develop a PatchNet to select patches from the feature map and learn discriminative features for these patches. To provide effective guidance for the PatchNet to learn discriminative patch features on unlabeled datasets, we propose an unsupervised patch-based discriminative feature learning loss. In addition, we design an image-level feature learning loss that leverages all the patch features of the same image to serve as image-level guidance for the PatchNet. Extensive experiments validate the superiority of our method for unsupervised person re-id. Our code is available at https://github.com/QizeYang/PAUL. |
Tasks | Person Re-Identification, Unsupervised Person Re-Identification |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Yang_Patch-Based_Discriminative_Feature_Learning_for_Unsupervised_Person_Re-Identification_CVPR_2019_paper.html |
PDF | http://openaccess.thecvf.com/content_CVPR_2019/papers/Yang_Patch-Based_Discriminative_Feature_Learning_for_Unsupervised_Person_Re-Identification_CVPR_2019_paper.pdf |
PWC | https://paperswithcode.com/paper/patch-based-discriminative-feature-learning |
Repo | https://github.com/QizeYang/PAUL |
Framework | pytorch |
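A rough PyTorch sketch of the patch-based idea. This is not the paper's PatchNet or its exact losses; the pull/push objective below is a simplified stand-in: cut feature maps into patches with `unfold`, pull each patch toward its best cross-image match, and push the runner-up below a margin.

```python
import torch
import torch.nn.functional as F

# Feature maps of two images (hypothetical shapes from a re-id backbone).
feat_a, feat_b = torch.randn(1, 64, 24, 8), torch.randn(1, 64, 24, 8)

def patch_embeddings(feat, k=4):
    # (1, C*k*k, P) -> (P, C*k*k): L2-normalized descriptors of k x k patches
    patches = F.unfold(feat, kernel_size=k, stride=k).squeeze(0).t()
    return F.normalize(patches, dim=1)

pa, pb = patch_embeddings(feat_a), patch_embeddings(feat_b)
sim = pa @ pb.t()                        # cosine similarity between patches
best = sim.max(dim=1).values             # most similar cross-image patch
runner_up = sim.topk(2, dim=1).values[:, 1]
# Pull each patch toward its best match, push the runner-up below a margin.
loss = (1 - best).mean() + F.relu(runner_up - 0.5).mean()
```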
Exemplar-based Underwater Image Enhancement Augmented by Wavelet Corrected Transforms
Title | Exemplar-based Underwater Image Enhancement Augmented by Wavelet Corrected Transforms |
Authors | Adarsh Jamadandi, Uma Mudenagudi |
Abstract | In this paper we propose a novel deep learning framework to enhance underwater images by augmenting our network with wavelet corrected transformations. Wavelet transforms have recently made their way into deep learning frameworks, and their ability to reconstruct arbitrary signals accurately makes them favourable for many applications. Underwater images are subject to unique distortions, mainly because red-wavelength light is absorbed dominantly, giving a greenish-blue hue. This wavelength-dependent selective absorption of light, together with scattering by suspended particles, introduces non-linear distortions that affect image quality. We propose an encoder-decoder module with wavelet pooling and unpooling as one of the network components to perform progressive whitening and coloring transforms, enhancing underwater images via realistic style transfer. We give a sound theoretical proof of why wavelet transforms are better for signal reconstruction. We demonstrate our proposed framework on popular underwater image datasets, evaluate it using metrics such as SSIM, PSNR and UCIQE, and show that we achieve state-of-the-art results compared to those reported in the literature. |
Tasks | Image Enhancement, Style Transfer |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPRW_2019/html/AAMVEM/Jamadandi_Exemplar-based_Underwater_Image_Enhancement_Augmented_by_Wavelet_Corrected_Transforms_CVPRW_2019_paper.html |
PWC | https://paperswithcode.com/paper/exemplar-based-underwater-image-enhancement |
Repo | https://github.com/AdarshMJ/Underwater-Image-Enhancement-via-Style-Transfer |
Framework | pytorch |
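The wavelet pooling/unpooling component is easy to reproduce in miniature. Below is a single-channel Haar version in PyTorch (a minimal sketch, not the paper's multi-channel module): the four 2x2 filters form an orthonormal basis, so unpooling exactly inverts pooling, which is the signal-preservation property the paper leans on.

```python
import torch
import torch.nn.functional as F

# Orthonormal 2x2 Haar filters: low-low plus three detail subbands.
ll = torch.tensor([[0.5, 0.5], [0.5, 0.5]])
lh = torch.tensor([[-0.5, -0.5], [0.5, 0.5]])
hl = torch.tensor([[-0.5, 0.5], [-0.5, 0.5]])
hh = torch.tensor([[0.5, -0.5], [-0.5, 0.5]])
w = torch.stack([ll, lh, hl, hh]).unsqueeze(1)   # (4, 1, 2, 2)

x = torch.randn(1, 1, 64, 64)                    # toy single-channel input
sub = F.conv2d(x, w, stride=2)                   # wavelet "pooling": 4 subbands
rec = F.conv_transpose2d(sub, w, stride=2)       # wavelet "unpooling"
print(torch.allclose(rec, x, atol=1e-5))         # True: lossless round trip
```

Because the filters are orthonormal, the transposed convolution with the same weights is an exact inverse on the non-overlapping 2x2 blocks, unlike max pooling, which discards information.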
Reliability-aware Dynamic Feature Composition for Name Tagging
Title | Reliability-aware Dynamic Feature Composition for Name Tagging |
Authors | Ying Lin, Liyuan Liu, Heng Ji, Dong Yu, Jiawei Han |
Abstract | Word embeddings are widely used in a variety of tasks and can substantially improve performance. However, their quality is not consistent throughout the vocabulary due to the long-tail distribution of word frequency. Without sufficient contexts, rare word embeddings are usually less reliable than those of common words. However, current models typically trust all word embeddings equally regardless of their reliability and thus may introduce noise and hurt performance. Since names often contain rare and uncommon words, this problem is particularly critical for name tagging. In this paper, we propose a novel reliability-aware name tagging model to tackle this issue. We design a set of word frequency-based reliability signals to indicate the quality of each word embedding. Guided by the reliability signals, the model is able to dynamically select and compose features such as word embedding and character-level representation using gating mechanisms. For example, if an input word is rare, the model relies less on its word embedding and assigns higher weights to its character and contextual features. Experiments on OntoNotes 5.0 show that our model outperforms the baseline model by a 2.7% absolute gain in F-score. In cross-genre experiments on five genres in OntoNotes, our model improves performance for most genre pairs and obtains up to a 5% absolute F-score gain. |
Tasks | Named Entity Recognition, Word Embeddings |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1016/ |
PWC | https://paperswithcode.com/paper/reliability-aware-dynamic-feature-composition |
Repo | https://github.com/limteng-rpi/neural_name_tagging |
Framework | pytorch |
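A minimal PyTorch sketch of the gating idea (shapes, names, and the exact choice of reliability signals are assumptions, not the released model): a gate computed from frequency-based signals interpolates between the word embedding and the character-level representation.

```python
import torch
import torch.nn as nn

class ReliabilityGate(nn.Module):
    """Frequency-guided feature composition: the gate decides, per dimension,
    how much to trust the word embedding vs. the character representation."""
    def __init__(self, dim, num_signals=3):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(num_signals, dim), nn.Sigmoid())

    def forward(self, word_emb, char_repr, freq_signals):
        g = self.gate(freq_signals)              # in (0, 1)
        return g * word_emb + (1 - g) * char_repr

m = ReliabilityGate(dim=100)
out = m(torch.randn(8, 100), torch.randn(8, 100),
        torch.rand(8, 3))   # e.g. scaled log-frequency bins per word
```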
Relation Extraction with Temporal Reasoning Based on Memory Augmented Distant Supervision
Title | Relation Extraction with Temporal Reasoning Based on Memory Augmented Distant Supervision |
Authors | Jianhao Yan, Lin He, Ruqin Huang, Jian Li, Ying Liu |
Abstract | Distant supervision (DS) is an important paradigm for automatic relation extraction. It utilizes an existing knowledge base to collect examples of the relation we intend to extract, and then uses these examples to automatically generate training data. However, the collected examples can be very noisy, posing a significant challenge for obtaining high-quality labels. Previous work has made remarkable progress in predicting relations under distant supervision, but typically ignores the temporal relations among the supervising instances. This paper formulates the problem of relation extraction with temporal reasoning and proposes a solution that predicts whether two given entities participate in a relation at a given time spot. For this purpose, we construct a dataset called WIKI-TIME, which additionally includes the valid period of each relation between two entities in the knowledge base. We propose a novel neural model that incorporates both temporal information encoding and sequential reasoning. The experimental results show that, compared with the best existing models, our model achieves better performance on both the WIKI-TIME dataset and the well-studied NYT-10 dataset. |
Tasks | Relation Extraction |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-1107/ |
PWC | https://paperswithcode.com/paper/relation-extraction-with-temporal-reasoning |
Repo | https://github.com/ElliottYan/DS_Temporal |
Framework | pytorch |
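A hedged PyTorch sketch of the sequential-reasoning ingredient only (not the paper's memory-augmented architecture): encode the time-ordered bag of mentions for an entity pair with a GRU, so the relation prediction at each time spot can depend on earlier evidence.

```python
import torch
import torch.nn as nn

num_mentions, enc_dim, hidden, num_relations = 6, 128, 64, 10
# Sentence encodings of one entity pair's mentions, sorted by timestamp.
mention_encodings = torch.randn(1, num_mentions, enc_dim)

rnn = nn.GRU(enc_dim, hidden, batch_first=True)
clf = nn.Linear(hidden, num_relations)

states, _ = rnn(mention_encodings)   # hidden state after each mention
logits_per_time = clf(states)        # relation scores at every time spot
print(logits_per_time.shape)         # torch.Size([1, 6, 10])
```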
Scaling Recurrent Models via Orthogonal Approximations in Tensor Trains
Title | Scaling Recurrent Models via Orthogonal Approximations in Tensor Trains |
Authors | Ronak Mehta, Rudrasis Chakraborty, Yunyang Xiong, Vikas Singh |
Abstract | Modern deep networks have proven to be very effective for analyzing real world images. However, their application in medical imaging is still in its early stages, primarily due to the large size of three-dimensional images, which requires enormous convolutional or fully connected layers if we treat an image (and not image patches) as a sample. These issues only compound when the focus moves towards longitudinal analysis of 3D image volumes through recurrent structures, and when a point estimate of model parameters is insufficient in scientific applications where a reliability measure is necessary. Using insights from differential geometry, we adapt the tensor train decomposition to construct networks with significantly fewer parameters, allowing us to train powerful recurrent networks on whole brain image volume sequences. We describe the “orthogonal” tensor train, and demonstrate its ability to express a standard network layer both theoretically and empirically. We show its ability to effectively reconstruct whole brain volumes with faster convergence and stronger confidence intervals compared to the standard tensor train decomposition. We provide code and show experiments on the ADNI dataset using image sequences to regress on a cognition-related outcome. |
Tasks | |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Mehta_Scaling_Recurrent_Models_via_Orthogonal_Approximations_in_Tensor_Trains_ICCV_2019_paper.html |
PDF | http://openaccess.thecvf.com/content_ICCV_2019/papers/Mehta_Scaling_Recurrent_Models_via_Orthogonal_Approximations_in_Tensor_Trains_ICCV_2019_paper.pdf |
PWC | https://paperswithcode.com/paper/scaling-recurrent-models-via-orthogonal |
Repo | https://github.com/ronakrm/OTT |
Framework | tf |
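The parameter saving from tensor-train structure can be seen in the simplest two-core case, where the layer is a sum of Kronecker products and the matrix-vector product never materializes the full matrix. A numpy sketch follows; the paper's orthogonal TT additionally constrains the cores, which is omitted here, and the sizes are kept tiny so the dense check is cheap.

```python
import numpy as np

rng = np.random.default_rng(0)
m = n = 16   # full layer is (m*n) x (m*n) = 256 x 256; in practice far larger
R = 4        # TT/Kronecker rank
A = rng.standard_normal((R, m, m))
B = rng.standard_normal((R, n, n))
# Structured parameters: 2*R*m*m = 2048 floats vs 65536 for the dense layer.

def tt_matvec(x):
    # Uses kron(A, B) @ vec(X) == vec(A @ X @ B.T) in numpy's row-major vec.
    X = x.reshape(m, n)
    return sum(A[r] @ X @ B[r].T for r in range(R)).ravel()

W = sum(np.kron(A[r], B[r]) for r in range(R))   # dense matrix, check only
x = rng.standard_normal(m * n)
print(np.allclose(W @ x, tt_matvec(x)))          # True
```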
Minimax Optimal Estimation of Approximate Differential Privacy on Neighboring Databases
Title | Minimax Optimal Estimation of Approximate Differential Privacy on Neighboring Databases |
Authors | Xiyang Liu, Sewoong Oh |
Abstract | Differential privacy has become a widely accepted notion of privacy, leading to the introduction and deployment of numerous privatization mechanisms. However, ensuring the privacy guarantee is an error-prone process, both in designing mechanisms and in implementing them. Both types of errors would be greatly reduced if we had a data-driven approach to verify privacy guarantees from black-box access to a mechanism. We pose this as a property estimation problem and study the fundamental trade-offs between the accuracy of the estimated privacy guarantees and the number of samples required. We introduce a novel estimator that uses polynomial approximation of a carefully chosen degree to optimally trade off bias and variance. With n samples, we show that this estimator matches the performance of the straightforward plug-in estimator with n*log(n) samples, a phenomenon referred to as effective sample size amplification. The minimax optimality of the proposed estimator is proved by comparing it to a matching fundamental lower bound. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/8512-minimax-optimal-estimation-of-approximate-differential-privacy-on-neighboring-databases |
PDF | http://papers.nips.cc/paper/8512-minimax-optimal-estimation-of-approximate-differential-privacy-on-neighboring-databases.pdf |
PWC | https://paperswithcode.com/paper/minimax-optimal-estimation-of-approximate |
Repo | https://github.com/xiyangl3/adp-estimator |
Framework | none |
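For context, here is the straightforward plug-in estimator that the paper's polynomial-approximation estimator provably improves on, applied to a toy discrete mechanism run on neighboring databases. The mechanism and all parameters are invented for illustration.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)

def mechanism(count, num_samples):
    # Toy discrete mechanism: count plus two-sided geometric noise.
    return count + rng.geometric(0.5, num_samples) - rng.geometric(0.5, num_samples)

def empirical_dist(samples):
    counts = Counter(samples.tolist())
    return {o: c / len(samples) for o, c in counts.items()}

# Neighboring databases differ in one record: true counts 10 vs 11.
n = 200_000
p, q = empirical_dist(mechanism(10, n)), empirical_dist(mechanism(11, n))

eps = 0.5
# Plug-in estimate of delta(eps) = sum_o max(p(o) - e^eps * q(o), 0).
delta_hat = sum(max(pv - np.exp(eps) * q.get(o, 0.0), 0.0) for o, pv in p.items())
print(delta_hat)
```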
RNN Embeddings for Identifying Difficult to Understand Medical Words
Title | RNN Embeddings for Identifying Difficult to Understand Medical Words |
Authors | Hanna Pylieva, Artem Chernodub, Natalia Grabar, Thierry Hamon |
Abstract | Patients and their families often require a better understanding of medical information provided by doctors. We address this issue by improving the identification of difficult-to-understand medical words. We introduce novel RNN-derived embeddings, FrnnMUTE (French RNN Medical Understandability Text Embeddings), which reach up to an 87.0 F1 score in the identification of difficult words. We also find that adding pre-trained FastText word embeddings to the feature set substantially improves the performance of the model that classifies words according to their difficulty. We study the generalizability of different models through three cross-validation scenarios that allow testing classifiers in real-world conditions: understanding of medical words by new users, and classification of new unseen words by the automatic models. The FrnnMUTE embeddings and the categorization code are made available for research. |
Tasks | Word Embeddings |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5011/ |
PWC | https://paperswithcode.com/paper/rnn-embeddings-for-identifying-difficult-to |
Repo | https://github.com/hpylieva/FrnnMUTE |
Framework | none |
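A minimal PyTorch sketch of the overall shape of such a model (sizes and names are assumptions, not the released code): a character-level RNN yields the word embedding, which is concatenated with pretrained FastText features before a binary difficult/not-difficult classifier.

```python
import torch
import torch.nn as nn

class WordDifficultyModel(nn.Module):
    def __init__(self, n_chars=100, char_dim=32, rnn_dim=64, fasttext_dim=300):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.rnn = nn.GRU(char_dim, rnn_dim, batch_first=True)
        self.clf = nn.Linear(rnn_dim + fasttext_dim, 2)

    def forward(self, char_ids, fasttext_vec):
        _, h = self.rnn(self.char_emb(char_ids))   # h: (1, B, rnn_dim)
        feats = torch.cat([h.squeeze(0), fasttext_vec], dim=1)
        return self.clf(feats)

model = WordDifficultyModel()
logits = model(torch.randint(0, 100, (4, 12)),   # 4 words, 12 chars each
               torch.randn(4, 300))              # their FastText vectors
```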
Putting Evaluation in Context: Contextual Embeddings Improve Machine Translation Evaluation
Title | Putting Evaluation in Context: Contextual Embeddings Improve Machine Translation Evaluation |
Authors | Nitika Mathur, Timothy Baldwin, Trevor Cohn |
Abstract | Accurate, automatic evaluation of machine translation is critical for system tuning and for evaluating progress in the field. We propose a simple unsupervised metric, as well as additional supervised metrics, which rely on contextual word embeddings to encode the translation and reference sentences. We find that these models rival or surpass all existing metrics in the WMT 2017 sentence-level and system-level tracks, and our trained model has a substantially higher correlation with human judgements than all existing metrics on the WMT 2017 to-English sentence-level dataset. |
Tasks | Machine Translation, Word Embeddings |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1269/ |
PWC | https://paperswithcode.com/paper/putting-evaluation-in-context-contextual |
Repo | https://github.com/nitikam/mteval-in-context |
Framework | none |
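The unsupervised variant is essentially pooling plus cosine similarity. A numpy sketch, where the random arrays stand in for contextual token embeddings from any encoder; the supervised variants instead feed the pooled vectors to a regressor trained on human judgements.

```python
import numpy as np

def sentence_score(hyp_tokens, ref_tokens):
    # Mean-pool contextual token embeddings, score by cosine similarity.
    hyp, ref = hyp_tokens.mean(axis=0), ref_tokens.mean(axis=0)
    return hyp @ ref / (np.linalg.norm(hyp) * np.linalg.norm(ref))

rng = np.random.default_rng(0)
# (num_tokens, dim) stand-ins for contextual encoder outputs.
print(sentence_score(rng.standard_normal((12, 768)),
                     rng.standard_normal((15, 768))))
```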
Globally Optimal Learning for Structured Elliptical Losses
Title | Globally Optimal Learning for Structured Elliptical Losses |
Authors | Yoav Wald, Nofar Noy, Gal Elidan, Ami Wiesel |
Abstract | Heavy-tailed and contaminated data are common in various applications of machine learning. A standard technique for handling regression tasks that involve such data is to use robust losses, e.g., the popular Huber loss. In structured problems, however, where there are multiple labels and structural constraints on the labels are imposed (or learned), robust optimization is challenging, and more often than not the loss used is simply the negative log-likelihood of a Gaussian Markov random field. In this work, we analyze robust alternatives. Theoretical understanding of such problems is quite limited, with guarantees on optimization given only for special cases and non-structured settings. The core of the difficulty is the non-convexity of the objective function, implying that standard optimization algorithms may converge to sub-optimal critical points. Our analysis focuses on loss functions that arise from elliptical distributions, which appealingly include most loss functions proposed in the literature as special cases. We show that, even though these problems are non-convex, they can be optimized efficiently. Concretely, we prove that in the limit of infinite training data, due to algebraic properties of the problem, all stationary points are globally optimal. Finally, we demonstrate the empirical appeal of using these losses for regression on synthetic and real-life data. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9504-globally-optimal-learning-for-structured-elliptical-losses |
PDF | http://papers.nips.cc/paper/9504-globally-optimal-learning-for-structured-elliptical-losses.pdf |
PWC | https://paperswithcode.com/paper/globally-optimal-learning-for-structured |
Repo | https://github.com/yowald/elliptical-losses |
Framework | tf |
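A small numpy sketch of the loss family in question (the specific loss, step size, and fixed precision matrix are illustrative choices, not the paper's algorithm): multi-output regression with an elliptical loss g(rᵀΛr), where g(t) = sqrt(t) gives a heavy-tail-robust alternative to the Gaussian MRF objective g(t) = t.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 500, 8, 3
X = rng.standard_normal((n, d))
W_true = rng.standard_normal((k, d))
Y = X @ W_true.T + rng.standard_t(df=2, size=(n, k))   # heavy-tailed noise

Lam = np.eye(k)                 # label precision structure (fixed here)
W = np.zeros((k, d))
for _ in range(2000):           # plain (sub)gradient descent
    R = Y - X @ W.T
    q = np.einsum('ij,jk,ik->i', R, Lam, R)     # r^T Lam r per sample
    wgt = 0.5 / np.sqrt(q + 1e-8)               # g'(q) for g(t) = sqrt(t)
    grad = -2 * Lam @ (wgt[:, None] * R).T @ X
    W -= 0.2 * grad / n

print(np.linalg.norm(W - W_true))   # small despite the contaminated noise
```

Note that g(t) = sqrt(rᵀΛr) is a norm of an affine function of W, hence convex in W, which is why plain gradient descent suffices in this toy; the paper's analysis covers the harder case where Λ is learned.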
Trivializations for Gradient-Based Optimization on Manifolds
Title | Trivializations for Gradient-Based Optimization on Manifolds |
Authors | Mario Lezcano Casado |
Abstract | We introduce a framework to study the transformation of problems with manifold constraints into unconstrained problems through parametrizations in terms of a Euclidean space. We call these parametrizations trivializations. We prove conditions under which a trivialization is sound in the context of gradient-based optimization and we show how two large families of trivializations have overall favorable properties, but also suffer from a performance issue. We then introduce dynamic trivializations, which solve this problem, and we show how these form a family of optimization methods that lie between trivializations and Riemannian gradient descent, and combine the benefits of both of them. We then show how to implement these two families of trivializations in practice for different matrix manifolds. To this end, we prove a formula for the gradient of the exponential of matrices, which can be of practical interest on its own. Finally, we show how dynamic trivializations improve the performance of existing methods on standard tasks designed to test long-term memory within neural networks. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9115-trivializations-for-gradient-based-optimization-on-manifolds |
PDF | http://papers.nips.cc/paper/9115-trivializations-for-gradient-based-optimization-on-manifolds.pdf |
PWC | https://paperswithcode.com/paper/trivializations-for-gradient-based-1 |
Repo | https://github.com/Lezcano/expRNN |
Framework | pytorch |
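The simplest instance of the idea is the orthogonal group: parametrize Q = exp(X − Xᵀ) and optimize the unconstrained X with an ordinary optimizer. A PyTorch sketch follows; the fitting task is a toy assumption, but the parametrization is the one the paper (and the expRNN repo) builds on.

```python
import torch

torch.manual_seed(0)
d = 8
target = torch.linalg.qr(torch.randn(d, d)).Q   # random orthogonal target
if torch.det(target) < 0:
    target[:, 0] = -target[:, 0]                # exp(skew) only reaches SO(d)

X = torch.zeros(d, d, requires_grad=True)       # unconstrained parameter
opt = torch.optim.SGD([X], lr=0.1)
for _ in range(500):
    Q = torch.matrix_exp(X - X.t())             # always exactly orthogonal
    loss = (Q - target).pow(2).sum()
    opt.zero_grad(); loss.backward(); opt.step()

print(loss.item())                                  # should approach 0
print(torch.dist(Q.t() @ Q, torch.eye(d)).item())   # ~0 at every step
```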
Learning to Learn By Self-Critique
Title | Learning to Learn By Self-Critique |
Authors | Antreas Antoniou, Amos J. Storkey |
Abstract | In few-shot learning, a machine learning system must learn from a small set of labelled examples of a specific task, such that it generalizes well to new unlabelled examples of the same task. Given the limited availability of labelled examples in such tasks, we wish to make use of all available information. For this reason, we propose using transductive meta-learning in few-shot settings to obtain state-of-the-art few-shot performance. Usually a model learns task-specific information from a small training set (the support-set) and subsequently produces predictions on a small unlabelled validation set (the target-set). The target-set contains additional task-specific information which is not utilized by existing few-shot learning methods. This is a challenge requiring approaches beyond current methods: at inference time, the target-set contains only input data-points, so discriminative learning cannot be used. In this paper, we propose a framework called Self-Critique and Adapt, or SCA. This approach learns a label-free loss function, parameterized as a neural network, which leverages target-set information. A base model learns on a support-set using existing methods (e.g. stochastic gradient descent combined with the cross-entropy loss) and is then updated for the incoming target-task using the newly learned loss function (i.e. the meta-learned label-free loss). This unsupervised loss function is optimized such that the learned model achieves higher generalization performance. Experiments demonstrate that SCA offers substantially higher, state-of-the-art generalization performance compared to baselines that adapt only on the support-set. |
Tasks | Few-Shot Learning, Meta-Learning |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9185-learning-to-learn-by-self-critique |
PDF | http://papers.nips.cc/paper/9185-learning-to-learn-by-self-critique.pdf |
PWC | https://paperswithcode.com/paper/learning-to-learn-by-self-critique-1 |
Repo | https://github.com/AntreasAntoniou/Learning_to_Learn_via_Self-Critique |
Framework | pytorch |
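A stripped-down PyTorch sketch of just the inner loop (model, critic, and sizes are assumptions; the outer loop that meta-trains the critic against target labels is omitted): standard supervised steps on the support set, followed by one extra label-free step driven by a critic applied to the model's own target-set predictions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(32, 5)                               # toy base model
critic = nn.Sequential(nn.Linear(5, 16), nn.ReLU(),    # label-free loss net
                       nn.Linear(16, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.1)

support_x, support_y = torch.randn(10, 32), torch.randint(0, 5, (10,))
target_x = torch.randn(15, 32)                         # no labels available

for _ in range(5):                                     # support-set steps
    loss = F.cross_entropy(model(support_x), support_y)
    opt.zero_grad(); loss.backward(); opt.step()

# Self-critique step: adapt using the critic's score of target predictions.
critic_loss = critic(F.softmax(model(target_x), dim=1)).mean()
opt.zero_grad(); critic_loss.backward(); opt.step()
```

In SCA proper, the critic is meta-learned across tasks so that this extra step actually improves target-set accuracy; here it is untrained and serves only to show where the label-free gradient comes from.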
On Relating Explanations and Adversarial Examples
Title | On Relating Explanations and Adversarial Examples |
Authors | Alexey Ignatiev, Nina Narodytska, Joao Marques-Silva |
Abstract | The importance of explanations (XPs) of machine learning (ML) model predictions and of adversarial examples (AEs) cannot be overstated, with both arguably being essential for the practical success of ML in different settings. There has been recent work on understanding and assessing the relationship between XPs and AEs. However, such work has been mostly experimental, and a sound theoretical relationship has been elusive. This paper demonstrates that explanations and adversarial examples are related by a generalized form of hitting set duality, which extends earlier work on hitting set duality observed in model-based diagnosis and knowledge compilation. Furthermore, the paper proposes algorithms that enable computing adversarial examples from explanations and vice versa. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9717-on-relating-explanations-and-adversarial-examples |
PDF | http://papers.nips.cc/paper/9717-on-relating-explanations-and-adversarial-examples.pdf |
PWC | https://paperswithcode.com/paper/on-relating-explanations-and-adversarial |
Repo | https://github.com/alexeyignatiev/xpce-duality |
Framework | none |
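The duality itself can be played with on toy sets. In the brute-force sketch below, each explanation is viewed as a set of features; the (subset-)minimal hitting sets of the explanations then correspond to the minimal adversarial examples, and vice versa. The feature sets are hypothetical.

```python
from itertools import combinations

def minimal_hitting_sets(sets):
    universe = sorted(set().union(*sets))
    # All subsets that intersect ("hit") every given set...
    hitters = [set(c) for k in range(1, len(universe) + 1)
               for c in combinations(universe, k)
               if all(set(c) & s for s in sets)]
    # ...kept only if no strictly smaller hitter is contained in them.
    return [h for h in hitters if not any(o < h for o in hitters)]

# Hypothetical feature subsets behind three explanations of a prediction:
explanations = [{1, 2}, {2, 3}, {1, 3}]
print(minimal_hitting_sets(explanations))   # [{1, 2}, {1, 3}, {2, 3}]
```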