Paper Group NAWR 10
A Condition Number for Joint Optimization of Cycle-Consistent Networks
Title | A Condition Number for Joint Optimization of Cycle-Consistent Networks |
Authors | Leonidas J. Guibas, Qixing Huang, Zhenxiao Liang |
Abstract | A recent trend in optimizing maps, such as dense correspondences between objects or neural networks between pairs of domains, is to optimize them jointly. In this context, there is a natural cycle-consistency constraint, which regularizes composite maps associated with cycles, i.e., they are forced to be identity maps. However, as there is an exponential number of cycles in a graph, how to sample a subset of cycles becomes critical for efficient and effective enforcement of the cycle-consistency constraint. This paper presents an algorithm that selects a subset of weighted cycles to minimize a condition number of the induced joint optimization problem. Experimental results on benchmark datasets demonstrate the effectiveness of our approach for optimizing dense correspondences between 3D shapes and neural networks for predicting dense image flows. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/8386-a-condition-number-for-joint-optimization-of-cycle-consistent-networks |
PDF | http://papers.nips.cc/paper/8386-a-condition-number-for-joint-optimization-of-cycle-consistent-networks.pdf |
PWC | https://paperswithcode.com/paper/a-condition-number-for-joint-optimization-of |
Repo | https://github.com/huangqx/NeurIPS19_Cycle |
Framework | none |
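The constraint being enforced is easy to state concretely. Below is a minimal numpy sketch of the cycle-consistency residual on a single 3-cycle, with toy linear maps standing in for the networks; everything here (maps, sizes, noise level) is an illustrative assumption, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16
# Toy linear maps on a 3-cycle of domains A -> B -> C -> A.
F_ab = np.linalg.qr(rng.standard_normal((d, d)))[0]
F_bc = np.linalg.qr(rng.standard_normal((d, d)))[0]
F_ca = (F_bc @ F_ab).T + 0.01 * rng.standard_normal((d, d))  # near-inverse

def cycle_residual(maps):
    """Frobenius gap between the composite map along a cycle and identity."""
    comp = np.eye(d)
    for F in maps:          # applies F_ab first, then F_bc, then F_ca
        comp = F @ comp
    return np.linalg.norm(comp - np.eye(d))

print(cycle_residual([F_ab, F_bc, F_ca]))  # small but nonzero
```

The joint objective sums such residuals over a weighted set of sampled cycles; the paper's contribution is choosing the cycles and weights so that a condition number of the induced optimization problem is minimized.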
On Fenchel Mini-Max Learning
Title | On Fenchel Mini-Max Learning |
Authors | Chenyang Tao, Liqun Chen, Shuyang Dai, Junya Chen, Ke Bai, Dong Wang, Jianfeng Feng, Wenlian Lu, Georgiy Bobashev, Lawrence Carin |
Abstract | Inference, estimation, sampling and likelihood evaluation are four primary goals of probabilistic modeling. Practical considerations often force modeling approaches to make compromises between these objectives. We present a novel probabilistic learning framework, called Fenchel Mini-Max Learning (FML), that accommodates all four desiderata in a flexible and scalable manner. Our derivation is rooted in classical maximum likelihood estimation, and it overcomes a longstanding challenge that prevents unbiased estimation of unnormalized statistical models. By reformulating MLE as a mini-max game, FML enjoys an unbiased training objective that (i) does not explicitly involve the intractable normalizing constant and (ii) is directly amenable to stochastic gradient descent optimization. To demonstrate the utility of the proposed approach, we consider learning unnormalized statistical models, nonparametric density estimation and training generative models, with encouraging empirical results presented. |
Tasks | Density Estimation |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9230-on-fenchel-mini-max-learning |
PDF | http://papers.nips.cc/paper/9230-on-fenchel-mini-max-learning.pdf |
PWC | https://paperswithcode.com/paper/on-fenchel-mini-max-learning |
Repo | https://github.com/chenyang-tao/FML |
Framework | none |
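The key trick is a variational rewrite of the log-normalizer. Here is a self-contained numerical check of the scalar identity that, in spirit, underlies FML, log u = min over v > 0 of (u·v − log v − 1), which lets an intractable constant enter the objective linearly (a toy illustration, not the paper's training code):

```python
import numpy as np

def log_via_fenchel(u, steps=2000, lr=0.05):
    # Minimize h(t) = u*exp(t) - t - 1 over t, where v = exp(t) keeps v > 0.
    # At the optimum v* = 1/u, and h(t*) = log(u).
    t = 0.0
    for _ in range(steps):
        t -= lr * (u * np.exp(t) - 1.0)   # h'(t)
    return u * np.exp(t) - t - 1.0

print(log_via_fenchel(5.0), np.log(5.0))  # both ~= 1.609
```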
Reinforced Training Data Selection for Domain Adaptation
Title | Reinforced Training Data Selection for Domain Adaptation |
Authors | Miaofeng Liu, Yan Song, Hongbin Zou, Tong Zhang |
Abstract | Supervised models suffer from domain shift, where distribution mismatch between domains greatly affects model performance. Training data selection (TDS) has proven to be a promising solution for domain adaptation by leveraging appropriate data. However, conventional TDS methods normally require a predefined threshold, which is neither easy to set nor transferable across tasks, and models are trained separately from the TDS process. To make TDS self-adaptive to data and task, and to combine it with model training, in this paper we propose a reinforcement learning (RL) framework that synchronously searches for training instances relevant to the target domain and learns better representations for them. A selection distribution generator (SDG) is designed to perform the selection and is updated according to rewards computed from the selected data; a predictor is included in the framework to ensure that a task-specific model can be trained on the selected data, and it provides feedback to the rewards. Experimental results on part-of-speech tagging, dependency parsing, and sentiment analysis, as well as ablation studies, illustrate that the proposed framework is not only effective for data selection and representation, but also generalizes to accommodate different NLP tasks. |
Tasks | Dependency Parsing, Domain Adaptation, Part-Of-Speech Tagging, Sentiment Analysis |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1189/ |
PWC | https://paperswithcode.com/paper/reinforced-training-data-selection-for-domain |
Repo | https://github.com/timerstime/SDG4DA |
Framework | tf |
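A toy numpy sketch of the selection loop described above: a Bernoulli selection policy (a stand-in for the SDG) is updated by REINFORCE, with the reward given by a stand-in predictor's dev-set accuracy. Data, predictor, and all sizes are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: a source pool X/y and a small target-domain dev set Xd/yd.
n, d = 200, 5
X = rng.standard_normal((n, d)); y = (X[:, 0] > 0).astype(int)
Xd = rng.standard_normal((40, d)); yd = (Xd[:, 0] > 0).astype(int)

theta = np.zeros(d)      # selection policy: p(select x) = sigmoid(x @ theta)

def dev_accuracy(mask):
    # Stand-in predictor: nearest class centroid, fit on the selected subset.
    if mask[y == 0].sum() == 0 or mask[y == 1].sum() == 0:
        return 0.0
    c0 = X[mask & (y == 0)].mean(axis=0)
    c1 = X[mask & (y == 1)].mean(axis=0)
    pred = np.linalg.norm(Xd - c1, axis=1) < np.linalg.norm(Xd - c0, axis=1)
    return float((pred.astype(int) == yd).mean())

baseline = 0.0
for step in range(200):
    p = 1.0 / (1.0 + np.exp(-X @ theta))     # per-instance selection probs
    mask = rng.random(n) < p                 # sample a training subset
    r = dev_accuracy(mask)                   # reward from the predictor
    baseline = 0.9 * baseline + 0.1 * r      # variance-reducing baseline
    grad = ((mask - p)[:, None] * X).sum(axis=0)   # REINFORCE log-prob grad
    theta += 0.01 * (r - baseline) * grad
```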
Patch-Based Discriminative Feature Learning for Unsupervised Person Re-Identification
Title | Patch-Based Discriminative Feature Learning for Unsupervised Person Re-Identification |
Authors | Qize Yang, Hong-Xing Yu, Ancong Wu, Wei-Shi Zheng |
Abstract | While discriminative local features have been shown to be effective for person re-identification, learning them has required fully pairwise-labelled data, which is expensive to obtain. In this work, we overcome this problem by proposing a patch-based unsupervised learning framework that learns discriminative features from patches instead of whole images. The patch-based learning leverages similarity between patches to learn a discriminative model. Specifically, we develop a PatchNet to select patches from the feature map and learn discriminative features for these patches. To provide effective guidance for the PatchNet to learn discriminative patch features on unlabeled datasets, we propose an unsupervised patch-based discriminative feature learning loss. In addition, we design an image-level feature learning loss that leverages all the patch features of the same image to serve as image-level guidance for the PatchNet. Extensive experiments validate the superiority of our method for unsupervised person re-id. Our code is available at https://github.com/QizeYang/PAUL. |
Tasks | Person Re-Identification, Unsupervised Person Re-Identification |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Yang_Patch-Based_Discriminative_Feature_Learning_for_Unsupervised_Person_Re-Identification_CVPR_2019_paper.html |
PDF | http://openaccess.thecvf.com/content_CVPR_2019/papers/Yang_Patch-Based_Discriminative_Feature_Learning_for_Unsupervised_Person_Re-Identification_CVPR_2019_paper.pdf |
PWC | https://paperswithcode.com/paper/patch-based-discriminative-feature-learning |
Repo | https://github.com/QizeYang/PAUL |
Framework | pytorch |
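A rough PyTorch sketch of the patch-based idea. This is not the paper's PatchNet or its exact losses; the pull/push objective below is a simplified stand-in: cut feature maps into patches with `unfold`, pull each patch toward its best cross-image match, and push the runner-up below a margin.

```python
import torch
import torch.nn.functional as F

# Feature maps of two images (hypothetical shapes from a re-id backbone).
feat_a, feat_b = torch.randn(1, 64, 24, 8), torch.randn(1, 64, 24, 8)

def patch_embeddings(feat, k=4):
    # (1, C*k*k, P) -> (P, C*k*k): L2-normalized descriptors of k x k patches
    patches = F.unfold(feat, kernel_size=k, stride=k).squeeze(0).t()
    return F.normalize(patches, dim=1)

pa, pb = patch_embeddings(feat_a), patch_embeddings(feat_b)
sim = pa @ pb.t()                        # cosine similarity between patches
best = sim.max(dim=1).values             # most similar cross-image patch
runner_up = sim.topk(2, dim=1).values[:, 1]
# Pull each patch toward its best match, push the runner-up below a margin.
loss = (1 - best).mean() + F.relu(runner_up - 0.5).mean()
```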
Exemplar-based Underwater Image Enhancement Augmented by Wavelet Corrected Transforms
Title | Exemplar-based Underwater Image Enhancement Augmented by Wavelet Corrected Transforms |
Authors | Adarsh Jamadandi, Uma Mudenagudi |
Abstract | In this paper we propose a novel deep learning framework to enhance underwater images by augmenting our network with wavelet corrected transformations. Wavelet transforms have recently made their way into deep learning frameworks, and their ability to reconstruct arbitrary signals accurately makes them favourable for many applications. Underwater images are subject to unique distortions, mainly because red-wavelength light is absorbed dominantly, giving a greenish-blue hue. This wavelength-dependent selective absorption of light, together with scattering by suspended particles, introduces non-linear distortions that affect image quality. We propose an encoder-decoder module with wavelet pooling and unpooling as one of the network components to perform progressive whitening and coloring transforms, enhancing underwater images via realistic style transfer. We give a sound theoretical proof of why wavelet transforms are better for signal reconstruction. We demonstrate our proposed framework on popular underwater image datasets, evaluate it using metrics such as SSIM, PSNR and UCIQE, and show that we achieve state-of-the-art results compared to those reported in the literature. |
Tasks | Image Enhancement, Style Transfer |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPRW_2019/html/AAMVEM/Jamadandi_Exemplar-based_Underwater_Image_Enhancement_Augmented_by_Wavelet_Corrected_Transforms_CVPRW_2019_paper.html |
PWC | https://paperswithcode.com/paper/exemplar-based-underwater-image-enhancement |
Repo | https://github.com/AdarshMJ/Underwater-Image-Enhancement-via-Style-Transfer |
Framework | pytorch |
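The wavelet pooling/unpooling component is easy to reproduce in miniature. Below is a single-channel Haar version in PyTorch (a minimal sketch, not the paper's multi-channel module): the four 2x2 filters form an orthonormal basis, so unpooling exactly inverts pooling, which is the signal-preservation property the paper leans on.

```python
import torch
import torch.nn.functional as F

# Orthonormal 2x2 Haar filters: low-low plus three detail subbands.
ll = torch.tensor([[0.5, 0.5], [0.5, 0.5]])
lh = torch.tensor([[-0.5, -0.5], [0.5, 0.5]])
hl = torch.tensor([[-0.5, 0.5], [-0.5, 0.5]])
hh = torch.tensor([[0.5, -0.5], [-0.5, 0.5]])
w = torch.stack([ll, lh, hl, hh]).unsqueeze(1)   # (4, 1, 2, 2)

x = torch.randn(1, 1, 64, 64)                    # toy single-channel input
sub = F.conv2d(x, w, stride=2)                   # wavelet "pooling": 4 subbands
rec = F.conv_transpose2d(sub, w, stride=2)       # wavelet "unpooling"
print(torch.allclose(rec, x, atol=1e-5))         # True: lossless round trip
```

Because the filters are orthonormal, the transposed convolution with the same weights is an exact inverse on the non-overlapping 2x2 blocks, unlike max pooling, which discards information.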
Reliability-aware Dynamic Feature Composition for Name Tagging
Title | Reliability-aware Dynamic Feature Composition for Name Tagging |
Authors | Ying Lin, Liyuan Liu, Heng Ji, Dong Yu, Jiawei Han |
Abstract | Word embeddings are widely used in a variety of tasks and can substantially improve performance. However, their quality is not consistent throughout the vocabulary due to the long-tail distribution of word frequency. Without sufficient contexts, rare word embeddings are usually less reliable than those of common words. However, current models typically trust all word embeddings equally regardless of their reliability and thus may introduce noise and hurt performance. Since names often contain rare and uncommon words, this problem is particularly critical for name tagging. In this paper, we propose a novel reliability-aware name tagging model to tackle this issue. We design a set of word frequency-based reliability signals to indicate the quality of each word embedding. Guided by the reliability signals, the model is able to dynamically select and compose features such as word embedding and character-level representation using gating mechanisms. For example, if an input word is rare, the model relies less on its word embedding and assigns higher weights to its character and contextual features. Experiments on OntoNotes 5.0 show that our model outperforms the baseline model by a 2.7% absolute gain in F-score. In cross-genre experiments on five genres in OntoNotes, our model improves performance for most genre pairs and obtains up to a 5% absolute F-score gain. |
Tasks | Named Entity Recognition, Word Embeddings |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1016/ |
PWC | https://paperswithcode.com/paper/reliability-aware-dynamic-feature-composition |
Repo | https://github.com/limteng-rpi/neural_name_tagging |
Framework | pytorch |
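A minimal PyTorch sketch of the gating idea (shapes, names, and the exact choice of reliability signals are assumptions, not the released model): a gate computed from frequency-based signals interpolates between the word embedding and the character-level representation.

```python
import torch
import torch.nn as nn

class ReliabilityGate(nn.Module):
    """Frequency-guided feature composition: the gate decides, per dimension,
    how much to trust the word embedding vs. the character representation."""
    def __init__(self, dim, num_signals=3):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(num_signals, dim), nn.Sigmoid())

    def forward(self, word_emb, char_repr, freq_signals):
        g = self.gate(freq_signals)              # in (0, 1)
        return g * word_emb + (1 - g) * char_repr

m = ReliabilityGate(dim=100)
out = m(torch.randn(8, 100), torch.randn(8, 100),
        torch.rand(8, 3))   # e.g. scaled log-frequency bins per word
```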
Relation Extraction with Temporal Reasoning Based on Memory Augmented Distant Supervision
Title | Relation Extraction with Temporal Reasoning Based on Memory Augmented Distant Supervision |
Authors | Jianhao Yan, Lin He, Ruqin Huang, Jian Li, Ying Liu |
Abstract | Distant supervision (DS) is an important paradigm for automatic relation extraction. It utilizes an existing knowledge base to collect examples of the relation we intend to extract, and then uses these examples to automatically generate training data. However, the collected examples can be very noisy, posing a significant challenge for obtaining high-quality labels. Previous work has made remarkable progress in predicting relations under distant supervision, but typically ignores the temporal relations among the supervising instances. This paper formulates the problem of relation extraction with temporal reasoning and proposes a solution that predicts whether two given entities participate in a relation at a given time spot. For this purpose, we construct a dataset called WIKI-TIME, which additionally includes the valid period of each relation between two entities in the knowledge base. We propose a novel neural model that incorporates both temporal information encoding and sequential reasoning. The experimental results show that, compared with the best existing models, our model achieves better performance on both the WIKI-TIME dataset and the well-studied NYT-10 dataset. |
Tasks | Relation Extraction |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-1107/ |
PWC | https://paperswithcode.com/paper/relation-extraction-with-temporal-reasoning |
Repo | https://github.com/ElliottYan/DS_Temporal |
Framework | pytorch |
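A hedged PyTorch sketch of the sequential-reasoning ingredient only (not the paper's memory-augmented architecture): encode the time-ordered bag of mentions for an entity pair with a GRU, so the relation prediction at each time spot can depend on earlier evidence.

```python
import torch
import torch.nn as nn

num_mentions, enc_dim, hidden, num_relations = 6, 128, 64, 10
# Sentence encodings of one entity pair's mentions, sorted by timestamp.
mention_encodings = torch.randn(1, num_mentions, enc_dim)

rnn = nn.GRU(enc_dim, hidden, batch_first=True)
clf = nn.Linear(hidden, num_relations)

states, _ = rnn(mention_encodings)   # hidden state after each mention
logits_per_time = clf(states)        # relation scores at every time spot
print(logits_per_time.shape)         # torch.Size([1, 6, 10])
```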
Scaling Recurrent Models via Orthogonal Approximations in Tensor Trains
Title | Scaling Recurrent Models via Orthogonal Approximations in Tensor Trains |
Authors | Ronak Mehta, Rudrasis Chakraborty, Yunyang Xiong, Vikas Singh |
Abstract | Modern deep networks have proven to be very effective for analyzing real world images. However, their application in medical imaging is still in its early stages, primarily due to the large size of three-dimensional images, which requires enormous convolutional or fully connected layers if we treat an image (and not image patches) as a sample. These issues only compound when the focus moves towards longitudinal analysis of 3D image volumes through recurrent structures, and when a point estimate of model parameters is insufficient in scientific applications where a reliability measure is necessary. Using insights from differential geometry, we adapt the tensor train decomposition to construct networks with significantly fewer parameters, allowing us to train powerful recurrent networks on whole brain image volume sequences. We describe the “orthogonal” tensor train, and demonstrate its ability to express a standard network layer both theoretically and empirically. We show its ability to effectively reconstruct whole brain volumes with faster convergence and stronger confidence intervals compared to the standard tensor train decomposition. We provide code and show experiments on the ADNI dataset using image sequences to regress on a cognition-related outcome. |
Tasks | |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Mehta_Scaling_Recurrent_Models_via_Orthogonal_Approximations_in_Tensor_Trains_ICCV_2019_paper.html |
PDF | http://openaccess.thecvf.com/content_ICCV_2019/papers/Mehta_Scaling_Recurrent_Models_via_Orthogonal_Approximations_in_Tensor_Trains_ICCV_2019_paper.pdf |
PWC | https://paperswithcode.com/paper/scaling-recurrent-models-via-orthogonal |
Repo | https://github.com/ronakrm/OTT |
Framework | tf |
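The parameter saving from tensor-train structure can be seen in the simplest two-core case, where the layer is a sum of Kronecker products and the matrix-vector product never materializes the full matrix. A numpy sketch follows; the paper's orthogonal TT additionally constrains the cores, which is omitted here, and the sizes are kept tiny so the dense check is cheap.

```python
import numpy as np

rng = np.random.default_rng(0)
m = n = 16   # full layer is (m*n) x (m*n) = 256 x 256; in practice far larger
R = 4        # TT/Kronecker rank
A = rng.standard_normal((R, m, m))
B = rng.standard_normal((R, n, n))
# Structured parameters: 2*R*m*m = 2048 floats vs 65536 for the dense layer.

def tt_matvec(x):
    # Uses kron(A, B) @ vec(X) == vec(A @ X @ B.T) in numpy's row-major vec.
    X = x.reshape(m, n)
    return sum(A[r] @ X @ B[r].T for r in range(R)).ravel()

W = sum(np.kron(A[r], B[r]) for r in range(R))   # dense matrix, check only
x = rng.standard_normal(m * n)
print(np.allclose(W @ x, tt_matvec(x)))          # True
```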
Minimax Optimal Estimation of Approximate Differential Privacy on Neighboring Databases
Title | Minimax Optimal Estimation of Approximate Differential Privacy on Neighboring Databases |
Authors | Xiyang Liu, Sewoong Oh |
Abstract | Differential privacy has become a widely accepted notion of privacy, leading to the introduction and deployment of numerous privatization mechanisms. However, ensuring the privacy guarantee is an error-prone process, both in designing mechanisms and in implementing them. Both types of errors would be greatly reduced if we had a data-driven approach to verify privacy guarantees from black-box access to a mechanism. We pose this as a property estimation problem and study the fundamental trade-offs between the accuracy of the estimated privacy guarantees and the number of samples required. We introduce a novel estimator that uses polynomial approximation of a carefully chosen degree to optimally trade off bias and variance. With n samples, we show that this estimator matches the performance of the straightforward plug-in estimator with n*log(n) samples, a phenomenon referred to as effective sample size amplification. The minimax optimality of the proposed estimator is proved by comparing it to a matching fundamental lower bound. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/8512-minimax-optimal-estimation-of-approximate-differential-privacy-on-neighboring-databases |
PDF | http://papers.nips.cc/paper/8512-minimax-optimal-estimation-of-approximate-differential-privacy-on-neighboring-databases.pdf |
PWC | https://paperswithcode.com/paper/minimax-optimal-estimation-of-approximate |
Repo | https://github.com/xiyangl3/adp-estimator |
Framework | none |
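For context, here is the straightforward plug-in estimator that the paper's polynomial-approximation estimator provably improves on, applied to a toy discrete mechanism run on neighboring databases. The mechanism and all parameters are invented for illustration.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)

def mechanism(count, num_samples):
    # Toy discrete mechanism: count plus two-sided geometric noise.
    return count + rng.geometric(0.5, num_samples) - rng.geometric(0.5, num_samples)

def empirical_dist(samples):
    counts = Counter(samples.tolist())
    return {o: c / len(samples) for o, c in counts.items()}

# Neighboring databases differ in one record: true counts 10 vs 11.
n = 200_000
p, q = empirical_dist(mechanism(10, n)), empirical_dist(mechanism(11, n))

eps = 0.5
# Plug-in estimate of delta(eps) = sum_o max(p(o) - e^eps * q(o), 0).
delta_hat = sum(max(pv - np.exp(eps) * q.get(o, 0.0), 0.0) for o, pv in p.items())
print(delta_hat)
```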
RNN Embeddings for Identifying Difficult to Understand Medical Words
Title | RNN Embeddings for Identifying Difficult to Understand Medical Words |
Authors | Hanna Pylieva, Artem Chernodub, Natalia Grabar, Thierry Hamon |
Abstract | Patients and their families often require a better understanding of medical information provided by doctors. We address this issue by improving the identification of difficult-to-understand medical words. We introduce novel RNN-derived embeddings, FrnnMUTE (French RNN Medical Understandability Text Embeddings), which reach up to an 87.0 F1 score in the identification of difficult words. We also find that adding pre-trained FastText word embeddings to the feature set substantially improves the performance of the model that classifies words according to their difficulty. We study the generalizability of different models through three cross-validation scenarios that allow testing classifiers in real-world conditions: understanding of medical words by new users, and classification of new unseen words by the automatic models. The FrnnMUTE embeddings and the categorization code are made available for research. |
Tasks | Word Embeddings |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5011/ |
PWC | https://paperswithcode.com/paper/rnn-embeddings-for-identifying-difficult-to |
Repo | https://github.com/hpylieva/FrnnMUTE |
Framework | none |
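A minimal PyTorch sketch of the overall shape of such a model (sizes and names are assumptions, not the released code): a character-level RNN yields the word embedding, which is concatenated with pretrained FastText features before a binary difficult/not-difficult classifier.

```python
import torch
import torch.nn as nn

class WordDifficultyModel(nn.Module):
    def __init__(self, n_chars=100, char_dim=32, rnn_dim=64, fasttext_dim=300):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.rnn = nn.GRU(char_dim, rnn_dim, batch_first=True)
        self.clf = nn.Linear(rnn_dim + fasttext_dim, 2)

    def forward(self, char_ids, fasttext_vec):
        _, h = self.rnn(self.char_emb(char_ids))   # h: (1, B, rnn_dim)
        feats = torch.cat([h.squeeze(0), fasttext_vec], dim=1)
        return self.clf(feats)

model = WordDifficultyModel()
logits = model(torch.randint(0, 100, (4, 12)),   # 4 words, 12 chars each
               torch.randn(4, 300))              # their FastText vectors
```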
Putting Evaluation in Context: Contextual Embeddings Improve Machine Translation Evaluation
Title | Putting Evaluation in Context: Contextual Embeddings Improve Machine Translation Evaluation |
Authors | Nitika Mathur, Timothy Baldwin, Trevor Cohn |
Abstract | Accurate, automatic evaluation of machine translation is critical for system tuning and for evaluating progress in the field. We propose a simple unsupervised metric, as well as additional supervised metrics, which rely on contextual word embeddings to encode the translation and reference sentences. We find that these models rival or surpass all existing metrics in the WMT 2017 sentence-level and system-level tracks, and our trained model has a substantially higher correlation with human judgements than all existing metrics on the WMT 2017 to-English sentence-level dataset. |
Tasks | Machine Translation, Word Embeddings |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1269/ |
PWC | https://paperswithcode.com/paper/putting-evaluation-in-context-contextual |
Repo | https://github.com/nitikam/mteval-in-context |
Framework | none |
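The unsupervised variant is essentially pooling plus cosine similarity. A numpy sketch, where the random arrays stand in for contextual token embeddings from any encoder; the supervised variants instead feed the pooled vectors to a regressor trained on human judgements.

```python
import numpy as np

def sentence_score(hyp_tokens, ref_tokens):
    # Mean-pool contextual token embeddings, score by cosine similarity.
    hyp, ref = hyp_tokens.mean(axis=0), ref_tokens.mean(axis=0)
    return hyp @ ref / (np.linalg.norm(hyp) * np.linalg.norm(ref))

rng = np.random.default_rng(0)
# (num_tokens, dim) stand-ins for contextual encoder outputs.
print(sentence_score(rng.standard_normal((12, 768)),
                     rng.standard_normal((15, 768))))
```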
Globally Optimal Learning for Structured Elliptical Losses
Title | Globally Optimal Learning for Structured Elliptical Losses |
Authors | Yoav Wald, Nofar Noy, Gal Elidan, Ami Wiesel |
Abstract | Heavy-tailed and contaminated data are common in various applications of machine learning. A standard technique for handling regression tasks that involve such data is to use robust losses, e.g., the popular Huber loss. In structured problems, however, where there are multiple labels and structural constraints on the labels are imposed (or learned), robust optimization is challenging, and more often than not the loss used is simply the negative log-likelihood of a Gaussian Markov random field. In this work, we analyze robust alternatives. Theoretical understanding of such problems is quite limited, with guarantees on optimization given only for special cases and non-structured settings. The core of the difficulty is the non-convexity of the objective function, implying that standard optimization algorithms may converge to sub-optimal critical points. Our analysis focuses on loss functions that arise from elliptical distributions, which appealingly include most loss functions proposed in the literature as special cases. We show that, even though these problems are non-convex, they can be optimized efficiently. Concretely, we prove that in the limit of infinite training data, due to algebraic properties of the problem, all stationary points are globally optimal. Finally, we demonstrate the empirical appeal of using these losses for regression on synthetic and real-life data. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9504-globally-optimal-learning-for-structured-elliptical-losses |
PDF | http://papers.nips.cc/paper/9504-globally-optimal-learning-for-structured-elliptical-losses.pdf |
PWC | https://paperswithcode.com/paper/globally-optimal-learning-for-structured |
Repo | https://github.com/yowald/elliptical-losses |
Framework | tf |
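A small numpy sketch of the loss family in question (the specific loss, step size, and fixed precision matrix are illustrative choices, not the paper's algorithm): multi-output regression with an elliptical loss g(rᵀΛr), where g(t) = sqrt(t) gives a heavy-tail-robust alternative to the Gaussian MRF objective g(t) = t.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 500, 8, 3
X = rng.standard_normal((n, d))
W_true = rng.standard_normal((k, d))
Y = X @ W_true.T + rng.standard_t(df=2, size=(n, k))   # heavy-tailed noise

Lam = np.eye(k)                 # label precision structure (fixed here)
W = np.zeros((k, d))
for _ in range(2000):           # plain (sub)gradient descent
    R = Y - X @ W.T
    q = np.einsum('ij,jk,ik->i', R, Lam, R)     # r^T Lam r per sample
    wgt = 0.5 / np.sqrt(q + 1e-8)               # g'(q) for g(t) = sqrt(t)
    grad = -2 * Lam @ (wgt[:, None] * R).T @ X
    W -= 0.2 * grad / n

print(np.linalg.norm(W - W_true))   # small despite the contaminated noise
```

Note that g(t) = sqrt(rᵀΛr) is a norm of an affine function of W, hence convex in W, which is why plain gradient descent suffices in this toy; the paper's analysis covers the harder case where Λ is learned.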
Trivializations for Gradient-Based Optimization on Manifolds
Title | Trivializations for Gradient-Based Optimization on Manifolds |
Authors | Mario Lezcano Casado |
Abstract | We introduce a framework to study the transformation of problems with manifold constraints into unconstrained problems through parametrizations in terms of a Euclidean space. We call these parametrizations trivializations. We prove conditions under which a trivialization is sound in the context of gradient-based optimization and we show how two large families of trivializations have overall favorable properties, but also suffer from a performance issue. We then introduce dynamic trivializations, which solve this problem, and we show how these form a family of optimization methods that lie between trivializations and Riemannian gradient descent, and combine the benefits of both of them. We then show how to implement these two families of trivializations in practice for different matrix manifolds. To this end, we prove a formula for the gradient of the exponential of matrices, which can be of practical interest on its own. Finally, we show how dynamic trivializations improve the performance of existing methods on standard tasks designed to test long-term memory within neural networks. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9115-trivializations-for-gradient-based-optimization-on-manifolds |
PDF | http://papers.nips.cc/paper/9115-trivializations-for-gradient-based-optimization-on-manifolds.pdf |
PWC | https://paperswithcode.com/paper/trivializations-for-gradient-based-1 |
Repo | https://github.com/Lezcano/expRNN |
Framework | pytorch |
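The simplest instance of the idea is the orthogonal group: parametrize Q = exp(X − Xᵀ) and optimize the unconstrained X with an ordinary optimizer. A PyTorch sketch follows; the fitting task is a toy assumption, but the parametrization is the one the paper (and the expRNN repo) builds on.

```python
import torch

torch.manual_seed(0)
d = 8
target = torch.linalg.qr(torch.randn(d, d)).Q   # random orthogonal target
if torch.det(target) < 0:
    target[:, 0] = -target[:, 0]                # exp(skew) only reaches SO(d)

X = torch.zeros(d, d, requires_grad=True)       # unconstrained parameter
opt = torch.optim.SGD([X], lr=0.1)
for _ in range(500):
    Q = torch.matrix_exp(X - X.t())             # always exactly orthogonal
    loss = (Q - target).pow(2).sum()
    opt.zero_grad(); loss.backward(); opt.step()

print(loss.item())                                  # should approach 0
print(torch.dist(Q.t() @ Q, torch.eye(d)).item())   # ~0 at every step
```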
Learning to Learn By Self-Critique
Title | Learning to Learn By Self-Critique |
Authors | Antreas Antoniou, Amos J. Storkey |
Abstract | In few-shot learning, a machine learning system must learn from a small set of labelled examples of a specific task, such that it generalizes well to new unlabelled examples of the same task. Given the limited availability of labelled examples in such tasks, we wish to make use of all available information. For this reason, we propose using transductive meta-learning in few-shot settings to obtain state-of-the-art few-shot performance. Usually a model learns task-specific information from a small training set (the support-set) and subsequently produces predictions on a small unlabelled validation set (the target-set). The target-set contains additional task-specific information which is not utilized by existing few-shot learning methods. This is a challenge requiring approaches beyond current methods: at inference time, the target-set contains only input data-points, so discriminative learning cannot be used. In this paper, we propose a framework called Self-Critique and Adapt, or SCA. This approach learns a label-free loss function, parameterized as a neural network, which leverages target-set information. A base model learns on a support-set using existing methods (e.g. stochastic gradient descent combined with the cross-entropy loss) and is then updated for the incoming target-task using the newly learned loss function (i.e. the meta-learned label-free loss). This unsupervised loss function is optimized such that the learned model achieves higher generalization performance. Experiments demonstrate that SCA offers substantially higher, state-of-the-art generalization performance compared to baselines that adapt only on the support-set. |
Tasks | Few-Shot Learning, Meta-Learning |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9185-learning-to-learn-by-self-critique |
PDF | http://papers.nips.cc/paper/9185-learning-to-learn-by-self-critique.pdf |
PWC | https://paperswithcode.com/paper/learning-to-learn-by-self-critique-1 |
Repo | https://github.com/AntreasAntoniou/Learning_to_Learn_via_Self-Critique |
Framework | pytorch |
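A stripped-down PyTorch sketch of just the inner loop (model, critic, and sizes are assumptions; the outer loop that meta-trains the critic against target labels is omitted): standard supervised steps on the support set, followed by one extra label-free step driven by a critic applied to the model's own target-set predictions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(32, 5)                               # toy base model
critic = nn.Sequential(nn.Linear(5, 16), nn.ReLU(),    # label-free loss net
                       nn.Linear(16, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.1)

support_x, support_y = torch.randn(10, 32), torch.randint(0, 5, (10,))
target_x = torch.randn(15, 32)                         # no labels available

for _ in range(5):                                     # support-set steps
    loss = F.cross_entropy(model(support_x), support_y)
    opt.zero_grad(); loss.backward(); opt.step()

# Self-critique step: adapt using the critic's score of target predictions.
critic_loss = critic(F.softmax(model(target_x), dim=1)).mean()
opt.zero_grad(); critic_loss.backward(); opt.step()
```

In SCA proper, the critic is meta-learned across tasks so that this extra step actually improves target-set accuracy; here it is untrained and serves only to show where the label-free gradient comes from.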
On Relating Explanations and Adversarial Examples
Title | On Relating Explanations and Adversarial Examples |
Authors | Alexey Ignatiev, Nina Narodytska, Joao Marques-Silva |
Abstract | The importance of explanations (XPs) of machine learning (ML) model predictions and of adversarial examples (AEs) cannot be overstated, with both arguably being essential for the practical success of ML in different settings. There has been recent work on understanding and assessing the relationship between XPs and AEs. However, such work has been mostly experimental, and a sound theoretical relationship has been elusive. This paper demonstrates that explanations and adversarial examples are related by a generalized form of hitting set duality, which extends earlier work on hitting set duality observed in model-based diagnosis and knowledge compilation. Furthermore, the paper proposes algorithms that enable computing adversarial examples from explanations and vice versa. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9717-on-relating-explanations-and-adversarial-examples |
PDF | http://papers.nips.cc/paper/9717-on-relating-explanations-and-adversarial-examples.pdf |
PWC | https://paperswithcode.com/paper/on-relating-explanations-and-adversarial |
Repo | https://github.com/alexeyignatiev/xpce-duality |
Framework | none |
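The duality itself can be played with on toy sets. In the brute-force sketch below, each explanation is viewed as a set of features; the (subset-)minimal hitting sets of the explanations then correspond to the minimal adversarial examples, and vice versa. The feature sets are hypothetical.

```python
from itertools import combinations

def minimal_hitting_sets(sets):
    universe = sorted(set().union(*sets))
    # All subsets that intersect ("hit") every given set...
    hitters = [set(c) for k in range(1, len(universe) + 1)
               for c in combinations(universe, k)
               if all(set(c) & s for s in sets)]
    # ...kept only if no strictly smaller hitter is contained in them.
    return [h for h in hitters if not any(o < h for o in hitters)]

# Hypothetical feature subsets behind three explanations of a prediction:
explanations = [{1, 2}, {2, 3}, {1, 3}]
print(minimal_hitting_sets(explanations))   # [{1, 2}, {1, 3}, {2, 3}]
```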