Paper Group NANR 22
Practical Two-Step Lookahead Bayesian Optimization. Co-Occurrent Features in Semantic Segmentation. Enhancing Dialogue Symptom Diagnosis with Global Attention and Symptom Graph. Invariance and Inverse Stability under ReLU. On the Word Alignment from Neural Machine Translation. S2GAN: Share Aging Factors Across Ages and Share Aging Trends Among Individuals …
Practical Two-Step Lookahead Bayesian Optimization
Title | Practical Two-Step Lookahead Bayesian Optimization |
Authors | Jian Wu, Peter Frazier |
Abstract | Expected improvement and other acquisition functions widely used in Bayesian optimization use a “one-step” assumption: they value objective function evaluations assuming no future evaluations will be performed. Because we usually evaluate over multiple steps, this assumption may leave substantial room for improvement. Existing theory gives acquisition functions looking multiple steps in the future but calculating them requires solving a high-dimensional continuous-state continuous-action Markov decision process (MDP). Fast exact solutions of this MDP remain out of reach of today’s methods. As a result, previous two- and multi-step lookahead Bayesian optimization algorithms are either too expensive to implement in most practical settings or resort to heuristics that may fail to fully realize the promise of two-step lookahead. This paper proposes a computationally efficient algorithm that provides an accurate solution to the two-step lookahead Bayesian optimization problem in seconds to at most several minutes of computation per batch of evaluations. The resulting acquisition function provides increased query efficiency and robustness compared with previous two- and multi-step lookahead methods in both single-threaded and batch experiments. This unlocks the value of two-step lookahead in practice. We demonstrate the value of our algorithm with extensive experiments on synthetic test functions and real-world problems. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9174-practical-two-step-lookahead-bayesian-optimization |
http://papers.nips.cc/paper/9174-practical-two-step-lookahead-bayesian-optimization.pdf | |
PWC | https://paperswithcode.com/paper/practical-two-step-lookahead-bayesian |
Repo | |
Framework | |
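The abstract's core idea — valuing a query by its immediate expected improvement (EI) plus the expected value of the best follow-up query, estimated by Monte Carlo over fantasized outcomes — can be sketched with a toy Gaussian-process model. Everything concrete below (the RBF kernel, the 1-D grid, the sample count) is an illustrative assumption, not the paper's algorithm:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def rbf(a, b, ls=0.25):
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

def gp_posterior(X, y, Xs, noise=1e-6):
    # exact GP regression posterior mean/std at query points Xs
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    Kinv = np.linalg.inv(K)
    mu = Ks.T @ Kinv @ y
    var = np.diag(rbf(Xs, Xs) - Ks.T @ Kinv @ Ks)
    return mu, np.sqrt(np.clip(var, 1e-12, None))

def ei(mu, sigma, best):
    # expected improvement for minimization
    z = (best - mu) / sigma
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def two_step_value(X, y, x1, grid, n_mc=64):
    best = y.min()
    mu1, s1 = gp_posterior(X, y, np.array([x1]))
    now = ei(mu1, s1, best)[0]
    future = 0.0
    for _ in range(n_mc):
        # fantasize the outcome at x1, then value the best follow-up query
        y1 = rng.normal(mu1[0], s1[0])
        mu2, s2 = gp_posterior(np.append(X, x1), np.append(y, y1), grid)
        future += ei(mu2, s2, min(best, y1)).max()
    return now, now + future / n_mc

f = lambda x: np.sin(3 * x) + x          # toy objective to minimize on [0, 1]
X = np.array([0.1, 0.5, 0.9])
grid = np.linspace(0.0, 1.0, 51)
one_step, two_step = two_step_value(X, f(X), 0.3, grid)
```

Since EI is non-negative, the two-step value can never fall below the one-step EI; the paper's contribution is making this kind of nested computation fast and accurate.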
Co-Occurrent Features in Semantic Segmentation
Title | Co-Occurrent Features in Semantic Segmentation |
Authors | Hang Zhang, Han Zhang, Chenguang Wang, Junyuan Xie |
Abstract | Recent work has achieved great success in utilizing global contextual information for semantic segmentation, including increasing the receptive field and aggregating pyramid feature representations. In this paper, we go beyond global context and explore the fine-grained representation using co-occurrent features by introducing a Co-occurrent Feature Model, which predicts the distribution of co-occurrent features for a given target. To leverage the semantic context in the co-occurrent features, we build an Aggregated Co-occurrent Feature (ACF) Module by aggregating the probability of the co-occurrent feature with the co-occurrent context. The ACF Module learns a fine-grained spatially invariant representation to capture co-occurrent context information across the scene. Our approach significantly improves the segmentation results using FCN and achieves superior performance: 54.0% mIoU on Pascal Context, 87.2% mIoU on Pascal VOC 2012, and 44.89% mIoU on the ADE20K dataset. The source code and complete system will be publicly available upon publication. |
Tasks | Semantic Segmentation |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Zhang_Co-Occurrent_Features_in_Semantic_Segmentation_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Zhang_Co-Occurrent_Features_in_Semantic_Segmentation_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/co-occurrent-features-in-semantic |
Repo | |
Framework | |
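The ACF Module's aggregation step — weighting co-occurrent features across the scene by a predicted probability — resembles scaled dot-product attention over spatial positions. The sketch below assumes that form; the projection matrices and shapes are hypothetical stand-ins, not the paper's architecture:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def acf_aggregate(feats, Wq, Wk, Wv):
    # feats: (N, C) feature map flattened over the H*W spatial positions.
    # Each position predicts a distribution over co-occurrent features across
    # the whole scene, then aggregates them weighted by that probability.
    q, k, v = feats @ Wq, feats @ Wk, feats @ Wv
    prob = softmax(q @ k.T / np.sqrt(k.shape[1]), axis=-1)  # (N, N)
    return prob @ v, prob

rng = np.random.default_rng(0)
feats = rng.standard_normal((16, 8))          # 16 positions, 8 channels
Wq, Wk, Wv = (rng.standard_normal((8, 4)) for _ in range(3))
out, prob = acf_aggregate(feats, Wq, Wk, Wv)
```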
Enhancing Dialogue Symptom Diagnosis with Global Attention and Symptom Graph
Title | Enhancing Dialogue Symptom Diagnosis with Global Attention and Symptom Graph |
Authors | Xinzhu Lin, Xiahui He, Qin Chen, Huaixiao Tou, Zhongyu Wei, Ting Chen |
Abstract | Symptom diagnosis is a challenging yet profound problem in natural language processing. Most previous research focuses on investigating standard electronic medical records for symptom diagnosis, while the dialogues between doctors and patients, which contain richer information, are not well studied. In this paper, we first construct a dialogue symptom diagnosis dataset based on an online medical forum with a large number of dialogues between patients and doctors. Then, we provide some benchmark models on this dataset to boost research on dialogue symptom diagnosis. To further enhance the performance of symptom diagnosis over dialogues, we propose a global attention mechanism to capture more symptom-related information, and build a symptom graph to model the associations between symptoms rather than treating each symptom independently. Experimental results show that both the global attention and the symptom graph are effective in boosting dialogue symptom diagnosis. In particular, our proposed model achieves state-of-the-art performance on the constructed dataset. |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1508/ |
https://www.aclweb.org/anthology/D19-1508 | |
PWC | https://paperswithcode.com/paper/enhancing-dialogue-symptom-diagnosis-with |
Repo | |
Framework | |
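A symptom graph models associations between symptoms instead of treating each one independently; a common way to realize this is GCN-style symmetric-normalized neighbor aggregation. The sketch below is an assumption-laden stand-in, not the paper's exact formulation:

```python
import numpy as np

def graph_propagate(X, A):
    # one round of symmetric-normalized aggregation over a symptom graph:
    # X: (n, d) symptom features, A: (n, n) symmetric co-occurrence adjacency
    A_hat = A + np.eye(A.shape[0])   # self-loops keep each symptom's own signal
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return (d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]) @ X

# toy graph: symptom 0 isolated, symptoms 1 and 2 associated
A = np.array([[0., 0., 0.],
              [0., 0., 1.],
              [0., 1., 0.]])
X = np.array([[1., 2.],
              [3., 4.],
              [5., 6.]])
H = graph_propagate(X, A)
```

An isolated symptom keeps its own features unchanged, while associated symptoms blend each other's representations.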
Invariance and Inverse Stability under ReLU
Title | Invariance and Inverse Stability under ReLU |
Authors | Jens Behrmann, Sören Dittmer, Pascal Fernsel, Peter Maass |
Abstract | We flip the usual approach to studying invariance and robustness of neural networks by considering the non-uniqueness and instability of the inverse mapping. We provide theoretical and numerical results on the inverse of ReLU-layers. First, we derive a necessary and sufficient condition on the existence of invariance that provides a geometric interpretation. Next, we move to robustness via analyzing local effects on the inverse. To conclude, we show how this reverse point of view not only provides insights into key effects, but also enables viewing adversarial examples from different perspectives. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=SyxYEoA5FX |
https://openreview.net/pdf?id=SyxYEoA5FX | |
PWC | https://paperswithcode.com/paper/invariance-and-inverse-stability-under-relu |
Repo | |
Framework | |
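The non-uniqueness of the ReLU inverse mapping is easy to demonstrate numerically: whenever all pre-activations are negative, distinct inputs collapse to the same output, while on a fully active region the layer is affine and locally invertible. A minimal illustration (the specific `W`, `b`, and inputs are arbitrary choices, not taken from the paper):

```python
import numpy as np

def relu_layer(W, b, x):
    return np.maximum(W @ x + b, 0.0)

W = np.eye(2)
b = np.array([-1.0, -1.0])

# both inputs land where every pre-activation is negative, so the layer
# collapses them to the same all-zero output (non-injectivity)
x1 = np.array([0.5, 0.2])
x2 = np.array([-2.0, 0.9])
y1, y2 = relu_layer(W, b, x1), relu_layer(W, b, x2)

# on a fully active region the layer is affine and locally invertible:
x3 = np.array([2.0, 3.0])
y3 = relu_layer(W, b, x3)
x3_recovered = np.linalg.solve(W, y3 - b)  # valid only while all units stay active
```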
On the Word Alignment from Neural Machine Translation
Title | On the Word Alignment from Neural Machine Translation |
Authors | Xintong Li, Guanlin Li, Lemao Liu, Max Meng, Shuming Shi |
Abstract | Prior research suggests that neural machine translation (NMT) captures word alignment through its attention mechanism; however, this paper finds that attention may largely fail to capture word alignment for some NMT models. This paper therefore proposes two methods to induce word alignment that are general and agnostic to specific NMT models. Experiments show that both methods induce much better word alignment than attention. This paper further visualizes the translation through the word alignment induced by NMT. In particular, it analyzes the effect of alignment errors on translation errors at the word level, and its quantitative analysis over many test examples consistently demonstrates that alignment errors are likely to lead to translation errors as measured by different metrics. |
Tasks | Machine Translation, Word Alignment |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1124/ |
https://www.aclweb.org/anthology/P19-1124 | |
PWC | https://paperswithcode.com/paper/on-the-word-alignment-from-neural-machine |
Repo | |
Framework | |
S2GAN: Share Aging Factors Across Ages and Share Aging Trends Among Individuals
Title | S2GAN: Share Aging Factors Across Ages and Share Aging Trends Among Individuals |
Authors | Zhenliang He, Meina Kan, Shiguang Shan, Xilin Chen |
Abstract | Generally, we humans follow roughly common aging trends; e.g., wrinkles only tend to become more numerous, longer, or deeper. However, the aging process of each individual is more dominated by his/her personalized factors, including invariant factors such as identity and moles, as well as personalized aging patterns; e.g., one may age by graying hair while another may age by a receding hairline. Following this biological principle, in this work we propose an effective and efficient method to simulate natural aging. Specifically, a personalized aging basis is established for each individual to depict his/her own aging factors. Different ages then share this basis, being derived through age-specific transforms. The age-specific transforms represent the aging trends that are shared among all individuals. The proposed method can achieve continuous face aging with favorable aging accuracy, identity preservation, and fidelity. Furthermore, benefiting from the effective design, a single model is capable of handling all ages, and prediction time is significantly reduced. |
Tasks | |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/He_S2GAN_Share_Aging_Factors_Across_Ages_and_Share_Aging_Trends_ICCV_2019_paper.html |
http://openaccess.thecvf.com/content_ICCV_2019/papers/He_S2GAN_Share_Aging_Factors_Across_Ages_and_Share_Aging_Trends_ICCV_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/s2gan-share-aging-factors-across-ages-and |
Repo | |
Framework | |
An Inexact Augmented Lagrangian Framework for Nonconvex Optimization with Nonlinear Constraints
Title | An Inexact Augmented Lagrangian Framework for Nonconvex Optimization with Nonlinear Constraints |
Authors | Mehmet Fatih Sahin, Armin Eftekhari, Ahmet Alacaoglu, Fabian Latorre, Volkan Cevher |
Abstract | We propose a practical inexact augmented Lagrangian method (iALM) for nonconvex problems with nonlinear constraints. We characterize the total computational complexity of our method subject to a verifiable geometric condition, which is closely related to the Polyak-Lojasiewicz and Mangasarian-Fromowitz conditions. In particular, when a first-order solver is used for the inner iterates, we prove that iALM finds a first-order stationary point with $\tilde{\mathcal{O}}(1/\epsilon^3)$ calls to the first-order oracle. If, in addition, the problem is smooth and a second-order solver is used for the inner iterates, iALM finds a second-order stationary point with $\tilde{\mathcal{O}}(1/\epsilon^5)$ calls to the second-order oracle. These complexity results match the known theoretical results in the literature. We also provide strong numerical evidence on large-scale machine learning problems, including the Burer-Monteiro factorization of semidefinite programs, and a novel nonconvex relaxation of the standard basis pursuit template. For these examples, we also show how to verify our geometric condition. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9545-an-inexact-augmented-lagrangian-framework-for-nonconvex-optimization-with-nonlinear-constraints |
http://papers.nips.cc/paper/9545-an-inexact-augmented-lagrangian-framework-for-nonconvex-optimization-with-nonlinear-constraints.pdf | |
PWC | https://paperswithcode.com/paper/an-inexact-augmented-lagrangian-framework-for |
Repo | |
Framework | |
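The augmented Lagrangian loop with inexact first-order inner solves can be sketched on a toy equality-constrained problem. The step sizes, iteration counts, penalty parameter, and the convex toy objective are illustrative choices, not the paper's nonconvex setting:

```python
import numpy as np

def ialm(f_grad, c, c_grad, x0, beta=10.0, outer=30, inner=200, lr=0.01):
    """Inexact augmented Lagrangian: minimize f(x) s.t. c(x) = 0.
    Inner iterates use plain gradient descent (the 'inexact' first-order solver)."""
    x, lam = x0.astype(float), 0.0
    for _ in range(outer):
        for _ in range(inner):
            # gradient of L(x) = f(x) + lam*c(x) + (beta/2)*c(x)^2
            g = f_grad(x) + (lam + beta * c(x)) * c_grad(x)
            x = x - lr * g
        lam = lam + beta * c(x)  # dual ascent on the multiplier
    return x

# toy problem: min x1^2 + x2^2  s.t.  x1 + x2 - 1 = 0  (solution: (0.5, 0.5))
f_grad = lambda x: 2 * x
c = lambda x: x[0] + x[1] - 1.0
c_grad = lambda x: np.array([1.0, 1.0])
x_star = ialm(f_grad, c, c_grad, np.zeros(2))
```

The outer loop drives the constraint violation to zero while the multiplier converges to its KKT value (here, -1).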
Modeling Personalization in Continuous Space for Response Generation via Augmented Wasserstein Autoencoders
Title | Modeling Personalization in Continuous Space for Response Generation via Augmented Wasserstein Autoencoders |
Authors | Zhangming Chan, Juntao Li, Xiaopeng Yang, Xiuying Chen, Wenpeng Hu, Dongyan Zhao, Rui Yan |
Abstract | Variational autoencoders (VAEs) and Wasserstein autoencoders (WAEs) have achieved noticeable progress in open-domain response generation. By introducing latent variables in a continuous space, these models are capable of capturing utterance-level semantics, e.g., topic and syntactic properties, and thus can generate informative and diversified responses. In this work, we improve the WAE for response generation. In addition to utterance-level information, we also model user-level information in the continuous latent space. Specifically, we embed user-level and utterance-level information into two multimodal distributions, and combine these two multimodal distributions into a mixed distribution. This mixed distribution is used as the prior distribution of the WAE in our proposed model, named PersonaWAE. Experimental results on a large-scale real-world dataset confirm the superiority of our model for generating informative and personalized responses, where it outperforms state-of-the-art models in both automatic and human evaluations. |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1201/ |
https://www.aclweb.org/anthology/D19-1201 | |
PWC | https://paperswithcode.com/paper/modeling-personalization-in-continuous-space |
Repo | |
Framework | |
Using Attention-based Bidirectional LSTM to Identify Different Categories of Offensive Language Directed Toward Female Celebrities
Title | Using Attention-based Bidirectional LSTM to Identify Different Categories of Offensive Language Directed Toward Female Celebrities |
Authors | Sima Sharifirad, Stan Matwin |
Abstract | Social media posts reflect the emotions, intentions and mental state of the users. Twitter users who harass famous female figures may do so with different intentions and intensities. Recent studies have published datasets focusing on different types of online harassment, vulgar language, and emotional intensities. We train, validate, and test our proposed model, an attention-based bidirectional neural network, on three datasets: "online harassment", "vulgar language", and "valance", and achieve state-of-the-art performance on two of the datasets. We report the F1 score for each dataset separately along with the final precision, recall, and macro-averaged F1 score. In addition, we identify ten female figures from different professions and racial backgrounds who have experienced harassment on Twitter. We tested the trained models on ten collected corpora, each related to one famous female figure, to predict the type of harassing language, the type of vulgar language, and the degree of intensity of language occurring on their social platforms. Interestingly, the results show different patterns of linguistic use targeting different racial backgrounds and occupations. The contribution of this study is two-fold. From the technical perspective, our proposed methodology is shown to be effective, improving on the previous state-of-the-art results by a good margin on one of the two available datasets. From the social perspective, we introduce a methodology which can unlock facts about the nature of offensive language targeting women on online social platforms. The collected dataset will be shared publicly for further investigation. |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/papers/W/W19/W19-3616/ |
https://www.aclweb.org/anthology/W19-3616 | |
PWC | https://paperswithcode.com/paper/using-attention-based-bidirectional-lstm-to |
Repo | |
Framework | |
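Attention-based pooling over per-token hidden states — the mechanism this model uses on top of a bidirectional LSTM — can be sketched as follows. The learned attention query `w` and the random "hidden states" stand in for real BiLSTM outputs:

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def attention_pool(H, w):
    """H: (T, d) per-token hidden states (e.g. from a BiLSTM); w: (d,) query.
    Returns the attention-weighted sentence vector and the weights."""
    alpha = softmax(H @ w)   # one weight per token, summing to 1
    return alpha @ H, alpha

rng = np.random.default_rng(0)
H = rng.standard_normal((5, 8))          # 5 tokens, 8-dim hidden states
sent, alpha = attention_pool(H, rng.standard_normal(8))
```

The pooled vector `sent` would then feed a classifier head that predicts the harassment category.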
VSP at PharmaCoNER 2019: Recognition of Pharmacological Substances, Compounds and Proteins with Recurrent Neural Networks in Spanish Clinical Cases
Title | VSP at PharmaCoNER 2019: Recognition of Pharmacological Substances, Compounds and Proteins with Recurrent Neural Networks in Spanish Clinical Cases |
Authors | Víctor Suárez-Paniagua |
Abstract | This paper presents the participation of the VSP team in the PharmaCoNER Tracks of the BioNLP Open Shared Task 2019. The system consists of a neural model for Named Entity Recognition of drugs, medications and chemical entities in Spanish, and uses the Spanish Edition of the SNOMED CT term search engine for concept normalization of the recognized mentions. The neural network is implemented with two bidirectional Recurrent Neural Networks with LSTM cells that create a feature vector for each word of a sentence in order to classify the entities. The first layer uses the characters of each word, and the resulting vector is aggregated in the second layer together with its word embedding in order to create the feature vector of the word. Finally, a Conditional Random Field layer classifies the vector representation of each word into one of the mention types. The system obtains F1 scores of 76.29% and 60.34% on the Named Entity Recognition task and the Concept Indexing task, respectively. This method achieves good results with a basic approach, without using pretrained word embeddings or any hand-crafted features. |
Tasks | Named Entity Recognition, Word Embeddings |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5703/ |
https://www.aclweb.org/anthology/D19-5703 | |
PWC | https://paperswithcode.com/paper/vsp-at-pharmaconer-2019-recognition-of |
Repo | |
Framework | |
Learning Representations of Categorical Feature Combinations via Self-Attention
Title | Learning Representations of Categorical Feature Combinations via Self-Attention |
Authors | Chen Xu, Chengzhen Fu, Peng Jiang, Wenwu Ou |
Abstract | Self-attention has been widely used to model sequential data and has achieved remarkable results in many applications. Although it can model dependencies without regard to positions in sequences, self-attention is seldom applied to non-sequential data. In this work, we propose to learn representations of multi-field categorical data in prediction tasks via a self-attention mechanism, where features are orderless but have intrinsic relations over different fields. In most current DNN-based models, feature embeddings are simply concatenated for further processing by networks. Instead, by applying self-attention to transform the embeddings, we are able to relate features in different fields and automatically learn representations of their combinations, which are known as the factors of many prevailing linear models. To further improve the effect of feature-combination mining, we modify the original self-attention structure by restricting each similarity weight vector to have at most k non-zero values, which additionally regularizes the model. We experimentally evaluate the effectiveness of our self-attention model on non-sequential data. Across two click-through rate prediction benchmark datasets, i.e., Criteo and Avazu, our model with top-k restricted self-attention achieves state-of-the-art performance. Compared with the vanilla MLP, the gain from adding self-attention is significantly larger than that from modifying the network structures, which most current works focus on. |
Tasks | Click-Through Rate Prediction |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=SyxwW2A5Km |
https://openreview.net/pdf?id=SyxwW2A5Km | |
PWC | https://paperswithcode.com/paper/learning-representations-of-categorical |
Repo | |
Framework | |
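The top-k restriction keeps only the k largest similarity scores per row before normalizing, zeroing out the rest. A minimal numpy sketch under that reading (the field embeddings, sizes, and scaled dot-product scoring are assumptions, not the paper's exact parameterization):

```python
import numpy as np

def topk_self_attention(E, k):
    """E: (F, d) embeddings of F categorical fields.
    Scaled dot-product self-attention where each row of the similarity
    matrix keeps only its k largest entries before the softmax."""
    d = E.shape[1]
    S = E @ E.T / np.sqrt(d)
    # mask everything below each row's k-th largest score
    thresh = np.sort(S, axis=1)[:, -k][:, None]
    S = np.where(S >= thresh, S, -np.inf)
    S = S - S.max(axis=1, keepdims=True)
    W = np.exp(S)                       # exp(-inf) -> 0 for masked entries
    W = W / W.sum(axis=1, keepdims=True)
    return W @ E, W

rng = np.random.default_rng(0)
out, W = topk_self_attention(rng.standard_normal((6, 4)), k=2)
```

Each output row is a convex combination of at most k field embeddings, which is the sparsity-as-regularization effect the abstract describes.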
Robust Estimation via Generative Adversarial Networks
Title | Robust Estimation via Generative Adversarial Networks |
Authors | Chao Gao, Jiyi Liu, Yuan Yao, Weizhi Zhu |
Abstract | Robust estimation under Huber’s $\epsilon$-contamination model has become an important topic in statistics and theoretical computer science. Rate-optimal procedures such as Tukey’s median and other estimators based on statistical depth functions are impractical because of their computational intractability. In this paper, we establish an intriguing connection between f-GANs and various depth functions through the lens of f-Learning. Similar to the derivation of f-GAN, we show that these depth functions that lead to rate-optimal robust estimators can all be viewed as variational lower bounds of the total variation distance in the framework of f-Learning. This connection opens the door to computing robust estimators using tools developed for training GANs. In particular, we show that a JS-GAN that uses a neural network discriminator with at least one hidden layer is able to achieve the minimax rate of robust mean estimation under Huber’s $\epsilon$-contamination model. Interestingly, the hidden layers of the neural network structure in the discriminator class are shown to be necessary for robust estimation. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=BJgRDjR9tQ |
https://openreview.net/pdf?id=BJgRDjR9tQ | |
PWC | https://paperswithcode.com/paper/robust-estimation-via-generative-adversarial |
Repo | |
Framework | |
An Empirical Study of Span Representations in Argumentation Structure Parsing
Title | An Empirical Study of Span Representations in Argumentation Structure Parsing |
Authors | Tatsuki Kuribayashi, Hiroki Ouchi, Naoya Inoue, Paul Reisert, Toshinori Miyoshi, Jun Suzuki, Kentaro Inui |
Abstract | For several natural language processing (NLP) tasks, span representation design is attracting considerable attention as a promising new technique, and a common basis for an effective design has been established. With such a basis, exploring task-dependent extensions for argumentation structure parsing (ASP) becomes an interesting research direction. This study investigates (i) span representations originally developed for other NLP tasks and (ii) a simple task-dependent extension for ASP. Our extensive experiments and analysis show that these representations yield high performance for ASP and reveal some types of instances that remain challenging to parse. |
Tasks | |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1464/ |
https://www.aclweb.org/anthology/P19-1464 | |
PWC | https://paperswithcode.com/paper/an-empirical-study-of-span-representations-in |
Repo | |
Framework | |
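A common span-representation basis in this line of work builds a span's vector from its endpoint token vectors plus simple interaction features. The sketch below assumes one such design (concatenating the endpoints with their difference and element-wise product); the exact features this paper evaluates may differ:

```python
import numpy as np

def span_representation(H, i, j):
    """H: (T, d) contextual token vectors for a T-token text.
    Encode the span [i, j] from its endpoint vectors and their interactions."""
    hi, hj = H[i], H[j]
    return np.concatenate([hi, hj, hi - hj, hi * hj])  # (4*d,)

rng = np.random.default_rng(0)
H = rng.standard_normal((10, 6))
rep = span_representation(H, 2, 5)
```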
Content-Aware Multi-Level Guidance for Interactive Instance Segmentation
Title | Content-Aware Multi-Level Guidance for Interactive Instance Segmentation |
Authors | Soumajit Majumder, Angela Yao |
Abstract | In interactive instance segmentation, users give feedback to iteratively refine segmentation masks. The user-provided clicks are transformed into guidance maps which provide the network with necessary cues on the whereabouts of the object of interest. Guidance maps used in current systems are purely distance-based and are either too localized or non-informative. We propose a novel transformation of user clicks to generate content-aware guidance maps that leverage the hierarchical structural information present in an image. Using our guidance maps, even the most basic FCNs are able to outperform existing approaches that require state-of-the-art segmentation networks pre-trained on large scale segmentation datasets. We demonstrate the effectiveness of our proposed transformation strategy through comprehensive experimentation in which we significantly raise state-of-the-art on four standard interactive segmentation benchmarks. |
Tasks | Instance Segmentation, Interactive Segmentation, Semantic Segmentation |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Majumder_Content-Aware_Multi-Level_Guidance_for_Interactive_Instance_Segmentation_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Majumder_Content-Aware_Multi-Level_Guidance_for_Interactive_Instance_Segmentation_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/content-aware-multi-level-guidance-for |
Repo | |
Framework | |
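The "purely distance-based" guidance maps the paper improves on can be sketched directly: each pixel stores its truncated Euclidean distance to the nearest user click. The truncation value 255 is a common convention assumed here, not necessarily the paper's:

```python
import numpy as np

def distance_guidance_map(shape, clicks, truncate=255.0):
    """Euclidean-distance guidance map from user clicks.
    shape: (H, W); clicks: list of (row, col) positions."""
    rows, cols = np.meshgrid(np.arange(shape[0]), np.arange(shape[1]),
                             indexing="ij")
    maps = [np.sqrt((rows - r) ** 2 + (cols - c) ** 2) for r, c in clicks]
    # distance to the NEAREST click, clipped so far-away pixels saturate
    return np.minimum(np.min(maps, axis=0), truncate)

g = distance_guidance_map((8, 8), [(2, 3), (6, 6)])
```

This map is stacked with the RGB channels as network input; the paper's content-aware maps replace the raw distance with cues from the image's hierarchical structure.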
Incorporating Contextual and Syntactic Structures Improves Semantic Similarity Modeling
Title | Incorporating Contextual and Syntactic Structures Improves Semantic Similarity Modeling |
Authors | Linqing Liu, Wei Yang, Jinfeng Rao, Raphael Tang, Jimmy Lin |
Abstract | Semantic similarity modeling is central to many NLP problems such as natural language inference and question answering. Syntactic structures interact closely with semantics in learning compositional representations and alleviating long-range dependency issues. However, such structure priors have not been well exploited in previous work for semantic modeling. To examine their effectiveness, we start with the Pairwise Word Interaction Model, one of the best models according to a recent reproducibility study, then introduce components for modeling context and structure using multi-layer BiLSTMs and TreeLSTMs. In addition, we introduce residual connections to the deep convolutional neural network component of the model. Extensive evaluations on eight benchmark datasets show that incorporating structural information contributes to consistent improvements over strong baselines. |
Tasks | Natural Language Inference, Question Answering, Semantic Similarity, Semantic Textual Similarity |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1114/ |
https://www.aclweb.org/anthology/D19-1114 | |
PWC | https://paperswithcode.com/paper/incorporating-contextual-and-syntactic |
Repo | |
Framework | |