Paper Group ANR 8
On Deep Set Learning and the Choice of Aggregations
Title | On Deep Set Learning and the Choice of Aggregations |
Authors | Maximilian Soelch, Adnan Akhundov, Patrick van der Smagt, Justin Bayer |
Abstract | Recently, it has been shown that many functions on sets can be represented by sum decompositions. These decompositions easily lend themselves to neural approximations, extending the applicability of neural nets to set-valued inputs (Deep Set learning). This work investigates a core component of Deep Set architecture: aggregation functions. We suggest and examine alternatives to commonly used aggregation functions, including learnable recurrent aggregation functions. Empirically, we show that the Deep Set networks are highly sensitive to the choice of aggregation functions: beyond improved performance, we find that learnable aggregations lower hyper-parameter sensitivity and generalize better to out-of-distribution input size. |
Tasks | |
Published | 2019-03-18 |
URL | http://arxiv.org/abs/1903.07348v1 |
http://arxiv.org/pdf/1903.07348v1.pdf | |
PWC | https://paperswithcode.com/paper/on-deep-set-learning-and-the-choice-of |
Repo | |
Framework | |
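As a rough illustration of the sum decomposition mentioned in the abstract above, the sketch below computes rho(aggregate(phi(x) for x in X)) with interchangeable fixed aggregations. The functions `phi`, `rho`, `aggregate`, and `deep_set` are hypothetical stand-ins chosen for this example; they are not the paper's networks, and the learnable recurrent aggregations it studies are not shown.

```python
def phi(x):
    # Hypothetical per-element embedding: maps a scalar to a 2-d feature.
    return (x, x * x)


def rho(z):
    # Hypothetical readout applied to the pooled feature vector.
    return z[0] + 0.5 * z[1]


def aggregate(features, kind="sum"):
    # Pool a list of equal-length feature tuples coordinate-wise.
    columns = list(zip(*features))
    if kind == "sum":
        return tuple(sum(c) for c in columns)
    if kind == "mean":
        return tuple(sum(c) / len(c) for c in columns)
    if kind == "max":
        return tuple(max(c) for c in columns)
    raise ValueError(f"unknown aggregation: {kind}")


def deep_set(xs, kind="sum"):
    # Set function: the result does not depend on the order of xs.
    return rho(aggregate([phi(x) for x in xs], kind))


if __name__ == "__main__":
    for kind in ("sum", "mean", "max"):
        print(kind, deep_set([1, 2, 3], kind), deep_set([3, 1, 2], kind))
```

Every aggregation here is permutation invariant, but they react differently to set size: duplicating the input changes the sum-pooled output while leaving the mean-pooled and max-pooled outputs unchanged, which is one way an aggregation choice can matter for inputs of unseen size.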
Selective metamorphosis for growth modelling with applications to landmarks
Title | Selective metamorphosis for growth modelling with applications to landmarks |
Authors | Andreas Bock, Alexis Arnaudon, Colin Cotter |
Abstract | We present a framework for shape matching in computational anatomy allowing users control of the degree to which the matching is diffeomorphic. This control is given as a function defined over the image and parameterises the template deformation. By modelling localised template deformation we have a mathematical description of growth only in specified parts of an image. The location can either be specified from prior knowledge of the growth location or learned from data. For simplicity, we consider landmark matching and infer the distribution of a finite dimensional parameterisation of the control via Markov chain Monte Carlo. Preliminary numerical results are shown and future paths of investigation are laid out. Well-posedness of this new problem is studied together with an analysis of the associated geodesic equations. |
Tasks | |
Published | 2019-01-08 |
URL | https://arxiv.org/abs/1901.02826v2 |
https://arxiv.org/pdf/1901.02826v2.pdf | |
PWC | https://paperswithcode.com/paper/selective-metamorphosis-for-growth-modelling |
Repo | |
Framework | |
SocialNLP EmotionX 2019 Challenge Overview: Predicting Emotions in Spoken Dialogues and Chats
Title | SocialNLP EmotionX 2019 Challenge Overview: Predicting Emotions in Spoken Dialogues and Chats |
Authors | Boaz Shmueli, Lun-Wei Ku |
Abstract | We present an overview of the EmotionX 2019 Challenge, held at the 7th International Workshop on Natural Language Processing for Social Media (SocialNLP), in conjunction with IJCAI 2019. The challenge entailed predicting emotions in spoken and chat-based dialogues using augmented EmotionLines datasets. EmotionLines contains two distinct datasets: the first includes excerpts from the episode scripts of a US-based TV sitcom (Friends) and the second contains online chats (EmotionPush). A total of thirty-six teams registered to participate in the challenge. Eleven of the teams successfully submitted their predictions for performance evaluation. The top-scoring team achieved a micro-F1 score of 81.5% for the spoken-based dialogues (Friends) and 79.5% for the chat-based dialogues (EmotionPush). |
Tasks | |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.07734v2 |
https://arxiv.org/pdf/1909.07734v2.pdf | |
PWC | https://paperswithcode.com/paper/socialnlp-emotionx-2019-challenge-overview |
Repo | |
Framework | |
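The micro-F1 metric quoted in the abstract above pools true positives, false positives, and false negatives over all labels before computing F1. The following is a minimal sketch of that computation; the label names and the helper `micro_f1` are illustrative assumptions, and the challenge's exact label set and scoring script are not given in this listing.

```python
def micro_f1(y_true, y_pred, labels):
    # Pool counts over every label, then compute a single F1 score.
    tp = fp = fn = 0
    for label in labels:
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1
            elif pred == label:
                fp += 1
            elif truth == label:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical utterance labels, for illustration only.
    gold = ["joy", "neutral", "sadness", "joy"]
    pred = ["joy", "joy", "sadness", "neutral"]
    print(micro_f1(gold, pred, labels=["joy", "neutral", "sadness"]))
```

When every utterance has exactly one label and all labels are included, each error counts once as a false positive and once as a false negative, so micro-F1 coincides with plain accuracy.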
Learning Hierarchical Priors in VAEs
Title | Learning Hierarchical Priors in VAEs |
Authors | Alexej Klushyn, Nutan Chen, Richard Kurle, Botond Cseke, Patrick van der Smagt |
Abstract | We propose to learn a hierarchical prior in the context of variational autoencoders to avoid the over-regularisation resulting from a standard normal prior distribution. To incentivise an informative latent representation of the data, we formulate the learning problem as a constrained optimisation problem by extending the Taming VAEs framework to two-level hierarchical models. We introduce a graph-based interpolation method, which shows that the topology of the learned latent representation corresponds to the topology of the data manifold—and present several examples, where desired properties of latent representation such as smoothness and simple explanatory factors are learned by the prior. |
Tasks | Motion Capture |
Published | 2019-05-13 |
URL | https://arxiv.org/abs/1905.04982v5 |
https://arxiv.org/pdf/1905.04982v5.pdf | |
PWC | https://paperswithcode.com/paper/learning-hierarchical-priors-in-vaes |
Repo | |
Framework | |
Missing Movie Synergistic Completion across Multiple Isomeric Online Movie Knowledge Libraries
Title | Missing Movie Synergistic Completion across Multiple Isomeric Online Movie Knowledge Libraries |
Authors | Bowen Dong, Jiawei Zhang, Chenwei Zhang, Yang Yang, Philip S. Yu |
Abstract | Online knowledge libraries refer to the online data warehouses that systematically organize and categorize the knowledge-based information about different kinds of concepts and entities. In the era of big data, the setup of online knowledge libraries is an extremely challenging and laborious task, in terms of the effort, time and expense required in the completion of knowledge entities. Especially nowadays, a large number of new knowledge entities, like movies, are being produced and released at a continuously accelerating pace, which renders the knowledge library setup and completion problem more difficult to resolve manually. In this paper, we take online movie knowledge libraries as an example and study the “Multiple aligned ISomeric Online Knowledge LIbraries Completion” (Miso-Klic) problem across multiple online knowledge libraries. Miso-Klic aims at identifying the missing entities for multiple knowledge libraries synergistically and ranking them for editing based on certain ranking criteria. To solve the problem, a thorough investigation of two isomeric online knowledge libraries, Douban and IMDB, has been carried out in this paper. Based on the analysis results, a novel deep online knowledge library completion framework “Integrated Deep alignEd Auto-encoder” (IDEA) is introduced to solve the problem. By projecting the entities from multiple isomeric knowledge libraries to a shared feature space, IDEA solves the Miso-Klic problem via three steps: (1) entity feature space unification via embedding, (2) knowledge library fusion based missing entity identification, and (3) missing entity ranking. Extensive experiments done on the real-world online knowledge library dataset have demonstrated the effectiveness of IDEA in addressing the problem. |
Tasks | |
Published | 2019-05-15 |
URL | https://arxiv.org/abs/1905.06365v2 |
https://arxiv.org/pdf/1905.06365v2.pdf | |
PWC | https://paperswithcode.com/paper/missing-movie-synergistic-completion-across |
Repo | |
Framework | |
MALA: Cross-Domain Dialogue Generation with Action Learning
Title | MALA: Cross-Domain Dialogue Generation with Action Learning |
Authors | Xinting Huang, Jianzhong Qi, Yu Sun, Rui Zhang |
Abstract | Response generation for task-oriented dialogues involves two basic components: dialogue planning and surface realization. These two components, however, have a discrepancy in their objectives, i.e., task completion and language quality. To deal with such discrepancy, conditioned response generation has been introduced where the generation process is factorized into action decision and language generation via explicit action representations. To obtain action representations, recent studies learn latent actions in an unsupervised manner based on the utterance lexical similarity. Such an action learning approach is prone to diversities of language surfaces, which may impinge on task completion and language quality. To address this issue, we propose multi-stage adaptive latent action learning (MALA) that learns semantic latent actions by distinguishing the effects of utterances on dialogue progress. We model the utterance effect using the transition of dialogue states caused by the utterance and develop a semantic similarity measurement that estimates whether utterances have similar effects. For learning semantic actions on domains without dialogue states, MALA extends the semantic similarity measurement across domains progressively, i.e., from aligning shared actions to learning domain-specific actions. Experiments using multi-domain datasets, SMD and MultiWOZ, show that our proposed model achieves consistent improvements over the baseline models in terms of both task completion and language quality. |
Tasks | Dialogue Generation, Semantic Similarity, Semantic Textual Similarity, Text Generation |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.08442v1 |
https://arxiv.org/pdf/1912.08442v1.pdf | |
PWC | https://paperswithcode.com/paper/mala-cross-domain-dialogue-generation-with |
Repo | |
Framework | |
EmpGAN: Multi-resolution Interactive Empathetic Dialogue Generation
Title | EmpGAN: Multi-resolution Interactive Empathetic Dialogue Generation |
Authors | Qintong Li, Hongshen Chen, Zhaochun Ren, Zhumin Chen, Zhaopeng Tu, Jun Ma |
Abstract | Conventional emotional dialogue systems focus on generating emotion-rich replies. Studies on emotional intelligence suggest that constructing a more empathetic dialogue system, which is sensitive to the users’ expressed emotion, is a crucial step towards a more humanized human-machine conversation. However, obstacles to establishing such an empathetic conversational system are still far beyond current progress: 1) Simply considering the sentence-level emotions while neglecting the more precise token-level emotions may lead to insufficient emotion perceptivity. 2) Merely relying on the dialogue history but overlooking the potential of user feedback for the generated responses further aggravates the deficiencies in emotion perceptivity. To address the above challenges, we propose EmpGAN, a multi-resolution adversarial empathetic dialogue generation model to generate more appropriate and empathetic responses. To capture the nuances of user feelings sufficiently, EmpGAN generates responses by jointly taking both the coarse-grained sentence-level and fine-grained token-level emotions into account. Moreover, an interactive adversarial learning framework is introduced to further identify whether the generated responses evoke emotion perceptivity in dialogues regarding both the dialogue history and user feedback. Experiments show that our model outperforms the state-of-the-art baseline by a significant margin in terms of both content quality and emotion perceptivity. In particular, the distinctiveness on the DailyDialog dataset is increased up to 129%. |
Tasks | Dialogue Generation |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.08698v1 |
https://arxiv.org/pdf/1911.08698v1.pdf | |
PWC | https://paperswithcode.com/paper/empgan-multi-resolution-interactive |
Repo | |
Framework | |
Queens are Powerful too: Mitigating Gender Bias in Dialogue Generation
Title | Queens are Powerful too: Mitigating Gender Bias in Dialogue Generation |
Authors | Emily Dinan, Angela Fan, Adina Williams, Jack Urbanek, Douwe Kiela, Jason Weston |
Abstract | Models often easily learn biases present in the training data, and their predictions directly reflect this bias. We analyze the presence of gender bias in dialogue and examine the subsequent effect on generative chitchat dialogue models. Based on this analysis, we propose a combination of three techniques to mitigate bias: counterfactual data augmentation, targeted data collection, and conditional training. We focus on the multi-player text-based fantasy adventure dataset LIGHT as a testbed for our work. LIGHT contains gender imbalance between male and female characters with around 1.6 times as many male characters, likely because it is entirely collected by crowdworkers and reflects common biases that exist in fantasy or medieval settings. We show that (i) our proposed techniques mitigate gender bias by balancing the genderedness of generated dialogue utterances; and (ii) they work particularly well in combination. Further, we show through various metrics—such as quantity of gendered words, a dialogue safety classifier, and human evaluation—that our models generate less gendered, but still engaging chitchat responses. |
Tasks | Data Augmentation, Dialogue Generation |
Published | 2019-11-10 |
URL | https://arxiv.org/abs/1911.03842v1 |
https://arxiv.org/pdf/1911.03842v1.pdf | |
PWC | https://paperswithcode.com/paper/queens-are-powerful-too-mitigating-gender |
Repo | |
Framework | |
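Counterfactual data augmentation, the first of the three techniques named in the abstract above, can be pictured as adding a gender-swapped copy of each training utterance. The sketch below is only a toy version under stated assumptions: the word list `SWAPS` and the helper `counterfactual` are hypothetical, cover only unambiguous word pairs, and are not the paper's implementation.

```python
# Hypothetical swap list; a real system would need a much larger,
# carefully curated dictionary and handling of ambiguous words.
SWAPS = {
    "king": "queen", "queen": "king",
    "he": "she", "she": "he",
    "man": "woman", "woman": "man",
}


def counterfactual(utterance):
    # Swap gendered words token by token, keeping punctuation and casing.
    out = []
    for token in utterance.split():
        core = token.strip(".,!?")
        swapped = SWAPS.get(core.lower())
        if swapped is None:
            out.append(token)
            continue
        if core[0].isupper():
            swapped = swapped.capitalize()
        out.append(token.replace(core, swapped, 1))
    return " ".join(out)


if __name__ == "__main__":
    original = "The king said he would fight."
    # The augmented dataset would contain both versions.
    print(original)
    print(counterfactual(original))
```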
Statistical and Machine Learning-based Decision Techniques for Physical Layer Authentication
Title | Statistical and Machine Learning-based Decision Techniques for Physical Layer Authentication |
Authors | Linda Senigagliesi, Marco Baldi, Ennio Gambi |
Abstract | In this paper we assess the security performance of key-less physical layer authentication schemes in the case of time-varying fading channels, considering both partial and no channel state information (CSI) on the receiver’s side. We first present a generalization of a well-known protocol previously proposed for flat fading channels and we study different statistical decision methods and the corresponding optimal attack strategies in order to improve the authentication performance in the considered scenario. We then consider the application of machine learning techniques in the same setting, exploiting different one-class nearest neighbor (OCNN) classification algorithms. We observe that, under the same probability of false alarm, one-class classification (OCC) algorithms achieve the lowest probability of missed detection when a low spatial correlation exists between the main channel and the adversary one, while statistical methods are advantageous when the spatial correlation between the two channels is higher. |
Tasks | |
Published | 2019-09-16 |
URL | https://arxiv.org/abs/1909.07969v1 |
https://arxiv.org/pdf/1909.07969v1.pdf | |
PWC | https://paperswithcode.com/paper/statistical-and-machine-learning-based |
Repo | |
Framework | |
Mixing Time Estimation in Ergodic Markov Chains from a Single Trajectory with Contraction Methods
Title | Mixing Time Estimation in Ergodic Markov Chains from a Single Trajectory with Contraction Methods |
Authors | Geoffrey Wolfer |
Abstract | The mixing time $t_{\mathsf{mix}}$ of an ergodic Markov chain measures the rate of convergence towards its stationary distribution $\boldsymbol{\pi}$. We consider the problem of estimating $t_{\mathsf{mix}}$ from one single trajectory of $m$ observations $(X_1, \ldots, X_m)$, in the case where the transition kernel $\boldsymbol{M}$ is unknown, a research program started by Hsu et al. [2015]. The community has so far focused primarily on leveraging spectral methods to estimate the relaxation time $t_{\mathsf{rel}}$ of a reversible Markov chain as a proxy for $t_{\mathsf{mix}}$. Although these techniques have recently been extended to tackle non-reversible chains, this general setting remains much less understood. Our new approach based on contraction methods is the first that aims at directly estimating $t_{\mathsf{mix}}$ up to multiplicative small universal constants instead of $t_{\mathsf{rel}}$. It does so by introducing a generalized version of Dobrushin’s contraction coefficient $\kappa_{\mathsf{gen}}$, which is shown to control the mixing time regardless of reversibility. We subsequently design fully data-dependent high confidence intervals around $\kappa_{\mathsf{gen}}$ that generally yield better convergence guarantees and are more practical than state-of-the-art. |
Tasks | |
Published | 2019-12-14 |
URL | https://arxiv.org/abs/1912.06845v1 |
https://arxiv.org/pdf/1912.06845v1.pdf | |
PWC | https://paperswithcode.com/paper/mixing-time-estimation-in-ergodic-markov |
Repo | |
Framework | |
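For orientation, the classical Dobrushin contraction coefficient of a known transition matrix is the largest total variation distance between any two of its rows. The sketch below computes that classical quantity only; it is not the generalized coefficient $\kappa_{\mathsf{gen}}$ introduced in the paper, it assumes the matrix is given rather than estimated from a trajectory, and the function name is an illustrative choice.

```python
def dobrushin_coefficient(M):
    # M is a row-stochastic matrix given as a list of rows.
    # Returns max over row pairs of the total variation distance.
    n = len(M)
    worst = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            tv = 0.5 * sum(abs(a - b) for a, b in zip(M[i], M[j]))
            worst = max(worst, tv)
    return worst


if __name__ == "__main__":
    # Hypothetical two-state chain used purely as an example.
    M = [[0.9, 0.1],
         [0.2, 0.8]]
    print(dobrushin_coefficient(M))
```

A coefficient strictly below 1 means one step of the chain shrinks the total variation distance between any two starting distributions by at least that factor, which is the link between contraction and mixing.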
Paraphrasing Verbs for Noun Compound Interpretation
Title | Paraphrasing Verbs for Noun Compound Interpretation |
Authors | Preslav Nakov |
Abstract | An important challenge for the automatic analysis of English written text is the abundance of noun compounds: sequences of nouns acting as a single noun. In our view, their semantics is best characterized by the set of all possible paraphrasing verbs, with associated weights, e.g., malaria mosquito is carry (23), spread (16), cause (12), transmit (9), etc. Using Amazon’s Mechanical Turk, we collect paraphrasing verbs for 250 noun-noun compounds previously proposed in the linguistic literature, thus creating a valuable resource for noun compound interpretation. Using these verbs, we further construct a dataset of pairs of sentences representing a special kind of textual entailment task, where a binary decision is to be made about whether an expression involving a verb and two nouns can be transformed into a noun compound, while preserving the sentence meaning. |
Tasks | Natural Language Inference |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.08762v1 |
https://arxiv.org/pdf/1911.08762v1.pdf | |
PWC | https://paperswithcode.com/paper/paraphrasing-verbs-for-noun-compound |
Repo | |
Framework | |
Illuminant Chromaticity Estimation from Interreflections
Title | Illuminant Chromaticity Estimation from Interreflections |
Authors | Eytan Lifshitz, Dani Lischinski |
Abstract | Reliable estimation of illuminant chromaticity is crucial for simulating color constancy and for white balancing digital images. However, estimating illuminant chromaticity from a single image is an ill-posed task, in general, and existing solutions typically employ a variety of assumptions and heuristics. In this paper, we present a new, physically-based, approach for estimating illuminant chromaticity from interreflections of light between diffuse surfaces. Our approach assumes that all of the direct illumination in the scene has the same chromaticity, and that at least two areas where interreflections between Lambertian surfaces occur may be detected in the image. No further assumptions or restrictions on the illuminant chromaticity or the shading in the scene are necessary. Our approach is based on representing interreflections as lines in a special 2D color space, and the chromaticity of the illuminant is estimated from the approximate intersection between two or more such lines. Experimental results are reported on a dataset of illumination and surface reflectance spectra, as well as on real images we captured. The results indicate that our approach can yield state-of-the-art results when the interreflections are significant enough to be captured by the camera. |
Tasks | Color Constancy |
Published | 2019-06-13 |
URL | https://arxiv.org/abs/1906.05526v1 |
https://arxiv.org/pdf/1906.05526v1.pdf | |
PWC | https://paperswithcode.com/paper/illuminant-chromaticity-estimation-from |
Repo | |
Framework | |
Depth-Width Trade-offs for ReLU Networks via Sharkovsky’s Theorem
Title | Depth-Width Trade-offs for ReLU Networks via Sharkovsky’s Theorem |
Authors | Vaggos Chatziafratis, Sai Ganesh Nagarajan, Ioannis Panageas, Xiao Wang |
Abstract | Understanding the representational power of Deep Neural Networks (DNNs) and how their structural properties (e.g., depth, width, type of activation unit) affect the functions they can compute has been an important yet challenging question in deep learning and approximation theory. In a seminal paper, Telgarsky highlighted the benefits of depth by presenting a family of functions (based on simple triangular waves) for which DNNs achieve zero classification error, whereas shallow networks with fewer than exponentially many nodes incur constant error. Even though Telgarsky’s work reveals the limitations of shallow neural networks, it does not inform us on why these functions are difficult to represent and in fact he states it as a tantalizing open question to characterize those functions that cannot be well-approximated by smaller depths. In this work, we point to a new connection between DNN expressivity and Sharkovsky’s Theorem from dynamical systems, that enables us to characterize the depth-width trade-offs of ReLU networks for representing functions based on the presence of a generalized notion of fixed points, called periodic points (a fixed point is a point of period 1). Motivated by our observation that the triangle waves used in Telgarsky’s work contain points of period 3 - a period that is special in that it implies chaotic behavior based on the celebrated result by Li-Yorke - we proceed to give general lower bounds for the width needed to represent periodic functions as a function of the depth. Technically, the crux of our approach is based on an eigenvalue analysis of the dynamical system associated with such functions. |
Tasks | |
Published | 2019-12-09 |
URL | https://arxiv.org/abs/1912.04378v1 |
https://arxiv.org/pdf/1912.04378v1.pdf | |
PWC | https://paperswithcode.com/paper/depth-width-trade-offs-for-relu-networks-via-1 |
Repo | |
Framework | |
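The period-3 observation in the abstract above can be checked by hand on the tent map, used here as a stand-in for the triangle wave (an assumption of this sketch; the listing does not give the exact function). Exact rational arithmetic shows that 2/7 returns to itself after three applications and not before. The names `tent` and `orbit` are illustrative.

```python
from fractions import Fraction


def tent(x):
    # Tent map on [0, 1]: rises to 1 at x = 1/2, then falls back to 0.
    return 2 * x if x <= Fraction(1, 2) else 2 - 2 * x


def orbit(x, steps):
    # Return the starting point followed by `steps` iterates of the map.
    points = [x]
    for _ in range(steps):
        x = tent(x)
        points.append(x)
    return points


if __name__ == "__main__":
    # 2/7 -> 4/7 -> 6/7 -> 2/7, so 2/7 is a point of period 3.
    print(orbit(Fraction(2, 7), 3))
```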
Linear Convergence of Adaptive Stochastic Gradient Descent
Title | Linear Convergence of Adaptive Stochastic Gradient Descent |
Authors | Yuege Xie, Xiaoxia Wu, Rachel Ward |
Abstract | We prove that the norm version of the adaptive stochastic gradient method (AdaGrad-Norm) achieves a linear convergence rate for a subset of either strongly convex functions or non-convex functions that satisfy the Polyak Lojasiewicz (PL) inequality. The paper introduces the notion of Restricted Uniform Inequality of Gradients (RUIG)—which is a measure of the balanced-ness of the stochastic gradient norms—to depict the landscape of a function. RUIG plays a key role in proving the robustness of AdaGrad-Norm to its hyper-parameter tuning in the stochastic setting. On top of RUIG, we develop a two-stage framework to prove the linear convergence of AdaGrad-Norm without knowing the parameters of the objective functions. This framework can likely be extended to other adaptive stepsize algorithms. The numerical experiments validate the theory and suggest future directions for improvement. |
Tasks | |
Published | 2019-08-28 |
URL | https://arxiv.org/abs/1908.10525v2 |
https://arxiv.org/pdf/1908.10525v2.pdf | |
PWC | https://paperswithcode.com/paper/linear-convergence-of-adaptive-stochastic |
Repo | |
Framework | |
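As commonly stated, AdaGrad-Norm keeps a single scalar accumulator of squared gradient norms and divides the step size by its square root. The sketch below applies that update to a simple strongly convex quadratic with exact gradients so the run is reproducible; the paper's setting is stochastic, and the function name, default parameters, and test objective are assumptions made for illustration.

```python
import math


def adagrad_norm(grad, x0, eta=1.0, b0=0.1, steps=200):
    # Update: b^2 <- b^2 + ||g||^2, then x <- x - (eta / b) * g.
    x = list(x0)
    b_sq = b0 * b0
    for _ in range(steps):
        g = grad(x)
        b_sq += sum(gi * gi for gi in g)
        step = eta / math.sqrt(b_sq)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


if __name__ == "__main__":
    # f(x) = 0.5 * ||x||^2 has gradient x and its minimizer at the origin.
    x_final = adagrad_norm(lambda x: list(x), [3.0, -4.0])
    print(math.sqrt(sum(v * v for v in x_final)))
```

Note that the step size adapts without any knowledge of the objective's smoothness or strong convexity constants, which is the kind of robustness to hyper-parameter tuning the abstract refers to.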
A Fully-Automatic Framework for Parkinson’s Disease Diagnosis by Multi-Modality Images
Title | A Fully-Automatic Framework for Parkinson’s Disease Diagnosis by Multi-Modality Images |
Authors | Jiahang Xu, Fangyang Jiao, Yechong Huang, Xinzhe Luo, Qian Xu, Ling Li, Xueling Liu, Chuantao Zuo, Ping Wu, Xiahai Zhuang |
Abstract | Background: Parkinson’s disease (PD) is a prevalent long-term neurodegenerative disease. Though the diagnostic criteria of PD are relatively well defined, the current medical imaging diagnostic procedures are expertise-demanding, and thus call for a higher-integrated AI-based diagnostic algorithm. Methods: In this paper, we propose an automatic, end-to-end, multi-modality diagnosis framework, including segmentation, registration, feature generation and machine learning, to process the information of the striatum for the diagnosis of PD. Multiple modalities, including T1-weighted MRI and 11C-CFT PET, were used in the proposed framework. The reliability of this framework was then validated on a dataset from the PET center of Huashan Hospital, which contains paired T1-MRI and CFT-PET images of 18 Normal (NL) subjects and 49 PD subjects. Results: We obtained an accuracy of 100% for the PD/NL classification task. In addition, we conducted several comparative experiments to validate the diagnosis ability of our framework. Conclusion: Through experiments, we show that (1) automatic segmentation has the same classification effect as manual segmentation, (2) multi-modality images generate a better prediction than single-modality images, and (3) the volume feature is shown to be irrelevant to PD diagnosis. |
Tasks | |
Published | 2019-02-26 |
URL | http://arxiv.org/abs/1902.09934v1 |
http://arxiv.org/pdf/1902.09934v1.pdf | |
PWC | https://paperswithcode.com/paper/a-fully-automatic-framework-for-parkinsons |
Repo | |
Framework | |