January 29, 2020

Paper Group ANR 555

Fast Regularity-Constrained Plane Reconstruction. Multi-Task Networks With Universe, Group, and Task Feature Learning. CLAREL: classification via retrieval loss for zero-shot learning. Generic Tracking and Probabilistic Prediction Framework and Its Application in Autonomous Driving. Adversarial Approximate Inference for Speech to Electroglottograph …

Fast Regularity-Constrained Plane Reconstruction

Title Fast Regularity-Constrained Plane Reconstruction
Authors Yangbin Lin, Jialian Li, Cheng Wang, Zhonggui Chen, Zongyue Wang, Jonathan Li
Abstract Man-made environments typically comprise planar structures that exhibit numerous geometric relationships, such as parallelism, coplanarity, and orthogonality. Making full use of these relationships can considerably improve the robustness of algorithmic plane reconstruction of complex scenes. This research leverages a constraint model requiring minimal prior knowledge to implicitly establish relationships among planes. We introduce a method based on energy minimization to reconstruct the planes consistent with our constraint model. The proposed algorithm is efficient, easy to understand, and simple to implement. The experimental results show that our algorithm successfully reconstructs planes under high percentages of noise and outliers. This is superior to other state-of-the-art regularity-constrained plane reconstruction methods in terms of speed and robustness.
Tasks
Published 2019-05-20
URL https://arxiv.org/abs/1905.07922v1
PDF https://arxiv.org/pdf/1905.07922v1.pdf
PWC https://paperswithcode.com/paper/fast-regularity-constrained-plane
Repo
Framework

Multi-Task Networks With Universe, Group, and Task Feature Learning

Title Multi-Task Networks With Universe, Group, and Task Feature Learning
Authors Shiva Pentyala, Mengwen Liu, Markus Dreyer
Abstract We present methods for multi-task learning that take advantage of natural groupings of related tasks. Task groups may be defined along known properties of the tasks, such as task domain or language. Such task groups represent supervised information at the inter-task level and can be encoded into the model. We investigate two variants of neural network architectures that accomplish this, learning different feature spaces at the levels of individual tasks, task groups, as well as the universe of all tasks: (1) parallel architectures encode each input simultaneously into feature spaces at different levels; (2) serial architectures encode each input successively into feature spaces at different levels in the task hierarchy. We demonstrate the methods on natural language understanding (NLU) tasks, where a grouping of tasks into different task domains leads to improved performance on ATIS, Snips, and a large in-house dataset.
Tasks Multi-Task Learning
Published 2019-07-03
URL https://arxiv.org/abs/1907.01791v1
PDF https://arxiv.org/pdf/1907.01791v1.pdf
PWC https://paperswithcode.com/paper/multi-task-networks-with-universe-group-and
Repo
Framework

CLAREL: classification via retrieval loss for zero-shot learning

Title CLAREL: classification via retrieval loss for zero-shot learning
Authors Boris N. Oreshkin, Negar Rostamzadeh, Pedro O. Pinheiro, Christopher Pal
Abstract We address the problem of learning fine-grained cross-modal representations. We propose an instance-based deep metric learning approach in joint visual and textual space. The key novelty of this paper is that it shows that using per-image semantic supervision leads to substantial improvement in zero-shot performance over using class-only supervision. On top of that, we provide a probabilistic justification for a metric rescaling approach that solves a very common problem in the generalized zero-shot learning setting, i.e., classifying test images from unseen classes as one of the classes seen during training. We evaluate our approach on two fine-grained zero-shot learning datasets: CUB and FLOWERS. We find that on the generalized zero-shot classification task CLAREL consistently outperforms the existing approaches on both datasets.
Tasks Metric Learning, Zero-Shot Learning
Published 2019-05-31
URL https://arxiv.org/abs/1906.11892v2
PDF https://arxiv.org/pdf/1906.11892v2.pdf
PWC https://paperswithcode.com/paper/fine-grained-zero-shot-recognition-with
Repo
Framework

Generic Tracking and Probabilistic Prediction Framework and Its Application in Autonomous Driving

Title Generic Tracking and Probabilistic Prediction Framework and Its Application in Autonomous Driving
Authors Jiachen Li, Wei Zhan, Yeping Hu, Masayoshi Tomizuka
Abstract Accurately tracking and predicting behaviors of surrounding objects are key prerequisites for intelligent systems such as autonomous vehicles to achieve safe and high-quality decision making and motion planning. However, there still remain challenges for multi-target tracking due to object number fluctuation and occlusion. To overcome these challenges, we propose a constrained mixture sequential Monte Carlo (CMSMC) method in which a mixture representation is incorporated in the estimated posterior distribution to maintain multi-modality. Multiple targets can be tracked simultaneously within a unified framework without explicit data association between observations and tracking targets. The framework can incorporate an arbitrary prediction model as the implicit proposal distribution of the CMSMC method. An example in this paper is a learning-based model for hierarchical time-series prediction, which consists of a behavior recognition module and a state evolution module. Both modules in the proposed model are generic and flexible so as to be applied to a class of time-series prediction problems where behaviors can be separated into different levels. Finally, the proposed framework is applied to a numerical case study as well as a task of on-road vehicle tracking, behavior recognition, and prediction in highway scenarios. Instead of only focusing on forecasting trajectory of a single entity, we jointly predict continuous motions for interactive entities simultaneously. The proposed approaches are evaluated from multiple aspects, which demonstrate great potential for intelligent vehicular systems and traffic surveillance systems.
Tasks Autonomous Driving, Autonomous Vehicles, Decision Making, Motion Planning, Time Series, Time Series Prediction
Published 2019-08-23
URL https://arxiv.org/abs/1908.09031v1
PDF https://arxiv.org/pdf/1908.09031v1.pdf
PWC https://paperswithcode.com/paper/generic-tracking-and-probabilistic-prediction
Repo
Framework

Adversarial Approximate Inference for Speech to Electroglottograph Conversion

Title Adversarial Approximate Inference for Speech to Electroglottograph Conversion
Authors Prathosh A. P., Varun Srivastava, Mayank Mishra
Abstract Speech produced by the human vocal apparatus conveys substantial non-semantic information, including the gender of the speaker, voice quality, affective state, abnormalities in the vocal apparatus, etc. Such information is attributed to the properties of the voice source signal, which is usually estimated from the speech signal. However, most of the source estimation techniques depend heavily on the goodness of the model assumptions and are prone to noise. A popular alternative is to indirectly obtain the source information through the Electroglottographic (EGG) signal that measures the electrical admittance around the vocal folds using dedicated hardware. In this paper, we address the problem of estimating the EGG signal directly from the speech signal, devoid of any hardware. Sampling from the intractable conditional distribution of the EGG signal given the speech signal is accomplished through optimization of an evidence lower bound. This is constructed via minimization of the KL-divergence between the true and the approximated posteriors of a latent variable learned using a deep neural auto-encoder that serves as an informative prior. We demonstrate the efficacy of the method at generating the EGG signal by conducting several experiments on datasets comprising multiple speakers, voice qualities, noise settings and speech pathologies. The proposed method is evaluated on many benchmark metrics and is found to agree with the gold standard while proving better than the state-of-the-art algorithms on a few tasks such as epoch extraction.
Tasks
Published 2019-03-28
URL https://arxiv.org/abs/1903.12248v2
PDF https://arxiv.org/pdf/1903.12248v2.pdf
PWC https://paperswithcode.com/paper/adversarial-approximate-inference-for-speech
Repo
Framework

Collaborative Quantization for Cross-Modal Similarity Search

Title Collaborative Quantization for Cross-Modal Similarity Search
Authors Ting Zhang, Jingdong Wang
Abstract Cross-modal similarity search is the problem of designing a search system that supports querying across content modalities, e.g., using an image to search for texts or using a text to search for images. This paper presents a compact coding solution for efficient search, with a focus on the quantization approach, which has already shown superior performance over hashing solutions in single-modal similarity search. We propose a cross-modal quantization approach, which is among the early attempts to introduce quantization into cross-modal search. The major contribution lies in jointly learning the quantizers for both modalities through aligning the quantized representations for each pair of image and text belonging to a document. In addition, our approach simultaneously learns the common space for both modalities in which quantization is conducted to enable efficient and effective search using the Euclidean distance computed in the common space with fast distance table lookup. Experimental results compared with several competitive algorithms over three benchmark datasets demonstrate that the proposed approach achieves state-of-the-art performance.
Tasks Quantization
Published 2019-02-02
URL http://arxiv.org/abs/1902.00623v1
PDF http://arxiv.org/pdf/1902.00623v1.pdf
PWC https://paperswithcode.com/paper/collaborative-quantization-for-cross-modal
Repo
Framework
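
The "fast distance table lookup" mentioned in the abstract above can be illustrated with a generic product-quantization-style sketch. This is not the paper's collaborative quantizer: the codebooks here are fixed by hand rather than learned, the function names `build_distance_table` and `approx_sq_distance` are hypothetical, and the per-subspace codebook layout is an assumption made for illustration. The idea shown is only that, once a query's distances to every codeword are precomputed, the distance to any encoded database item reduces to a handful of table lookups.

```python
def build_distance_table(query, codebooks):
    """Precompute squared distances from each query sub-vector to every
    codeword of the matching subspace codebook (one row per subspace)."""
    table = []
    offset = 0
    for codebook in codebooks:
        dim = len(codebook[0])
        sub = query[offset:offset + dim]
        table.append([sum((q - c) ** 2 for q, c in zip(sub, word))
                      for word in codebook])
        offset += dim
    return table


def approx_sq_distance(code, table):
    """Approximate squared Euclidean distance to an item stored as one
    codeword index per subspace: a sum of table lookups, no vector math."""
    return sum(table[m][idx] for m, idx in enumerate(code))


# Two 2-D subspaces, two hand-picked codewords each (illustrative values).
codebooks = [[[0.0, 0.0], [1.0, 1.0]],
             [[0.0, 0.0], [2.0, 2.0]]]
table = build_distance_table([1.0, 1.0, 0.0, 0.0], codebooks)
nearest = min([(0, 0), (0, 1), (1, 0), (1, 1)],
              key=lambda code: approx_sq_distance(code, table))
```

The table is built once per query, so ranking a large database costs one lookup per subspace per item instead of a full distance computation.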

Chaotic Time Series Prediction using Spatio-Temporal RBF Neural Networks

Title Chaotic Time Series Prediction using Spatio-Temporal RBF Neural Networks
Authors Alishba Sadiq, Muhammad Sohail Ibrahim, Muhammad Usman, Muhammad Zubair, Shujaat Khan
Abstract Due to their dynamic nature, chaotic time series are difficult to predict. In conventional signal processing approaches, signals are treated either in the time or in the space domain only. Spatio-temporal analysis of a signal provides advantages over conventional uni-dimensional approaches by harnessing the information from both the temporal and spatial domains. Herein, we propose a spatio-temporal extension of RBF neural networks for the prediction of chaotic time series. The proposed algorithm utilizes the concept of time-space orthogonality and separately deals with the temporal dynamics and spatial non-linearity (complexity) of the chaotic series. The proposed RBF architecture is explored for the prediction of the Mackey-Glass time series, and the results are compared with the standard RBF. The spatio-temporal RBF is shown to outperform the standard RBFNN by achieving significantly reduced estimation error.
Tasks Time Series, Time Series Prediction
Published 2019-08-17
URL https://arxiv.org/abs/1908.08389v1
PDF https://arxiv.org/pdf/1908.08389v1.pdf
PWC https://paperswithcode.com/paper/chaotic-time-series-prediction-using-spatio
Repo
Framework
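
For readers unfamiliar with the Mackey-Glass benchmark named in the abstract above, the following is a minimal sketch of how such a series is commonly generated. It assumes the widely used parameter values (beta = 0.2, gamma = 0.1, exponent 10, delay 17), a unit-step Euler discretization, and a constant initial history of 1.2. None of these settings are stated in the listing, so they may differ from the paper's setup, and the function name `mackey_glass` is hypothetical.

```python
def mackey_glass(length, beta=0.2, gamma=0.1, exponent=10, delay=17, x0=1.2):
    """Generate a Mackey-Glass series with a unit-step Euler discretization:

        x[t+1] = x[t] + beta * x[t-delay] / (1 + x[t-delay]**exponent)
                      - gamma * x[t]

    The first `delay + 1` values (the initial history) are set to `x0`
    and are not included in the returned list.
    """
    history = [x0] * (delay + 1)
    for _ in range(length):
        delayed = history[-(delay + 1)]  # x[t - delay]
        current = history[-1]            # x[t]
        nxt = current + beta * delayed / (1.0 + delayed ** exponent) - gamma * current
        history.append(nxt)
    return history[delay + 1:]


series = mackey_glass(500)
```

A one-step-ahead predictor, RBF-based or otherwise, would then be trained on windows of past values of `series` to predict the next value.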

A Corpus Linguistic Analysis of Public Reddit Blog Posts on Non-Suicidal Self-Injury

Title A Corpus Linguistic Analysis of Public Reddit Blog Posts on Non-Suicidal Self-Injury
Authors Mandy M. Greaves, Cass Dykeman
Abstract While non-suicidal self-injury (NSSI) is not a new phenomenon, there is still a limited understanding of the behavior, the intent behind the behavior, and what the individuals themselves say about their behavior. This study collected pro-NSSI public blog posts from Reddit and analyzed the content linguistically using LIWC software in order to examine the use of NSSI-specific words, linguistic properties, and psychological linguistic properties. The results inform current counseling practices by dispelling myths and providing insight into the inner world of people who use NSSI to cope. The most frequently appearing category of NSSI-specific words in the Reddit blogs was the reasons for engaging in NSSI. The linguistic properties found in the analysis reflected the predicted results; authors of pro-NSSI posts used first-person singular pronouns extensively, which indicates high levels of mental health distress and isolation. The psychological linguistic properties observed in these public Reddit posts were dominantly in a negative emotional tone, which demonstrates youth and impulsivity. The linguistic properties found when these posts were analyzed support the work of earlier studies that dispelled common myths about NSSI that were circulating in the mental health community.
Tasks
Published 2019-02-02
URL http://arxiv.org/abs/1902.06689v1
PDF http://arxiv.org/pdf/1902.06689v1.pdf
PWC https://paperswithcode.com/paper/a-corpus-linguistic-analysis-of-public-reddit
Repo
Framework

Entertaining and Opinionated but Too Controlling: A Large-Scale User Study of an Open Domain Alexa Prize System

Title Entertaining and Opinionated but Too Controlling: A Large-Scale User Study of an Open Domain Alexa Prize System
Authors Kevin K. Bowden, Jiaqi Wu, Wen Cui, Juraj Juraska, Vrindavan Harrison, Brian Schwarzmann, Nicholas Santer, Steve Whittaker, Marilyn Walker
Abstract Conversational systems typically focus on functional tasks such as scheduling appointments or creating todo lists. Instead we design and evaluate SlugBot (SB), one of 8 semifinalists in the 2018 Alexa Prize, whose goal is to support casual open-domain social interaction. This novel application requires both broad topic coverage and engaging interactive skills. We developed a new technical approach to meet this demanding situation by crowd-sourcing novel content and introducing playful conversational strategies based on storytelling and games. We collected over 10,000 conversations during August 2018 as part of the Alexa Prize competition. We also conducted an in-lab follow-up qualitative evaluation. Overall, users found SB moderately engaging; conversations averaged 3.6 minutes and involved 26 user turns. However, users reacted very differently to different conversation subtypes. Storytelling and games were evaluated positively; these were seen as entertaining with predictable interactive structure. They also led users to impute personality and intelligence to SB. In contrast, search and general Chit-Chat induced coverage problems; here users found it hard to infer what topics SB could understand, with these conversations seen as being too system-driven. Theoretical and design implications suggest a move away from conversational systems that simply provide factual information. Future systems should be designed to have their own opinions with personal stories to share, and SB provides an example of how we might achieve this.
Tasks
Published 2019-08-13
URL https://arxiv.org/abs/1908.04832v1
PDF https://arxiv.org/pdf/1908.04832v1.pdf
PWC https://paperswithcode.com/paper/entertaining-and-opinionated-but-too
Repo
Framework

Neural Logic Networks

Title Neural Logic Networks
Authors Shaoyun Shi, Hanxiong Chen, Min Zhang, Yongfeng Zhang
Abstract Recent years have witnessed the great success of deep neural networks in many research areas. The fundamental idea behind the design of most neural networks is to learn similarity patterns from data for prediction and inference, which lacks the ability of logical reasoning. However, the concrete ability of logical reasoning is critical to many theoretical and practical problems. In this paper, we propose Neural Logic Network (NLN), which is a dynamic neural architecture that builds the computational graph according to input logical expressions. It learns basic logical operations as neural modules, and conducts propositional logical reasoning through the network for inference. Experiments on simulated data show that NLN achieves significant performance on solving logical equations. Further experiments on real-world data show that NLN significantly outperforms state-of-the-art models on collaborative filtering and personalized recommendation tasks.
Tasks
Published 2019-10-17
URL https://arxiv.org/abs/1910.08629v1
PDF https://arxiv.org/pdf/1910.08629v1.pdf
PWC https://paperswithcode.com/paper/neural-logic-networks
Repo
Framework

Autoregressive-Model-Based Methods for Online Time Series Prediction with Missing Values: an Experimental Evaluation

Title Autoregressive-Model-Based Methods for Online Time Series Prediction with Missing Values: an Experimental Evaluation
Authors Xi Chen, Hongzhi Wang, Yanjie Wei, Jianzhong Li, Hong Gao
Abstract Time series prediction with missing values is an important problem in time series analysis, since complete data is usually hard to obtain in many real-world applications. To model the generation of time series, the autoregressive (AR) model is a basic and widely used choice, which assumes that each observation in the time series is a noisy linear combination of some previous observations along with a constant shift. To tackle the problem of prediction with missing values, a number of methods have been proposed based on various data models. For real application scenarios, how these methods perform over different types of time series with different levels of missing data remains to be investigated. In this paper, we focus on online methods for AR-model-based time series prediction with missing values. We adapt five mainstream methods to fit such a scenario. We discuss each of them in detail by introducing their core ideas about how to estimate the AR coefficients and their different strategies to deal with missing values. We also present algorithmic implementations for better understanding. In order to comprehensively evaluate and compare these methods, we conduct experiments with various configurations of the relevant parameters over both synthetic and real data. From the experimental results, we derive several noteworthy conclusions and show that imputation is a simple but reliable strategy to handle missing values in online prediction tasks.
Tasks Imputation, Time Series, Time Series Analysis, Time Series Prediction
Published 2019-08-10
URL https://arxiv.org/abs/1908.06729v2
PDF https://arxiv.org/pdf/1908.06729v2.pdf
PWC https://paperswithcode.com/paper/autoregressive-model-based-methods-for-online
Repo
Framework
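
The AR model described in the abstract above can be written as x_t = c + a_1 x_{t-1} + ... + a_p x_{t-p} + noise. As a rough illustration of the imputation strategy the abstract singles out, here is a minimal sketch of online one-step-ahead prediction in which a missing observation is replaced by the model's own prediction. It is not one of the five methods evaluated in the paper: the update rule (plain online gradient descent on the squared error), the learning rate, the zero initialization, and the name `online_ar_predict` are all assumptions made for illustration.

```python
def online_ar_predict(series, order=2, lr=0.01):
    """One-step-ahead prediction with an AR(order) model fitted online.

    `series` may contain None for missing observations. A missing value is
    imputed with the model's own prediction, and the coefficients are
    updated only on steps where the true value was observed.
    Returns the list of predictions, one per time step.
    """
    coeffs = [0.0] * order   # AR coefficients a_1 .. a_p
    shift = 0.0              # constant shift c
    history = [0.0] * order  # most recent values, newest last
    predictions = []
    for observed in series:
        lags = history[::-1]  # lags[0] is x_{t-1}, lags[1] is x_{t-2}, ...
        pred = shift + sum(a * x for a, x in zip(coeffs, lags))
        predictions.append(pred)
        if observed is None:
            value = pred  # imputation: fill the gap with the prediction
        else:
            value = observed
            err = pred - observed
            # Gradient step on the squared prediction error.
            coeffs = [a - lr * err * x for a, x in zip(coeffs, lags)]
            shift -= lr * err
        history = history[1:] + [value]
    return predictions


# Constant series with every fifth value missing.
demo = [None if i % 5 == 0 else 1.0 for i in range(2000)]
demo_predictions = online_ar_predict(demo)
```

Because imputed values are fed back as lags, prediction errors made during gaps can propagate, which is one reason a comparison across missing-data rates of the kind described in the abstract is informative.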

Social Bias Frames: Reasoning about Social and Power Implications of Language

Title Social Bias Frames: Reasoning about Social and Power Implications of Language
Authors Maarten Sap, Saadia Gabriel, Lianhui Qin, Dan Jurafsky, Noah A. Smith, Yejin Choi
Abstract Language has the power to reinforce stereotypes and project social biases onto others. At the core of the challenge is that it is rarely what is stated explicitly, but all the implied meanings that frame people’s judgements about others. For example, given a seemingly innocuous statement “we shouldn’t lower our standards to hire more women,” most listeners will infer the implicature intended by the speaker - that “women (candidates) are less qualified.” Most frame semantic formalisms, to date, do not capture such pragmatic frames in which people express social biases and power differentials in language. We introduce Social Bias Frames, a new conceptual formalism that aims to model the pragmatic frames in which people project social biases and stereotypes on others. In addition, we introduce the Social Bias Inference Corpus, to support large-scale modelling and evaluation with 100k structured annotations of social media posts, covering over 26k implications about a thousand demographic groups. We then establish baseline approaches that learn to recover Social Bias Frames from unstructured text. We find that while state-of-the-art neural models are effective at high-level categorization of whether a given statement projects unwanted social bias (86% F1), they are not effective at spelling out more detailed explanations by accurately decoding out Social Bias Frames. Our study motivates future research that combines structured pragmatic inference with commonsense reasoning on social implications.
Tasks
Published 2019-11-10
URL https://arxiv.org/abs/1911.03891v1
PDF https://arxiv.org/pdf/1911.03891v1.pdf
PWC https://paperswithcode.com/paper/social-bias-frames-reasoning-about-social-and
Repo
Framework

Effective Search of Logical Forms for Weakly Supervised Knowledge-Based Question Answering

Title Effective Search of Logical Forms for Weakly Supervised Knowledge-Based Question Answering
Authors Tao Shen, Xiubo Geng, Tao Qin, Guodong Long, Jing Jiang, Daxin Jiang
Abstract Many algorithms for Knowledge-Based Question Answering (KBQA) depend on semantic parsing, which translates a question to its logical form. When only weak supervision is provided, it is usually necessary to search valid logical forms for model training. However, a complex question typically involves a huge search space, which creates two main problems: 1) the solutions limited by computation time and memory usually reduce the success rate of the search, and 2) spurious logical forms in the search results degrade the quality of training data. These two problems lead to a poorly-trained semantic parsing model. In this work, we propose an effective search method for weakly supervised KBQA based on operator prediction for questions. With search space constrained by predicted operators, sufficient search paths can be explored, more valid logical forms can be derived, and operators possibly causing spurious logical forms can be avoided. As a result, a larger proportion of questions in a weakly supervised training set are equipped with logical forms, and fewer spurious logical forms are generated. Such high-quality training data directly contributes to a better semantic parsing model. Experimental results on one of the largest KBQA datasets (i.e., CSQA) verify the effectiveness of our approach: improving the precision from 67% to 72% and the recall from 67% to 72% in terms of the overall score.
Tasks Question Answering, Semantic Parsing
Published 2019-09-06
URL https://arxiv.org/abs/1909.02762v1
PDF https://arxiv.org/pdf/1909.02762v1.pdf
PWC https://paperswithcode.com/paper/effective-search-of-logical-forms-for-weakly
Repo
Framework

Evaluation of Deep Species Distribution Models using Environment and Co-occurrences

Title Evaluation of Deep Species Distribution Models using Environment and Co-occurrences
Authors Benjamin Deneu, Maximilien Servajean, Christophe Botella, Alexis Joly
Abstract This paper presents an evaluation of several approaches to plant species distribution modeling based on spatial, environmental, and co-occurrence data using machine learning methods. In particular, we re-evaluate the environmental convolutional neural network model that obtained the best performance in the GeoLifeCLEF 2018 challenge, but on a revised dataset that fixes some of the issues of the previous one. We also go deeper in the analysis of co-occurrence information by evaluating a new model that jointly takes environmental variables and co-occurrences as inputs of an end-to-end network. Results show that the environmental models are the best performing methods and that there is a significant amount of complementary information between co-occurrences and environment. Indeed, the model learned on both inputs allows a significant performance gain compared to the environmental model alone.
Tasks
Published 2019-09-19
URL https://arxiv.org/abs/1909.08825v1
PDF https://arxiv.org/pdf/1909.08825v1.pdf
PWC https://paperswithcode.com/paper/evaluation-of-deep-species-distribution
Repo
Framework

Adversarial Test on Learnable Image Encryption

Title Adversarial Test on Learnable Image Encryption
Authors MaungMaung AprilPyone, Warit Sirichotedumrong, Hitoshi Kiya
Abstract Data for deep learning should be protected to preserve privacy. Researchers have come up with the notion of learnable image encryption to satisfy this requirement. However, existing privacy-preserving approaches have never considered the threat of adversarial attacks. In this paper, we ran an adversarial test on learnable image encryption in five different scenarios. The results show different behaviors of the network in the variable key scenarios and suggest that learnable image encryption provides a certain level of adversarial robustness.
Tasks
Published 2019-07-31
URL https://arxiv.org/abs/1907.13342v1
PDF https://arxiv.org/pdf/1907.13342v1.pdf
PWC https://paperswithcode.com/paper/adversarial-test-on-learnable-image
Repo
Framework