January 28, 2020

Paper Group ANR 841

Toward Metrics for Differentiating Out-of-Distribution Sets

Title Toward Metrics for Differentiating Out-of-Distribution Sets
Authors Mahdieh Abbasi, Changjian Shui, Arezoo Rajabi, Christian Gagne, Rakesh Bobba
Abstract Vanilla CNNs, as uncalibrated classifiers, suffer from classifying out-of-distribution (OOD) samples nearly as confidently as in-distribution samples, making them indistinguishable from each other. To tackle this challenge, some recent works have demonstrated the gains of leveraging readily accessible OOD sets for training end-to-end calibrated CNNs. However, a critical question remains unanswered in these works: how should one select an OOD set, among the available OOD sets, for training such CNNs so that they achieve high detection rates on unseen OOD sets? We address this pivotal question through the use of an Augmented-CNN (A-CNN) with an explicit rejection option. We first provide a formal definition to precisely differentiate OOD sets for the purpose of selection. As using this definition incurs a huge computational cost, we propose novel metrics, as a computationally efficient tool, for characterizing OOD sets in order to select the proper one. In a series of experiments on several image and audio benchmarks, we show that training an A-CNN with an OOD set identified by our metrics (called A-CNN$^{\star}$) leads to a remarkable detection rate on unseen OOD sets while maintaining in-distribution generalization performance, thus demonstrating the viability of our metrics for identifying the proper OOD set. Furthermore, we show that A-CNN$^{\star}$ outperforms state-of-the-art OOD detectors across different benchmarks.
Tasks
Published 2019-10-18
URL https://arxiv.org/abs/1910.08650v1
PDF https://arxiv.org/pdf/1910.08650v1.pdf
PWC https://paperswithcode.com/paper/toward-metrics-for-differentiating-out-of
Repo
Framework
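
A minimal sketch of the rejection-option idea behind an A-CNN, assuming PyTorch: a standard classifier head is widened by one extra logit that acts as an explicit "reject" class, and samples from a chosen OOD training set are labelled with that class. The `AugmentedCNN` and `a_cnn_loss` names, the toy backbone, and the equal loss weighting are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AugmentedCNN(nn.Module):
    """Classifier with K in-distribution classes plus one explicit 'reject' class."""
    def __init__(self, backbone: nn.Module, feat_dim: int, num_classes: int):
        super().__init__()
        self.backbone = backbone                           # any feature extractor
        self.head = nn.Linear(feat_dim, num_classes + 1)   # extra logit = rejection option

    def forward(self, x):
        return self.head(self.backbone(x))

def a_cnn_loss(model, x_in, y_in, x_ood, num_classes):
    """In-distribution samples keep their labels; OOD samples get the reject index."""
    logits_in = model(x_in)
    logits_ood = model(x_ood)
    y_reject = torch.full((x_ood.size(0),), num_classes, dtype=torch.long)
    return F.cross_entropy(logits_in, y_in) + F.cross_entropy(logits_ood, y_reject)

# Toy usage with a tiny flatten-and-project "backbone".
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64), nn.ReLU())
model = AugmentedCNN(backbone, feat_dim=64, num_classes=10)
loss = a_cnn_loss(model, torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,)),
                  torch.randn(4, 3, 32, 32), num_classes=10)
print(loss.item())
```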

Broad-Coverage Semantic Parsing as Transduction

Title Broad-Coverage Semantic Parsing as Transduction
Authors Sheng Zhang, Xutai Ma, Kevin Duh, Benjamin Van Durme
Abstract We unify different broad-coverage semantic parsing tasks under a transduction paradigm, and propose an attention-based neural framework that incrementally builds a meaning representation via a sequence of semantic relations. By leveraging multiple attention mechanisms, the transducer can be effectively trained without relying on a pre-trained aligner. Experiments conducted on three separate broad-coverage semantic parsing tasks – AMR, SDP and UCCA – demonstrate that our attention-based neural transducer improves the state of the art on both AMR and UCCA, and is competitive with the state of the art on SDP.
Tasks Semantic Parsing
Published 2019-09-05
URL https://arxiv.org/abs/1909.02607v2
PDF https://arxiv.org/pdf/1909.02607v2.pdf
PWC https://paperswithcode.com/paper/broad-coverage-semantic-parsing-as
Repo
Framework

Comprehensive Analysis of Aspect Term Extraction Methods using Various Text Embeddings

Title Comprehensive Analysis of Aspect Term Extraction Methods using Various Text Embeddings
Authors Łukasz Augustyniak, Tomasz Kajdanowicz, Przemysław Kazienko
Abstract Recently, a variety of model designs and methods have blossomed in the context of the sentiment analysis domain. However, there is still a lack of wide and comprehensive studies of aspect-based sentiment analysis (ABSA). We want to fill this gap and propose a comparison, with ablation analysis, of aspect term extraction using various text embedding methods. We particularly focus on architectures based on long short-term memory (LSTM) with an optional conditional random field (CRF) enhancement, using different pre-trained word embeddings. Moreover, we analyze how extending the word vectorization step with character embeddings influences performance. The experimental results on SemEval datasets reveal that not only does bi-directional long short-term memory (BiLSTM) outperform regular LSTM, but also word embedding coverage and its source highly affect aspect detection performance. An additional CRF layer consistently improves the results as well.
Tasks Aspect-Based Sentiment Analysis, Sentiment Analysis, Word Embeddings
Published 2019-09-11
URL https://arxiv.org/abs/1909.04917v1
PDF https://arxiv.org/pdf/1909.04917v1.pdf
PWC https://paperswithcode.com/paper/comprehensive-analysis-of-aspect-term
Repo
Framework
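
A minimal sketch of the kind of architecture the study compares, assuming PyTorch: word embeddings concatenated with a character-level feature feed a BiLSTM that emits per-token tag scores. The class name, dimensions, and the use of the final character-LSTM time step are illustrative assumptions; the CRF layer evaluated in the paper would consume these emission scores and is not implemented here.

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Word + character embeddings -> BiLSTM -> per-token emission scores (e.g. BIO tags)."""
    def __init__(self, vocab_size, char_vocab_size, num_tags,
                 word_dim=300, char_dim=25, char_hidden=25, hidden=100):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        self.char_emb = nn.Embedding(char_vocab_size, char_dim)
        self.char_lstm = nn.LSTM(char_dim, char_hidden, bidirectional=True, batch_first=True)
        self.lstm = nn.LSTM(word_dim + 2 * char_hidden, hidden,
                            bidirectional=True, batch_first=True)
        self.emit = nn.Linear(2 * hidden, num_tags)

    def forward(self, words, chars):
        # words: (batch, seq_len); chars: (batch, seq_len, max_word_len)
        b, t, c = chars.shape
        char_out, _ = self.char_lstm(self.char_emb(chars.view(b * t, c)))
        char_feat = char_out[:, -1, :].view(b, t, -1)   # last char-LSTM step as a word-level feature
        x = torch.cat([self.word_emb(words), char_feat], dim=-1)
        h, _ = self.lstm(x)
        return self.emit(h)                              # (batch, seq_len, num_tags)

tagger = BiLSTMTagger(vocab_size=1000, char_vocab_size=80, num_tags=3)
scores = tagger(torch.randint(0, 1000, (2, 12)), torch.randint(0, 80, (2, 12, 6)))
print(scores.shape)                                      # torch.Size([2, 12, 3])
```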

Cross-Entropy Adversarial View Adaptation for Person Re-identification

Title Cross-Entropy Adversarial View Adaptation for Person Re-identification
Authors Lin Wu, Richang Hong, Yang Wang, Meng Wang
Abstract Person re-identification (re-ID) is the task of matching pedestrians across disjoint camera views. To recognise paired snapshots, it has to cope with large cross-view variations caused by the camera view shift. Supervised deep neural networks are effective at producing a set of non-linear projections that can transform cross-view images into a common feature space. However, they typically impose a symmetric architecture, leaving the network ill-conditioned in its optimisation. In this paper, we learn a view-invariant subspace for person re-ID, and its corresponding similarity metric, using an adversarial view adaptation approach. The main contribution is to learn coupled asymmetric mappings regarding view characteristics which are adversarially trained to address the view discrepancy by optimising a cross-entropy view-confusion objective. To determine the similarity value, the network is equipped with a similarity discriminator to promote features that are highly discriminant in distinguishing positive and negative pairs. A further contribution is an adaptive weighting of the most difficult samples to address the imbalance of within- and between-identity pairs. Our approach achieves notably improved performance in comparison to the state of the art on benchmark datasets.
Tasks Person Re-Identification
Published 2019-04-03
URL https://arxiv.org/abs/1904.01755v2
PDF https://arxiv.org/pdf/1904.01755v2.pdf
PWC https://paperswithcode.com/paper/cross-entropy-adversarial-view-adaptation-for
Repo
Framework
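
A rough sketch of a cross-entropy view-confusion objective, assuming PyTorch: two asymmetric encoders (one per camera view) and a view discriminator are trained adversarially, with the encoders pushed to make the discriminator's posterior uniform over views. The encoder/discriminator shapes and the uniform-target formulation are illustrative assumptions; the paper's exact losses, similarity discriminator, and hard-sample weighting are not reproduced.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Two asymmetric encoders, one per camera view, plus a view discriminator.
enc_a = nn.Sequential(nn.Linear(2048, 256), nn.ReLU(), nn.Linear(256, 128))
enc_b = nn.Sequential(nn.Linear(2048, 256), nn.ReLU(), nn.Linear(256, 128))
view_disc = nn.Linear(128, 2)     # predicts which view a feature came from

def discriminator_loss(feat_a, feat_b):
    # Discriminator learns to identify the source view of each feature.
    logits = view_disc(torch.cat([feat_a, feat_b], dim=0).detach())
    labels = torch.cat([torch.zeros(len(feat_a)), torch.ones(len(feat_b))]).long()
    return F.cross_entropy(logits, labels)

def confusion_loss(feat_a, feat_b):
    # Encoders try to make the discriminator maximally uncertain: this is the mean
    # cross-entropy against a uniform posterior over the two views.
    logits = view_disc(torch.cat([feat_a, feat_b], dim=0))
    return -F.log_softmax(logits, dim=1).mean()

feats_a, feats_b = enc_a(torch.randn(4, 2048)), enc_b(torch.randn(4, 2048))
print(discriminator_loss(feats_a, feats_b).item(), confusion_loss(feats_a, feats_b).item())
```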

A novel model for query expansion using pseudo-relevant web knowledge

Title A novel model for query expansion using pseudo-relevant web knowledge
Authors Hiteshwar Kumar Azad, Akshay Deepak
Abstract In the field of information retrieval, query expansion (QE) has long been used as a technique to deal with the fundamental issue of word mismatch between a user’s query and the target information. In the context of the relationship between the query and expanded terms, existing weighting techniques often fail to appropriately capture the term-term relationship and the term-to-query relationship, resulting in low retrieval effectiveness. Our proposed QE approach addresses this with three weighting models based on (1) tf-itf, (2) k-nearest neighbor (kNN) based cosine similarity, and (3) a correlation score. Further, to extract the initial set of expansion terms, we use pseudo-relevant web knowledge consisting of the top N web pages returned by three popular search engines, namely Google, Bing, and DuckDuckGo, in response to the original query. Among the three weighting models, tf-itf scores each of the individual terms obtained from the web content, kNN-based cosine similarity scores the expansion terms to capture the term-term relationship, and the correlation score weights the selected expansion terms with respect to the whole query. The proposed model, called web knowledge based query expansion (WKQE), achieves an improvement of 25.89% in MAP and 30.83% in GMAP over the unexpanded queries on the FIRE dataset. A comparative analysis of the WKQE techniques with other related approaches clearly shows a significant improvement in retrieval performance. We have also analyzed the effect of varying the number of pseudo-relevant documents and expansion terms on the retrieval effectiveness of the proposed model.
Tasks Information Retrieval
Published 2019-08-27
URL https://arxiv.org/abs/1908.10193v1
PDF https://arxiv.org/pdf/1908.10193v1.pdf
PWC https://paperswithcode.com/paper/a-novel-model-for-query-expansion-using
Repo
Framework
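
A small sketch, assuming NumPy, of one plausible reading of the kNN-based cosine-similarity weighting: each candidate expansion term is scored by its mean cosine similarity to its k most similar query terms. The function name, the averaging over the top k, and the random vectors are assumptions for illustration; the paper's exact formulation, and its tf-itf and correlation-score models, may differ.

```python
import numpy as np

def knn_cosine_scores(candidate_vecs, query_vecs, k=3):
    """Score each candidate expansion term by its mean cosine similarity to its
    k nearest query terms.
    candidate_vecs: (n_cand, d) embeddings of candidate terms
    query_vecs:     (n_query, d) embeddings of the original query terms"""
    c = candidate_vecs / np.linalg.norm(candidate_vecs, axis=1, keepdims=True)
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    sims = c @ q.T                              # (n_cand, n_query) cosine similarities
    k = min(k, sims.shape[1])
    top_k = np.sort(sims, axis=1)[:, -k:]       # k most similar query terms per candidate
    return top_k.mean(axis=1)

# Example: rank three candidate terms against a two-term query.
rng = np.random.default_rng(0)
scores = knn_cosine_scores(rng.normal(size=(3, 50)), rng.normal(size=(2, 50)))
print(scores.argsort()[::-1])                   # candidates ordered best-first
```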

Disentangled State Space Representations

Title Disentangled State Space Representations
Authors Đorđe Miladinović, Muhammad Waleed Gondal, Bernhard Schölkopf, Joachim M. Buhmann, Stefan Bauer
Abstract Sequential data often originates from diverse domains across which statistical regularities and domain specifics exist. To learn cross-domain sequence representations, we introduce disentangled state space models (DSSM) – a class of SSMs in which the domain-invariant state dynamics is explicitly disentangled from the domain-specific information governing that dynamics. We analyze how such separation can improve knowledge transfer to new domains, and enable robust prediction, sequence manipulation and domain characterization. We furthermore propose an unsupervised VAE-based training procedure to implement DSSM in the form of Bayesian filters. In our experiments, we applied the VAE-DSSM framework to achieve competitive performance in online ODE system identification and regression across experimental settings, and controlled generation and prediction of bouncing-ball video sequences across varying gravitational influences.
Tasks Transfer Learning
Published 2019-06-07
URL https://arxiv.org/abs/1906.03255v1
PDF https://arxiv.org/pdf/1906.03255v1.pdf
PWC https://paperswithcode.com/paper/disentangled-state-space-representations
Repo
Framework
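
A purely illustrative skeleton of the disentanglement idea, assuming PyTorch: a shared (domain-invariant) transition network whose behaviour is modulated by a per-sequence domain code held fixed over time. The class name, dimensions, and the simple concatenation scheme are assumptions; the paper implements DSSM as a VAE-trained Bayesian filter, which this sketch does not attempt.

```python
import torch
import torch.nn as nn

class DisentangledTransition(nn.Module):
    """Shared dynamics modulated by a domain-specific code."""
    def __init__(self, state_dim=16, domain_dim=4, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + domain_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, z_t, d):
        # z_t: (batch, state_dim) latent state; d: (batch, domain_dim) domain code,
        # held fixed across all time steps of a sequence.
        return self.net(torch.cat([z_t, d], dim=-1))

# The same initial state rolled forward under two different domain codes
# yields two different trajectories, while the dynamics network is shared.
trans = DisentangledTransition()
z = torch.zeros(2, 16)
d = torch.tensor([[1., 0., 0., 0.], [0., 1., 0., 0.]])
print(trans(z, d).shape)    # torch.Size([2, 16])
```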

Characterizing the dynamics of learning in repeated reference games

Title Characterizing the dynamics of learning in repeated reference games
Authors Robert D. Hawkins, Michael C. Frank, Noah D. Goodman
Abstract The language we use over the course of conversation changes as we establish common ground and learn what our partner finds meaningful. Here we draw upon recent advances in natural language processing to provide a finer-grained characterization of the dynamics of this learning process. We release an open corpus (>15,000 utterances) of extended dyadic interactions in a classic repeated reference game task where pairs of participants had to coordinate on how to refer to initially difficult-to-describe tangram stimuli. We find that different pairs discover a wide variety of idiosyncratic but efficient and stable solutions to the problem of reference. Furthermore, these conventions are shaped by the communicative context: words that are more discriminative in the initial context (i.e. that are used for one target more than others) are more likely to persist through the final repetition. Finally, we find systematic structure in how a speaker’s referring expressions become more efficient over time: syntactic units drop out in clusters following positive feedback from the listener, eventually leaving short labels containing open-class parts of speech. These findings provide a higher resolution look at the quantitative dynamics of ad hoc convention formation and support further development of computational models of learning in communication.
Tasks
Published 2019-12-16
URL https://arxiv.org/abs/1912.07199v1
PDF https://arxiv.org/pdf/1912.07199v1.pdf
PWC https://paperswithcode.com/paper/characterizing-the-dynamics-of-learning-in
Repo
Framework

Tertiary Eye Movement Classification by a Hybrid Algorithm

Title Tertiary Eye Movement Classification by a Hybrid Algorithm
Authors Samuel-Hunter Berndt, Douglas Kirkpatrick, Timothy Taviano, Oleg Komogortsev
Abstract The proper classification of the major eye movements – saccades, fixations, and smooth pursuits – remains essential to utilizing eye-tracking data. Separating smooth pursuits from the other behavior types, particularly from fixations, is difficult. To this end, we propose a new offline algorithm, I-VDT-HMM, for tertiary classification of eye movements. The algorithm combines the simplicity of two foundational algorithms, I-VT and I-DT, as implemented in I-VDT, with the statistical predictive power of the Viterbi algorithm. We evaluate its fitness across a dataset of eight eye movement records at eight sampling rates gathered from previous research, with a comparison to the current state of the art using the proposed quantitative and qualitative behavioral scores. The proposed algorithm achieves promising results on clean, high-sampling-frequency data and, with slight modifications, could show similar results on lower-quality data. However, the statistical aspect of the algorithm comes at the cost of classification time.
Tasks Eye Tracking
Published 2019-04-22
URL http://arxiv.org/abs/1904.10085v1
PDF http://arxiv.org/pdf/1904.10085v1.pdf
PWC https://paperswithcode.com/paper/tertiary-eye-movement-classification-by-a
Repo
Framework
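
A small sketch, assuming NumPy, of only the velocity-threshold stage that I-VT-style algorithms start from: samples whose angular velocity exceeds a threshold are labelled saccades, the rest remain fixation/pursuit candidates. In I-VDT-HMM this coarse labelling would be refined by a dispersion threshold and a Viterbi pass, which are not shown; the threshold value and synthetic data below are assumptions.

```python
import numpy as np

def velocity_threshold_labels(x, y, sample_rate_hz, saccade_thresh_deg_s=70.0):
    """Label samples as 'saccade' when angular speed exceeds the threshold."""
    vx = np.gradient(x) * sample_rate_hz       # deg/s, assuming x, y are in degrees
    vy = np.gradient(y) * sample_rate_hz
    speed = np.hypot(vx, vy)
    return np.where(speed > saccade_thresh_deg_s, "saccade", "fix_or_pursuit")

# Synthetic gaze trace at 250 Hz: a 10-degree step mimics a single saccade.
t = np.linspace(0, 1, 250)
gaze_x = np.where(t < 0.5, 0.0, 10.0) + 0.05 * np.random.randn(250)
gaze_y = np.zeros(250)
print(np.unique(velocity_threshold_labels(gaze_x, gaze_y, 250), return_counts=True))
```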

Privacy-Preserving Distributed Learning with Secret Gradient Descent

Title Privacy-Preserving Distributed Learning with Secret Gradient Descent
Authors Valentin Hartmann, Robert West
Abstract In many important application domains of machine learning, data is a privacy-sensitive resource. In addition, due to the growing complexity of the models, single actors typically do not have sufficient data to train a model on their own. Motivated by these challenges, we propose Secret Gradient Descent (SecGD), a method for training machine learning models on data that is spread over different clients while preserving the privacy of the training data. We achieve this by letting each client add temporary noise to the information they send to the server during the training process. They also share this noise in separate messages with the server, which can then subtract it from the previously received values. By routing all data through an anonymization network such as Tor, we prevent the server from knowing which messages originate from the same client, which in turn allows us to show that breaking a client’s privacy is computationally intractable as it would require solving a hard instance of the subset sum problem. This setup allows SecGD to work in the presence of only two honest clients and a malicious server, and without the need for peer-to-peer connections.
Tasks
Published 2019-06-27
URL https://arxiv.org/abs/1906.11993v1
PDF https://arxiv.org/pdf/1906.11993v1.pdf
PWC https://paperswithcode.com/paper/privacy-preserving-distributed-learning-with
Repo
Framework
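
A toy NumPy demonstration of the masking idea the abstract describes: each client sends its gradient plus temporary noise and, separately, the noise itself; the server sums everything and subtracts the summed noise, recovering the exact aggregate. The anonymization network that makes the two messages unlinkable, and the subset-sum hardness argument, are not modelled here; values and shapes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def client_messages(gradient, scale=1e3):
    """Split a client's gradient into a masked value and the mask itself."""
    noise = rng.normal(scale=scale, size=gradient.shape)
    return gradient + noise, noise

true_grads = [np.array([0.5, -1.0]), np.array([2.0, 0.25]), np.array([-0.75, 1.5])]
masked, masks = zip(*(client_messages(g) for g in true_grads))

# The server receives all masked gradients and all masks as separate messages,
# sums everything, and subtracts the summed masks to obtain the exact aggregate.
aggregate = np.sum(masked, axis=0) - np.sum(masks, axis=0)
print(np.allclose(aggregate, np.sum(true_grads, axis=0)))   # True
```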

Investigating Task-driven Latent Feasibility for Nonconvex Image Modeling

Title Investigating Task-driven Latent Feasibility for Nonconvex Image Modeling
Authors Risheng Liu, Pan Mu, Jian Chen, Xin Fan, Zhongxuan Luo
Abstract Properly modeling latent image distributions plays a key role in a variety of low-level vision problems. Most existing approaches, such as Maximum A Posteriori (MAP) estimation, aim at establishing optimization models with prior regularization to address this task. However, designing sophisticated priors may lead to challenging optimization models and time-consuming iteration processes. Recent studies have tried to embed learnable network architectures into the MAP scheme. Unfortunately, for MAP models with deeply trained priors, the exact behaviors and the inference process are hard to investigate, due to their inexact and uncontrolled nature. In this work, by investigating task-driven latent feasibility for the MAP-based model, we provide a new perspective for bringing domain knowledge and data distributions into MAP-based image modeling. Specifically, we first introduce an energy-based feasibility constraint to the given MAP model. By applying a proximal gradient updating scheme to the objective and performing an adaptive averaging process, we obtain a completely new MAP inference process, named Proximal Average Optimization (PAO), for image modeling. Owing to the flexibility of PAO, we can also incorporate deeply trained architectures into the feasibility module. Finally, we provide a simple monotone descent-based control mechanism to guide the propagation of PAO. We prove in theory that the sequences generated by both our PAO and its learning-based extension converge to a critical point of the original MAP optimization task. We demonstrate how to apply our framework to different vision applications. Extensive experiments verify the theoretical results and show the advantages of our method over existing state-of-the-art approaches.
Tasks
Published 2019-10-18
URL https://arxiv.org/abs/1910.08242v1
PDF https://arxiv.org/pdf/1910.08242v1.pdf
PWC https://paperswithcode.com/paper/investigating-task-driven-latent-feasibility
Repo
Framework
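
A generic sketch, assuming NumPy, of a proximal-average style update of the flavour the abstract describes: a gradient step on the smooth data term followed by an average of the proximal maps of two nonsmooth terms (think of one as a handcrafted prior and the other as a feasibility constraint). The fixed averaging weight, the L1 prior, and the box projection are illustrative assumptions; the paper's PAO additionally uses an adaptive averaging process, a learned feasibility module, and a monotone descent control, none of which appear here.

```python
import numpy as np

def soft_threshold(v, lam):
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def proximal_average_step(x, grad_f, prox_g, prox_h, step, alpha=0.5):
    """Gradient step on the smooth term, then average two proximal maps."""
    y = x - step * grad_f(x)
    return alpha * prox_g(y, step) + (1 - alpha) * prox_h(y, step)

# Toy example: denoise a vector with an L1 prior averaged with a box "feasibility" projection.
b = np.array([3.0, -0.2, 1.5, -4.0])
x = np.zeros_like(b)
for _ in range(100):
    x = proximal_average_step(
        x,
        grad_f=lambda z: z - b,                          # gradient of 0.5 * ||z - b||^2
        prox_g=lambda z, t: soft_threshold(z, 0.5 * t),  # prox of 0.5 * ||z||_1
        prox_h=lambda z, t: np.clip(z, -2.0, 2.0),       # projection onto the box [-2, 2]
        step=0.5,
    )
print(x)
```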

On the Influence of Bias-Correction on Distributed Stochastic Optimization

Title On the Influence of Bias-Correction on Distributed Stochastic Optimization
Authors Kun Yuan, Sulaiman A. Alghunaim, Bicheng Ying, Ali H. Sayed
Abstract Various bias-correction methods, such as EXTRA, gradient tracking methods, and exact diffusion, have been proposed recently to solve distributed {\em deterministic} optimization problems. These methods employ constant step-sizes and converge linearly to the {\em exact} solution under proper conditions. However, their performance under stochastic and adaptive settings is less explored. It is still unknown {\em whether}, {\em when} and {\em why} these bias-correction methods can outperform their traditional counterparts (such as consensus and diffusion) with noisy gradients and constant step-sizes. This work studies the performance of exact diffusion under the stochastic and adaptive setting, and provides conditions under which exact diffusion has superior steady-state mean-square-deviation (MSD) performance compared to traditional algorithms without bias-correction. In particular, it is proven that this superiority is more evident over sparsely connected network topologies such as lines, cycles, or grids. Conditions are also provided under which the exact diffusion method merely matches, or may even fall below, the performance of traditional methods. Simulations are provided to validate the theoretical findings.
Tasks Stochastic Optimization
Published 2019-03-26
URL https://arxiv.org/abs/1903.10956v2
PDF https://arxiv.org/pdf/1903.10956v2.pdf
PWC https://paperswithcode.com/paper/on-the-performance-of-exact-diffusion-over
Repo
Framework
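
A small NumPy sketch of the adapt-correct-combine recursion commonly used for exact diffusion, run on a toy quadratic consensus problem. The step size, iteration count, and ring-like combination matrix are illustrative assumptions, and the gradients here are deterministic; the paper's analysis concerns the stochastic-gradient setting.

```python
import numpy as np

def exact_diffusion(grads, A, w0, mu=0.1, iters=200):
    """grads[i](w) returns agent i's gradient; A is a doubly-stochastic combination matrix."""
    n = len(grads)
    A_bar = (A + np.eye(n)) / 2.0                # exact diffusion combines with (A + I)/2
    w = np.array([w0.copy() for _ in range(n)])
    psi_prev = w.copy()
    for _ in range(iters):
        psi = np.array([w[i] - mu * grads[i](w[i]) for i in range(n)])                 # adapt
        phi = psi + w - psi_prev                                                       # correct
        w = np.array([sum(A_bar[j, i] * phi[j] for j in range(n)) for i in range(n)])  # combine
        psi_prev = psi
    return w

# Three agents minimise the sum of 0.5 * (w - c_i)^2; every agent should reach the mean of c.
c = np.array([1.0, 2.0, 6.0])
A = np.array([[0.5, 0.25, 0.25], [0.25, 0.5, 0.25], [0.25, 0.25, 0.5]])
w_final = exact_diffusion([lambda w, ci=ci: w - ci for ci in c], A, np.zeros(1))
print(w_final.ravel(), c.mean())
```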

Some compact notations for concentration inequalities and user-friendly results

Title Some compact notations for concentration inequalities and user-friendly results
Authors Kaizheng Wang
Abstract This paper presents compact notations for concentration inequalities and convenient results to streamline probabilistic analysis. The new expressions describe the typical sizes and tails of random variables, allowing for simple operations without heavy use of inessential constants. They bridge classical asymptotic notations and modern non-asymptotic tail bounds together. Examples of different kinds demonstrate their efficacy.
Tasks
Published 2019-12-31
URL https://arxiv.org/abs/1912.13463v1
PDF https://arxiv.org/pdf/1912.13463v1.pdf
PWC https://paperswithcode.com/paper/some-compact-notations-for-concentration
Repo
Framework
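
For context, a standard sub-Gaussian tail bound is the kind of non-asymptotic statement such compact notations are designed to compress; the paper's own notation is not reproduced here, and the abbreviation shown is only an informal illustration of the asymptotic-style shorthand it bridges toward.

```latex
% A standard sub-Gaussian concentration inequality.
\[
  \mathbb{P}\bigl(|X - \mathbb{E}X| \ge t\bigr) \;\le\; 2\exp\!\Bigl(-\tfrac{t^{2}}{2\sigma^{2}}\Bigr)
  \qquad \text{for all } t \ge 0,
\]
% which one might informally abbreviate, in asymptotic spirit, as
% "X - E X = O_P(sigma) with a Gaussian-type tail."
```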

Identification of Interaction Clusters Using a Semi-supervised Hierarchical Clustering Method

Title Identification of Interaction Clusters Using a Semi-supervised Hierarchical Clustering Method
Authors Yu Chen, Yuanyuan Yang, Yaochu Jin, Xiufen Zou
Abstract Motivation: Identifying interaction clusters in large gene regulatory networks (GRNs) is critical for their further investigation, yet the task is very challenging owing to noise in experimental data, the large scale of GRNs, and inconsistencies between gene expression profiles and functional modules. Semi-supervising this process with prior information is promising, but a shortage of prior information can itself make it difficult. Moreover, it is often hard, and sometimes impossible, to obtain a gold standard for evaluating clustering results. Results: With the assistance of an online enrichment tool, this research proposes a semi-supervised hierarchical clustering method via a deconvolved correlation matrix (SHC-DC) to discover interaction clusters of large-scale GRNs. Three benchmark networks, including an \emph{E. coli} network and two \emph{Yeast} networks, are employed to test the semi-supervision scheme of the proposed method. Then, SHC-DC is utilized to cluster genes in a sleep study. The results demonstrate that it can find interaction modules that are generally enriched in various signaling pathways. Besides the significant influence on blood levels of interleukins, the impact of sleep on important pathways mediated by them is also validated by the discovered interaction modules.
Tasks
Published 2019-10-20
URL https://arxiv.org/abs/1910.08864v1
PDF https://arxiv.org/pdf/1910.08864v1.pdf
PWC https://paperswithcode.com/paper/identification-of-interaction-clusters-using
Repo
Framework
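
A sketch, assuming NumPy and SciPy, of the unsupervised core that an SHC-DC-like pipeline plausibly rests on: compute a gene correlation matrix, apply closed-form network deconvolution to suppress indirect correlations, and hierarchically cluster the result. The deconvolution formula used is the standard C_dir = C_obs (I + C_obs)^{-1}; the semi-supervision via enrichment priors, and the paper's exact choices, are not reproduced.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def deconvolve(corr):
    """Closed-form network deconvolution: remove indirect (transitive) correlation effects."""
    n = corr.shape[0]
    return corr @ np.linalg.inv(np.eye(n) + corr)

def cluster_genes(expr, n_clusters=3):
    """Correlate, deconvolve, then hierarchically cluster (semi-supervision omitted)."""
    corr = np.corrcoef(expr)                        # genes x genes correlation
    direct = deconvolve(corr)
    dist = 1.0 - (direct + direct.T) / 2.0          # symmetrise and turn into a dissimilarity
    np.fill_diagonal(dist, 0.0)
    Z = linkage(squareform(dist, checks=False), method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")

rng = np.random.default_rng(2)
expr = rng.normal(size=(30, 100))                   # 30 genes, 100 samples (synthetic)
print(cluster_genes(expr))
```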

Sit-to-Stand Analysis in the Wild using Silhouettes for Longitudinal Health Monitoring

Title Sit-to-Stand Analysis in the Wild using Silhouettes for Longitudinal Health Monitoring
Authors Alessandro Masullo, Tilo Burghardt, Toby Perrett, Dima Damen, Majid Mirmehdi
Abstract We present the first fully automated Sit-to-Stand or Stand-to-Sit (StS) analysis framework for long-term monitoring of patients in free-living environments using video silhouettes. Our method adopts a coarse-to-fine time localisation approach, where a deep learning classifier identifies possible StS sequences from silhouettes, and a smart peak detection stage provides fine localisation based on 3D bounding boxes. We tested our method on data from real homes of participants and monitored patients undergoing total hip or knee replacement. Our results show 94.4% overall accuracy in the coarse localisation and an error of 0.026 m/s in the speed of ascent measurement, highlighting important trends in the recuperation of patients who underwent surgery.
Tasks
Published 2019-10-03
URL https://arxiv.org/abs/1910.01370v1
PDF https://arxiv.org/pdf/1910.01370v1.pdf
PWC https://paperswithcode.com/paper/sit-to-stand-analysis-in-the-wild-using
Repo
Framework
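
A small sketch, assuming NumPy and SciPy, of the fine-localisation flavour of measurement the abstract mentions: given the height of the top of a person's bounding box over a coarsely localised StS window, estimate the speed of ascent as the peak vertical velocity. The silhouette CNN stage, the 3D bounding-box construction, and the threshold value are not from the paper; the logistic height curve is synthetic.

```python
import numpy as np
from scipy.signal import find_peaks

def ascent_speed(top_y_metres, fps):
    """Peak upward velocity of the bounding-box top within an StS window."""
    velocity = np.gradient(top_y_metres) * fps        # metres per second
    peaks, _ = find_peaks(velocity, height=0.05)      # rising phases only (illustrative threshold)
    return velocity[peaks].max() if len(peaks) else 0.0

t = np.linspace(0, 2, 60)                             # a 2-second sit-to-stand at ~30 fps
height = 0.9 + 0.4 / (1 + np.exp(-6 * (t - 1)))       # smooth rise from 0.9 m to 1.3 m
print(round(ascent_speed(height, fps=30), 3))         # approximate peak speed in m/s
```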

Attention Is All You Need for Chinese Word Segmentation

Title Attention Is All You Need for Chinese Word Segmentation
Authors Sufeng Duan, Hai Zhao
Abstract This paper presents a fast and accurate Chinese word segmentation (CWS) model that uses only unigram features and a greedy decoding algorithm. Our model relies solely on the attention mechanism for building its network blocks. In detail, we adopt a Transformer-based encoder, empowered by the self-attention mechanism, as the backbone to encode the input representation. We then extend the Transformer encoder with our proposed Gaussian-masked directional multi-head attention, a variant of scaled dot-product attention. Finally, a bi-affinal attention scorer makes segmentation decisions in linear time. Our model is evaluated on the SIGHAN Bakeoff benchmark datasets. The experimental results show that, with the highest segmentation speed, the proposed attention-only model achieves new state-of-the-art or comparable performance against strong baselines under the closed test setting.
Tasks Chinese Word Segmentation
Published 2019-10-31
URL https://arxiv.org/abs/1910.14537v1
PDF https://arxiv.org/pdf/1910.14537v1.pdf
PWC https://paperswithcode.com/paper/attention-is-all-you-need-for-chinese-word
Repo
Framework
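
A rough illustration, assuming PyTorch, of the "Gaussian-masked" idea: scaled dot-product attention with a Gaussian locality bias added to the scores so that attention between positions i and j decays as |i - j| grows. The paper's directional, multi-head variant differs in detail; the width parameter and tensor shapes here are assumptions.

```python
import torch
import torch.nn.functional as F

def gaussian_masked_attention(q, k, v, sigma=2.0):
    """Scaled dot-product attention with a Gaussian locality bias on the scores."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5        # (batch, len, len)
    n = q.size(-2)
    pos = torch.arange(n, dtype=q.dtype, device=q.device)
    dist = (pos.unsqueeze(0) - pos.unsqueeze(1)) ** 2  # squared distance |i - j|^2
    gauss = -dist / (2 * sigma ** 2)                   # log-domain Gaussian mask
    return F.softmax(scores + gauss, dim=-1) @ v

q = k = v = torch.randn(1, 8, 16)                      # batch of one 8-character sentence
out = gaussian_masked_attention(q, k, v)
print(out.shape)                                       # torch.Size([1, 8, 16])
```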