February 1, 2020

3280 words 16 mins read

Paper Group AWR 344

LGLMF: Local Geographical based Logistic Matrix Factorization Model for POI Recommendation. TabFact: A Large-scale Dataset for Table-based Fact Verification. U-Time: A Fully Convolutional Network for Time Series Segmentation Applied to Sleep Staging. Background Suppression Network for Weakly-supervised Temporal Action Localization. Relational Word …

LGLMF: Local Geographical based Logistic Matrix Factorization Model for POI Recommendation

Title LGLMF: Local Geographical based Logistic Matrix Factorization Model for POI Recommendation
Authors Hossein A. Rahmani, Mohammad Aliannejadi, Sajad Ahmadian, Mitra Baratchi, Mohsen Afsharchi, Fabio Crestani
Abstract With the rapid growth of Location-Based Social Networks, personalized Points of Interest (POIs) recommendation has become a critical task to help users explore their surroundings. Due to the scarcity of check-in data, the availability of geographical information offers an opportunity to improve the accuracy of POI recommendation. Moreover, matrix factorization methods provide effective models which can be used in POI recommendation. However, there are two main challenges which should be addressed to improve the performance of POI recommendation methods. First, leveraging geographical information to capture both the user’s personal geographic profile and a location’s geographic popularity. Second, incorporating the geographical model into the matrix factorization approaches. To address these problems, a POI recommendation method is proposed in this paper based on a Local Geographical Model, which considers both users’ and locations’ points of view. To this end, an effective geographical model is proposed by considering the user’s main region of activity and the relevance of each location within that region. Then, the proposed local geographical model is fused into the Logistic Matrix Factorization to improve the accuracy of POI recommendation. Experimental results on two well-known datasets demonstrate that the proposed approach outperforms other state-of-the-art POI recommendation methods.
Tasks
Published 2019-09-14
URL https://arxiv.org/abs/1909.06667v1
PDF https://arxiv.org/pdf/1909.06667v1.pdf
PWC https://paperswithcode.com/paper/lglmf-local-geographical-based-logistic
Repo https://github.com/rahmanidashti/LGLMF
Framework none
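
For intuition only, here is a minimal sketch of the fusion idea: a logistic matrix factorization preference probability combined with a toy local geographical weight. The weight function, the convex-mixture fusion rule, and all parameter names are illustrative assumptions, not the paper's exact formulation (see the linked repo for the real model).

```python
import numpy as np

def lmf_score(x_u, y_i, b_u, b_i):
    """Logistic MF preference probability: sigma(x_u . y_i + b_u + b_i)."""
    return 1.0 / (1.0 + np.exp(-(x_u @ y_i + b_u + b_i)))

def local_geo_weight(poi_xy, region_center, radius_km, region_popularity):
    """Toy local geographical weight: relevance of a POI inside the user's main
    activity region, decaying with distance from the region center (illustrative only)."""
    dist = np.linalg.norm(np.asarray(poi_xy) - np.asarray(region_center))
    if dist > radius_km:
        return 0.0
    return (1.0 - dist / radius_km) * region_popularity

def lglmf_score(x_u, y_i, b_u, b_i, geo_weight, alpha=0.5):
    """Fuse the MF probability with the geographical weight; this convex mixture
    is an assumed fusion rule, not the paper's exact formula."""
    return alpha * lmf_score(x_u, y_i, b_u, b_i) + (1.0 - alpha) * geo_weight

x_u, y_i = np.random.rand(8), np.random.rand(8)
geo = local_geo_weight((2.0, 1.0), (0.0, 0.0), radius_km=5.0, region_popularity=0.4)
print(lglmf_score(x_u, y_i, 0.0, 0.0, geo))
```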

TabFact: A Large-scale Dataset for Table-based Fact Verification

Title TabFact: A Large-scale Dataset for Table-based Fact Verification
Authors Wenhu Chen, Hongmin Wang, Jianshu Chen, Yunkai Zhang, Hong Wang, Shiyang Li, Xiyou Zhou, William Yang Wang
Abstract The problem of verifying whether a textual hypothesis holds based on the given evidence, also known as fact verification, plays an important role in the study of natural language understanding and semantic representation. However, existing studies are mainly restricted to dealing with unstructured evidence (e.g., natural language sentences and documents, news, etc.), while verification under structured evidence, such as tables, graphs, and databases, remains under-explored. This paper specifically aims to study the fact verification given semi-structured data as evidence. To this end, we construct a large-scale dataset called TabFact with 16k Wikipedia tables as the evidence for 118k human-annotated natural language statements, which are labeled as either ENTAILED or REFUTED. TabFact is challenging since it involves both soft linguistic reasoning and hard symbolic reasoning. To address these reasoning challenges, we design two different models: Table-BERT and Latent Program Algorithm (LPA). Table-BERT leverages the state-of-the-art pre-trained language model to encode the linearized tables and statements into continuous vectors for verification. LPA parses statements into programs and executes them against the tables to obtain the returned binary value for verification. Both methods achieve similar accuracy but still lag far behind human performance. We also perform a comprehensive analysis to demonstrate great future opportunities. The data and code of the dataset are provided at https://github.com/wenhuchen/Table-Fact-Checking.
Tasks Language Modelling, Table-based Fact Verification
Published 2019-09-05
URL https://arxiv.org/abs/1909.02164v4
PDF https://arxiv.org/pdf/1909.02164v4.pdf
PWC https://paperswithcode.com/paper/tabfact-a-large-scale-dataset-for-table-based
Repo https://github.com/wenhuchen/Table-Fact-Checking
Framework pytorch
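
As a rough illustration of the Table-BERT side only (not the authors' linearization templates or training setup), the sketch below flattens a toy table into text, pairs it with a statement, and scores the pair with an untrained BERT sequence classifier from Hugging Face transformers.

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

def linearize(header, rows):
    """Naive row-wise linearization; the paper studies more careful templates."""
    cells = []
    for row in rows:
        cells.append(" ; ".join(f"{h} is {v}" for h, v in zip(header, row)))
    return " . ".join(cells)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

table_text = linearize(["player", "goals"], [["messi", "51"], ["ronaldo", "48"]])
statement = "messi scored more goals than ronaldo"

inputs = tokenizer(table_text, statement, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits          # 2 classes: REFUTED / ENTAILED
print(logits.softmax(-1))                    # untrained head, so these scores are meaningless
```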

U-Time: A Fully Convolutional Network for Time Series Segmentation Applied to Sleep Staging

Title U-Time: A Fully Convolutional Network for Time Series Segmentation Applied to Sleep Staging
Authors Mathias Perslev, Michael Hejselbak Jensen, Sune Darkner, Poul Jørgen Jennum, Christian Igel
Abstract Neural networks are becoming more and more popular for the analysis of physiological time-series. The most successful deep learning systems in this domain combine convolutional and recurrent layers to extract useful features and model temporal relations. Unfortunately, these recurrent models are difficult to tune and optimize. In our experience, they often require task-specific modifications, which makes them challenging to use for non-experts. We propose U-Time, a fully feed-forward deep learning approach to physiological time series segmentation developed for the analysis of sleep data. U-Time is a temporal fully convolutional network based on the U-Net architecture that was originally proposed for image segmentation. U-Time maps sequential inputs of arbitrary length to sequences of class labels on a freely chosen temporal scale. This is done by implicitly classifying every individual time-point of the input signal and aggregating these classifications over fixed intervals to form the final predictions. We evaluated U-Time for sleep stage classification on a large collection of sleep electroencephalography (EEG) datasets. In all cases, we found that U-Time reaches or outperforms current state-of-the-art deep learning models while being much more robust in the training process and without requiring architecture or hyperparameter adaptation across tasks.
Tasks EEG, Semantic Segmentation, Time Series
Published 2019-10-24
URL https://arxiv.org/abs/1910.11162v1
PDF https://arxiv.org/pdf/1910.11162v1.pdf
PWC https://paperswithcode.com/paper/u-time-a-fully-convolutional-network-for-time
Repo https://github.com/perslev/U-Time
Framework none
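
The per-time-point classification plus fixed-interval aggregation can be illustrated with a deliberately tiny 1D encoder-decoder. The real U-Time is a deeper multi-scale U-Net with skip connections, so treat this only as a shape-level sketch.

```python
import torch
import torch.nn as nn

class TinyUTime(nn.Module):
    """Heavily simplified 1D encoder-decoder in the spirit of U-Time (one scale only;
    the actual model is a multi-scale U-Net with skip connections)."""
    def __init__(self, in_ch=1, n_classes=5):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv1d(in_ch, 16, 9, padding=4), nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(16, 32, 9, padding=4), nn.ReLU(),
        )
        self.decode = nn.Sequential(
            nn.Upsample(scale_factor=4),
            nn.Conv1d(32, 16, 9, padding=4), nn.ReLU(),
            nn.Conv1d(16, n_classes, 1),                 # dense per-time-point class scores
        )

    def forward(self, x, points_per_epoch=3000):          # e.g. 30 s epochs at 100 Hz
        dense = self.decode(self.encode(x))                # (B, C, T)
        # aggregate point-wise scores over fixed intervals to get per-epoch predictions
        return nn.functional.avg_pool1d(dense, points_per_epoch)

x = torch.randn(2, 1, 30000)                               # batch of 2, ten 30 s EEG epochs at 100 Hz
print(TinyUTime()(x).shape)                                # torch.Size([2, 5, 10])
```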

Background Suppression Network for Weakly-supervised Temporal Action Localization

Title Background Suppression Network for Weakly-supervised Temporal Action Localization
Authors Pilhyeon Lee, Youngjung Uh, Hyeran Byun
Abstract Weakly-supervised temporal action localization is a very challenging problem because frame-wise labels are not given in the training stage while the only hint is video-level labels: whether each video contains action frames of interest. Previous methods aggregate frame-level class scores to produce video-level prediction and learn from video-level action labels. This formulation does not fully model the problem in that background frames are forced to be misclassified as action classes to predict video-level labels accurately. In this paper, we design Background Suppression Network (BaS-Net) which introduces an auxiliary class for background and has a two-branch weight-sharing architecture with an asymmetrical training strategy. This enables BaS-Net to suppress activations from background frames to improve localization performance. Extensive experiments demonstrate the effectiveness of BaS-Net and its superiority over the state-of-the-art methods on the most popular benchmarks - THUMOS’14 and ActivityNet. Our code and the trained model are available at https://github.com/Pilhyeon/BaSNet-pytorch.
Tasks Action Localization, Temporal Action Localization, Weakly Supervised Action Localization, Weakly-supervised Temporal Action Localization
Published 2019-11-22
URL https://arxiv.org/abs/1911.09963v1
PDF https://arxiv.org/pdf/1911.09963v1.pdf
PWC https://paperswithcode.com/paper/background-suppression-network-for-weakly
Repo https://github.com/Pilhyeon/BaSNet-pytorch
Framework pytorch
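
A schematic reading of the two-branch, weight-sharing design with an auxiliary background class might look as follows; the snippet features, top-k pooling, and loss details are simplified assumptions rather than the paper's exact recipe.

```python
import torch
import torch.nn as nn

class BaSNetSketch(nn.Module):
    """Schematic two-branch, weight-sharing classifier with an auxiliary background
    class (the last class index). Only loosely follows the paper."""
    def __init__(self, feat_dim=2048, n_actions=20):
        super().__init__()
        self.filter = nn.Sequential(nn.Conv1d(feat_dim, 1, 1), nn.Sigmoid())   # foreground attention
        self.classifier = nn.Conv1d(feat_dim, n_actions + 1, 1)                # +1 = background class

    def forward(self, feats):                      # feats: (B, D, T) snippet features
        attn = self.filter(feats)                  # (B, 1, T)
        base_cas = self.classifier(feats)          # branch 1: raw features
        supp_cas = self.classifier(feats * attn)   # branch 2: background-suppressed features
        k = max(1, feats.shape[-1] // 8)           # video-level score: mean of top-k activations
        base_video = base_cas.topk(k, dim=-1).values.mean(-1)
        supp_video = supp_cas.topk(k, dim=-1).values.mean(-1)
        return base_video, supp_video, attn

model = BaSNetSketch()
base_v, supp_v, attn = model(torch.randn(2, 2048, 100))
print(base_v.shape, supp_v.shape)                  # torch.Size([2, 21]) torch.Size([2, 21])
```

In this reading, the asymmetric training would supervise the base branch with the background label switched on and the suppressed branch with it switched off, which is what pushes the filter to damp background frames.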

Relational Word Embeddings

Title Relational Word Embeddings
Authors Jose Camacho-Collados, Luis Espinosa-Anke, Steven Schockaert
Abstract While word embeddings have been shown to implicitly encode various forms of attributional knowledge, the extent to which they capture relational information is far more limited. In previous work, this limitation has been addressed by incorporating relational knowledge from external knowledge bases when learning the word embedding. Such strategies may not be optimal, however, as they are limited by the coverage of available resources and conflate similarity with other forms of relatedness. As an alternative, in this paper we propose to encode relational knowledge in a separate word embedding, which is intended to be complementary to a given standard word embedding. This relational word embedding is still learned from co-occurrence statistics, and can thus be used even when no external knowledge base is available. Our analysis shows that relational word vectors do indeed capture information that is complementary to what is encoded in standard word embeddings.
Tasks Word Embeddings
Published 2019-06-04
URL https://arxiv.org/abs/1906.01373v1
PDF https://arxiv.org/pdf/1906.01373v1.pdf
PWC https://paperswithcode.com/paper/relational-word-embeddings
Repo https://github.com/pedrada88/rwe
Framework pytorch
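
As a crude, purely illustrative stand-in for deriving relational information from co-occurrence statistics (the paper learns dense relational word vectors from refined pair-level statistics), one can collect, for each word pair, the other words that co-occur with it in the same sentences.

```python
from collections import Counter
from itertools import combinations

def pair_context_vectors(sentences):
    """Crude co-occurrence-based relation representation: for every word pair, count
    the other words appearing in sentences that mention both members of the pair."""
    pair_ctx = {}
    for sent in sentences:
        tokens = sent.lower().split()
        for a, b in combinations(sorted(set(tokens)), 2):
            ctx = [w for w in tokens if w not in (a, b)]
            pair_ctx.setdefault((a, b), Counter()).update(ctx)
    return pair_ctx

sents = ["paris is the capital of france", "berlin is the capital of germany"]
vectors = pair_context_vectors(sents)
print(vectors[("france", "paris")])   # Counter({'is': 1, 'the': 1, 'capital': 1, 'of': 1})
```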

Globally Convergent Newton Methods for Ill-conditioned Generalized Self-concordant Losses

Title Globally Convergent Newton Methods for Ill-conditioned Generalized Self-concordant Losses
Authors Ulysse Marteau-Ferey, Francis Bach, Alessandro Rudi
Abstract In this paper, we study large-scale convex optimization algorithms based on the Newton method applied to regularized generalized self-concordant losses, which include logistic regression and softmax regression. We first prove that our new simple scheme, based on a sequence of problems with decreasing regularization parameters, is globally convergent and that this convergence is linear with a constant factor which scales only logarithmically with the condition number. In the parametric setting, we obtain an algorithm with the same scaling as regular first-order methods but with improved behavior, in particular on ill-conditioned problems. Second, in the non-parametric machine learning setting, we provide an explicit algorithm combining the previous scheme with Nyström projection techniques, and prove that it achieves optimal generalization bounds with a time complexity of order O(n·df_λ), a memory complexity of order O(df_λ²), and no dependence on the condition number, generalizing the results known for least-squares regression. Here n is the number of observations and df_λ is the associated number of degrees of freedom. In particular, this is the first large-scale algorithm to solve logistic and softmax regressions in the non-parametric setting with large condition numbers and theoretical guarantees.
Tasks
Published 2019-07-03
URL https://arxiv.org/abs/1907.01771v2
PDF https://arxiv.org/pdf/1907.01771v2.pdf
PWC https://paperswithcode.com/paper/globally-convergent-newton-methods-for-ill
Repo https://github.com/umarteau/Newton-Method-for-GSC-losses-
Framework pytorch
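
A minimal NumPy sketch of the core idea, warm-started Newton steps on a sequence of logistic-regression problems with geometrically decreasing regularization, is given below. The step counts, the decrease factor, and the absence of the paper's Nyström projection are all simplifying assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def newton_logreg(X, y, lam, w0, steps=5):
    """Plain (undamped) Newton steps on the l2-regularized logistic loss, y in {0, 1}."""
    w = w0.copy()
    n, d = X.shape
    for _ in range(steps):
        p = sigmoid(X @ w)
        grad = X.T @ (p - y) / n + lam * w
        H = (X.T * (p * (1 - p))) @ X / n + lam * np.eye(d)
        w -= np.linalg.solve(H, grad)
    return w

def decreasing_lambda_path(X, y, lam_target, lam0=1.0, factor=0.5):
    """Warm-started sequence of problems with geometrically decreasing regularization,
    in the spirit of the paper's globally convergent scheme (simplified)."""
    w = np.zeros(X.shape[1])
    lam = lam0
    while lam > lam_target:
        w = newton_logreg(X, y, lam, w)
        lam *= factor
    return newton_logreg(X, y, lam_target, w)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X @ np.array([1., -2., 0.5, 0., 3.]) + 0.1 * rng.normal(size=200) > 0).astype(float)
print(decreasing_lambda_path(X, y, lam_target=1e-4))
```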

Denoising and Regularization via Exploiting the Structural Bias of Convolutional Generators

Title Denoising and Regularization via Exploiting the Structural Bias of Convolutional Generators
Authors Reinhard Heckel, Mahdi Soltanolkotabi
Abstract Convolutional Neural Networks (CNNs) have emerged as highly successful tools for image generation, recovery, and restoration. A major contributing factor to this success is that convolutional networks impose strong prior assumptions about natural images. A surprising experiment that highlights this architectural bias towards natural images is that one can remove noise and corruptions from a natural image without using any training data, by simply fitting (via gradient descent) a randomly initialized, over-parameterized convolutional generator to the corrupted image. While this over-parameterized network can fit the corrupted image perfectly, surprisingly after a few iterations of gradient descent it generates an almost uncorrupted image. This intriguing phenomenon enables state-of-the-art CNN-based denoising and regularization of other inverse problems. In this paper, we attribute this effect to a particular architectural choice of convolutional networks, namely convolutions with fixed interpolating filters. We then formally characterize the dynamics of fitting a two-layer convolutional generator to a noisy signal and prove that early-stopped gradient descent denoises/regularizes. Our proof relies on showing that convolutional generators fit the structured part of an image significantly faster than the corrupted portion.
Tasks Denoising, Image Generation
Published 2019-10-31
URL https://arxiv.org/abs/1910.14634v2
PDF https://arxiv.org/pdf/1910.14634v2.pdf
PWC https://paperswithcode.com/paper/denoising-and-regularization-via-exploiting
Repo https://github.com/MLI-lab/overparameterized_convolutional_generators
Framework pytorch
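
The experiment the paper analyzes can be sketched in a few lines of PyTorch: fit a small convolutional generator with fixed (bilinear) upsampling to a single noisy image and stop early. The architecture sizes, optimizer, and step count here are arbitrary choices, not the paper's configuration.

```python
import torch
import torch.nn as nn

# Small convolutional generator with fixed bilinear upsampling (interpolating filters),
# fit to a single noisy image -- a rough stand-in for the setting analyzed in the paper.
gen = nn.Sequential(
    nn.Conv2d(64, 64, 1), nn.ReLU(), nn.Upsample(scale_factor=2, mode="bilinear"),
    nn.Conv2d(64, 64, 1), nn.ReLU(), nn.Upsample(scale_factor=2, mode="bilinear"),
    nn.Conv2d(64, 3, 1), nn.Sigmoid(),
)

z = torch.randn(1, 64, 16, 16)            # fixed random input
clean = torch.rand(1, 3, 64, 64)          # stand-in for a natural image
noisy = clean + 0.1 * torch.randn_like(clean)

opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
for step in range(500):                   # early stopping: do NOT fit to convergence
    loss = ((gen(z) - noisy) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

denoised = gen(z).detach()                # early-stopped output tends to be close to the clean image
```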

HalluciNet-ing Spatiotemporal Representations Using 2D-CNN

Title HalluciNet-ing Spatiotemporal Representations Using 2D-CNN
Authors Paritosh Parmar, Brendan Morris
Abstract Spatiotemporal representations learnt using 3D convolutional neural networks (CNN) are currently the state-of-the-art approaches for action related tasks. However, 3D-CNNs are notorious for being memory- and compute-intensive. 2D-CNNs, on the other hand, are much lighter in their computing resource requirements, and are faster. However, 2D-CNN performance on action related tasks is generally inferior to that of 3D-CNNs. Taking inspiration from the fact that we, humans, can intuit how the actors will act and objects will be manipulated through years of experience and a general understanding of how the world works, we suggest a way to combine the best attributes of 2D- and 3D-CNNs – we propose to hallucinate spatiotemporal representations as computed by a 3D-CNN, using a 2D-CNN. We believe that requiring the 2D-CNN to “see” into the future would encourage it to gain a deeper understanding of actions and of how scenes evolve, by providing a stronger supervisory signal. The hallucination task is treated as an auxiliary task, while the main task is any other action related task, such as action recognition. Thorough experimental evaluation shows that the hallucination task indeed helps improve performance on action recognition, action quality assessment, and dynamic scene recognition. From a practical standpoint, being able to hallucinate spatiotemporal representations without an actual 3D-CNN can enable deployment in resource-constrained scenarios such as limited compute power and/or lower bandwidth. Codebase is available here: https://github.com/ParitoshParmar/HalluciNet.
Tasks Scene Recognition
Published 2019-12-10
URL https://arxiv.org/abs/1912.04430v2
PDF https://arxiv.org/pdf/1912.04430v2.pdf
PWC https://paperswithcode.com/paper/hallucinet-ing-spatiotemporal-representations
Repo https://github.com/ParitoshParmar/HalluciNet
Framework none
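
A minimal sketch of the auxiliary hallucination loss, assuming the 3D-CNN teacher features are precomputed, could look like the following. The backbone, feature sizes, and loss weight are placeholders.

```python
import torch
import torch.nn as nn

class HalluciNetSketch(nn.Module):
    """2D backbone with two heads: the main action classifier and a 'hallucination'
    head regressing the spatiotemporal features a 3D-CNN would have produced."""
    def __init__(self, feat_2d=512, feat_3d=1024, n_classes=10):
        super().__init__()
        self.backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, feat_2d), nn.ReLU())
        self.action_head = nn.Linear(feat_2d, n_classes)
        self.halluc_head = nn.Linear(feat_2d, feat_3d)

    def forward(self, frame):
        h = self.backbone(frame)
        return self.action_head(h), self.halluc_head(h)

model = HalluciNetSketch()
frame = torch.randn(4, 3, 64, 64)                 # single frames, batch of 4
labels = torch.randint(0, 10, (4,))
feats_3d = torch.randn(4, 1024)                   # teacher features from a pretrained 3D-CNN

logits, halluc = model(frame)
loss = nn.functional.cross_entropy(logits, labels) + 0.5 * nn.functional.mse_loss(halluc, feats_3d)
loss.backward()                                   # joint training of main and auxiliary tasks
```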

Environment Probing Interaction Policies

Title Environment Probing Interaction Policies
Authors Wenxuan Zhou, Lerrel Pinto, Abhinav Gupta
Abstract A key challenge in reinforcement learning (RL) is environment generalization: a policy trained to solve a task in one environment often fails to solve the same task in a slightly different test environment. A common approach to improve inter-environment transfer is to learn policies that are invariant to the distribution of testing environments. However, we argue that instead of being invariant, the policy should identify the specific nuances of an environment and exploit them to achieve better performance. In this work, we propose the ‘Environment-Probing’ Interaction (EPI) policy, a policy that probes a new environment to extract an implicit understanding of that environment’s behavior. Once this environment-specific information is obtained, it is used as an additional input to a task-specific policy that can now perform environment-conditioned actions to solve a task. To learn these EPI-policies, we present a reward function based on transition predictability. Specifically, a higher reward is given if the trajectory generated by the EPI-policy can be used to better predict transitions. We experimentally show that EPI-conditioned task-specific policies significantly outperform commonly used policy generalization methods on novel testing environments.
Tasks
Published 2019-07-26
URL https://arxiv.org/abs/1907.11740v1
PDF https://arxiv.org/pdf/1907.11740v1.pdf
PWC https://paperswithcode.com/paper/environment-probing-interaction-policies-1
Repo https://github.com/Wenxuan-Zhou/EPI
Framework tf
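
One simplified reading of the predictability-based reward is the prediction improvement that conditioning on the EPI embedding buys over an unconditioned transition model. The sketch below uses untrained placeholder networks and made-up dimensions.

```python
import torch
import torch.nn as nn

# Two transition predictors: one conditioned on the EPI embedding z, one not.
# The EPI-policy's reward is the prediction improvement the embedding buys
# (a simplified reading of the paper's predictability-based reward).
plain = nn.Sequential(nn.Linear(8 + 2, 64), nn.ReLU(), nn.Linear(64, 8))             # (s, a) -> s'
conditioned = nn.Sequential(nn.Linear(8 + 2 + 16, 64), nn.ReLU(), nn.Linear(64, 8))  # (s, a, z) -> s'

def epi_reward(s, a, s_next, z):
    err_plain = ((plain(torch.cat([s, a], -1)) - s_next) ** 2).mean()
    err_cond = ((conditioned(torch.cat([s, a, z], -1)) - s_next) ** 2).mean()
    return (err_plain - err_cond).item()   # positive if the probing trajectory was informative

s, a, s_next = torch.randn(32, 8), torch.randn(32, 2), torch.randn(32, 8)
z = torch.randn(32, 16)                     # embedding of the probing trajectory
print(epi_reward(s, a, s_next, z))
```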

Unsupervised Cross-lingual Representation Learning at Scale

Title Unsupervised Cross-lingual Representation Learning at Scale
Authors Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, Veselin Stoyanov
Abstract This paper shows that pretraining multilingual language models at scale leads to significant performance gains for a wide range of cross-lingual transfer tasks. We train a Transformer-based masked language model on one hundred languages, using more than two terabytes of filtered CommonCrawl data. Our model, dubbed XLM-R, significantly outperforms multilingual BERT (mBERT) on a variety of cross-lingual benchmarks, including +13.8% average accuracy on XNLI, +12.3% average F1 score on MLQA, and +2.1% average F1 score on NER. XLM-R performs particularly well on low-resource languages, improving 11.8% in XNLI accuracy for Swahili and 9.2% for Urdu over the previous XLM model. We also present a detailed empirical evaluation of the key factors that are required to achieve these gains, including the trade-offs between (1) positive transfer and capacity dilution and (2) the performance of high and low resource languages at scale. Finally, we show, for the first time, the possibility of multilingual modeling without sacrificing per-language performance; XLM-R is very competitive with strong monolingual models on the GLUE and XNLI benchmarks. We will make XLM-R code, data, and models publicly available.
Tasks Cross-Lingual Transfer, Language Modelling, Representation Learning
Published 2019-11-05
URL https://arxiv.org/abs/1911.02116v1
PDF https://arxiv.org/pdf/1911.02116v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-cross-lingual-representation-1
Repo https://github.com/facebookresearch/XLM
Framework pytorch
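
Assuming the released xlm-roberta-base checkpoint on Hugging Face, obtaining multilingual contextual representations takes only a few lines; this shows usage, not the paper's fine-tuning setups.

```python
import torch
from transformers import XLMRobertaTokenizer, XLMRobertaModel

tokenizer = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")
model = XLMRobertaModel.from_pretrained("xlm-roberta-base")

# One tokenizer/model covers all 100 training languages.
batch = tokenizer(["Habari ya asubuhi", "Good morning"], padding=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**batch).last_hidden_state    # (2, seq_len, 768) contextual embeddings
print(hidden.shape)
```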

ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission

Title ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission
Authors Kexin Huang, Jaan Altosaar, Rajesh Ranganath
Abstract Clinical notes contain information about patients that goes beyond structured data like lab values and medications. However, clinical notes have been underused relative to structured data, because notes are high-dimensional and sparse. This work develops and evaluates representations of clinical notes using bidirectional transformers (ClinicalBERT). ClinicalBERT uncovers high-quality relationships between medical concepts as judged by humans. ClinicalBERT outperforms baselines on 30-day hospital readmission prediction using both discharge summaries and the first few days of notes in the intensive care unit. Code and model parameters are available.
Tasks Readmission Prediction
Published 2019-04-10
URL http://arxiv.org/abs/1904.05342v2
PDF http://arxiv.org/pdf/1904.05342v2.pdf
PWC https://paperswithcode.com/paper/clinicalbert-modeling-clinical-notes-and
Repo https://github.com/nwams/ClinicalBERT-Deep-Learning--Predicting-Hospital-Readmission-Using-Transformer
Framework pytorch
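
A hedged sketch of readmission prediction over long notes: chunk the note, score each chunk with a BERT sequence classifier, and aggregate. The released ClinicalBERT is pretrained on clinical text and uses a specific probability-aggregation rule; the vanilla bert-base-uncased checkpoint and the plain mean below are simplifications.

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

note = "Patient admitted with CHF exacerbation. Diuresed on IV furosemide. " * 60
chunks = [note[i:i + 1500] for i in range(0, len(note), 1500)]   # crude character chunking

probs = []
with torch.no_grad():
    for chunk in chunks:
        enc = tokenizer(chunk, truncation=True, max_length=512, return_tensors="pt")
        probs.append(model(**enc).logits.softmax(-1)[0, 1].item())

readmission_risk = sum(probs) / len(probs)   # simple mean over chunks (aggregation rule is simplified)
print(readmission_risk)
```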

Side-Tuning: Network Adaptation via Additive Side Networks

Title Side-Tuning: Network Adaptation via Additive Side Networks
Authors Jeffrey O Zhang, Alexander Sax, Amir Zamir, Leonidas Guibas, Jitendra Malik
Abstract When training a neural network for a desired task, one may prefer to adapt a pre-trained network rather than start with a randomly initialized one – due to lacking enough training data, performing lifelong learning where the system has to learn a new task while being previously trained for other tasks, or wishing to encode priors in the network via preset weights. The most commonly employed approaches for network adaptation are fine-tuning and using the pre-trained network as a fixed feature extractor, among others. In this paper, we propose a straightforward alternative: Side-Tuning. Side-tuning adapts a pre-trained network by training a lightweight “side” network that is fused with the (unchanged) pre-trained network using summation. This simple method works as well as or better than existing solutions while it resolves some of the basic issues with fine-tuning, fixed features, and several other common baselines. In particular, side-tuning is less prone to overfitting when little training data is available, yields better results than using a fixed feature extractor, and does not suffer from catastrophic forgetting in lifelong learning. We demonstrate the performance of side-tuning under a diverse set of scenarios, including lifelong learning (iCIFAR, Taskonomy), reinforcement learning, imitation learning (visual navigation in Habitat), NLP question-answering (SQuAD v2), and single-task transfer learning (Taskonomy), with consistently promising results.
Tasks Imitation Learning, Question Answering, Transfer Learning, Visual Navigation
Published 2019-12-31
URL https://arxiv.org/abs/1912.13503v1
PDF https://arxiv.org/pdf/1912.13503v1.pdf
PWC https://paperswithcode.com/paper/side-tuning-network-adaptation-via-additive-1
Repo https://github.com/jozhang97/side-tuning
Framework pytorch
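
A minimal reading of the abstract, a frozen pre-trained base network plus a small trainable side network fused by a learnable weighted sum, can be written as follows; the blending parameter and toy networks are assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn as nn

class SideTune(nn.Module):
    """Frozen pre-trained base plus a small trainable side network, fused by a
    learnable weighted sum."""
    def __init__(self, base, side):
        super().__init__()
        self.base, self.side = base, side
        for p in self.base.parameters():
            p.requires_grad_(False)            # the pre-trained network stays unchanged
        self.alpha = nn.Parameter(torch.tensor(0.0))

    def forward(self, x):
        a = torch.sigmoid(self.alpha)          # blend weight in (0, 1)
        return a * self.base(x) + (1 - a) * self.side(x)

base = nn.Linear(128, 10)                      # stand-in for a large pre-trained model
side = nn.Sequential(nn.Linear(128, 32), nn.ReLU(), nn.Linear(32, 10))  # lightweight side network
model = SideTune(base, side)
out = model(torch.randn(4, 128))               # only the side network and alpha receive gradients
```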

Metric Learning for Image Registration

Title Metric Learning for Image Registration
Authors Marc Niethammer, Roland Kwitt, Francois-Xavier Vialard
Abstract Image registration is a key technique in medical image analysis to estimate deformations between image pairs. A good deformation model is important for high-quality estimates. However, most existing approaches use ad-hoc deformation models chosen for mathematical convenience rather than to capture observed data variation. Recent deep learning approaches learn deformation models directly from data. However, they provide limited control over the spatial regularity of transformations. Instead of learning the entire registration approach, we learn a spatially-adaptive regularizer within a registration model. This allows controlling the desired level of regularity and preserving structural properties of a registration model. For example, diffeomorphic transformations can be attained. Our approach is a radical departure from existing deep learning approaches to image registration by embedding a deep learning model in an optimization-based registration algorithm to parameterize and data-adapt the registration model itself.
Tasks Deformable Medical Image Registration, Diffeomorphic Medical Image Registration, Image Registration, Metric Learning
Published 2019-04-21
URL http://arxiv.org/abs/1904.09524v1
PDF http://arxiv.org/pdf/1904.09524v1.pdf
PWC https://paperswithcode.com/paper/metric-learning-for-image-registration
Repo https://github.com/uncbiag/registration
Framework pytorch
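
Purely as an illustration of a spatially-adaptive regularizer (the paper itself works with momentum-parameterized registrations and learned multi-Gaussian smoothers, not this form), one can let a small CNN predict a per-pixel smoothness weight for a displacement field.

```python
import torch
import torch.nn as nn

# A small CNN predicts a per-pixel smoothness weight from the image pair; the weight
# scales a finite-difference penalty on the displacement field. Illustrative only.
weight_net = nn.Sequential(nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
                           nn.Conv2d(16, 1, 3, padding=1), nn.Softplus())

def registration_loss(moving_warped, fixed, displacement):
    sim = ((moving_warped - fixed) ** 2).mean()                     # image similarity term
    w = weight_net(torch.cat([moving_warped, fixed], dim=1))        # spatially varying weight
    du_dx = displacement[..., :, 1:] - displacement[..., :, :-1]    # finite-difference gradients
    du_dy = displacement[..., 1:, :] - displacement[..., :-1, :]
    reg = (w[..., :, 1:] * du_dx ** 2).mean() + (w[..., 1:, :] * du_dy ** 2).mean()
    return sim + reg

fixed  = torch.rand(1, 1, 32, 32)
moving = torch.rand(1, 1, 32, 32)
disp   = torch.zeros(1, 2, 32, 32, requires_grad=True)
print(registration_loss(moving, fixed, disp))
```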

LumiPath – Towards Real-time Physically-based Rendering on Embedded Devices

Title LumiPath – Towards Real-time Physically-based Rendering on Embedded Devices
Authors Laura Fink, Sing Chun Lee, Jie Ying Wu, Xingtong Liu, Tianyu Song, Yordanka Stoyanova, Marc Stamminger, Nassir Navab, Mathias Unberath
Abstract With the increasing computational power of today’s workstations, real-time physically-based rendering is within reach, rapidly gaining attention across a variety of domains. These advances have quickly been applied to medicine, where physically-based rendering is a powerful tool for intuitive 3D data visualization. Embedded devices such as optical see-through head-mounted displays (OST HMDs) have become a trend in medical augmented reality. However, leveraging the obvious benefits of physically-based rendering remains challenging on these devices because of limited computational power, memory usage, and power consumption. We navigate the compromise between device limitations and image quality to achieve reasonable rendering results by introducing a novel light field that can be sampled in real-time on embedded devices. We demonstrate its applications in medicine and discuss limitations of the proposed method. An open-source version of this project is available at https://github.com/lorafib/LumiPath, which provides full insight into the implementation along with exemplary demonstration material.
Tasks Image Generation
Published 2019-03-09
URL https://arxiv.org/abs/1903.03837v2
PDF https://arxiv.org/pdf/1903.03837v2.pdf
PWC https://paperswithcode.com/paper/lumipath-towards-real-time-physically-based
Repo https://github.com/lorafib/LumiPath
Framework pytorch

Collaborative Sampling in Generative Adversarial Networks

Title Collaborative Sampling in Generative Adversarial Networks
Authors Yuejiang Liu, Parth Kothari, Alexandre Alahi
Abstract The standard practice in Generative Adversarial Networks (GANs) discards the discriminator during sampling. However, this sampling method loses valuable information learned by the discriminator regarding the data distribution. In this work, we propose a collaborative sampling scheme between the generator and the discriminator for improved data generation. Guided by the discriminator, our approach refines the generated samples through gradient-based updates at a particular layer of the generator, shifting the generator distribution closer to the real data distribution. Additionally, we present a practical discriminator shaping method that can smoothen the loss landscape provided by the discriminator for effective sample refinement. Through extensive experiments on synthetic and image datasets, we demonstrate that our proposed method can improve generated samples both quantitatively and qualitatively, offering a new degree of freedom in GAN sampling.
Tasks Image Generation
Published 2019-02-02
URL https://arxiv.org/abs/1902.00813v3
PDF https://arxiv.org/pdf/1902.00813v3.pdf
PWC https://paperswithcode.com/paper/collaborative-gan-sampling
Repo https://github.com/vita-epfl/collaborative-gan-sampling
Framework tf
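
A simplified sketch of the refinement step, freezing all network weights and running gradient ascent on the discriminator score with respect to an intermediate generator activation, is given below; the toy networks, the split point, and the step sizes are assumptions rather than the authors' configuration.

```python
import torch
import torch.nn as nn

# Toy generator split at the layer where refinement happens, plus a discriminator.
g_head = nn.Sequential(nn.Linear(16, 64), nn.ReLU())       # latent -> intermediate activation
g_tail = nn.Sequential(nn.Linear(64, 2))                   # intermediate activation -> sample
disc   = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))

def collaborative_sample(z, steps=20, lr=0.1):
    """Refine the intermediate activation by gradient ascent on the discriminator score,
    with generator and discriminator weights frozen (a simplified sketch of the scheme)."""
    h = g_head(z).detach().requires_grad_(True)
    opt = torch.optim.SGD([h], lr=lr)
    for _ in range(steps):
        score = disc(g_tail(h)).mean()          # discriminator's belief that the samples are real
        opt.zero_grad()
        (-score).backward()                     # ascend on the score by descending its negative
        opt.step()
    return g_tail(h).detach()

print(collaborative_sample(torch.randn(8, 16)))
```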