February 1, 2020

3230 words 16 mins read

Paper Group AWR 213

CSGAN: Cyclic-Synthesized Generative Adversarial Networks for Image-to-Image Transformation

Title CSGAN: Cyclic-Synthesized Generative Adversarial Networks for Image-to-Image Transformation
Authors Kishan Babu Kancharagunta, Shiv Ram Dubey
Abstract The primary motivation of image-to-image transformation is to convert an image from one domain to another. Most research has focused on image transformation between pre-defined pairs of domains; very few works have developed a common framework for transformation across different domains. With the introduction of Generative Adversarial Networks (GANs) as a general framework for image generation, there has been tremendous growth in image-to-image transformation, with most research focusing on a suitable objective function. In this paper, we propose Cyclic-Synthesized Generative Adversarial Networks (CSGAN) for image-to-image transformation. The proposed CSGAN uses a new objective function (loss) called the Cyclic-Synthesized (CS) loss, computed between the synthesized image of one domain and the cycled image of the other domain. The performance of CSGAN is evaluated on two benchmark image-to-image transformation datasets, the CUHK Face dataset and the CMP Facades dataset. Results are computed using the widely used evaluation metrics MSE, SSIM, PSNR, and LPIPS, and compared against the latest state-of-the-art approaches, including GAN, Pix2Pix, DualGAN, CycleGAN, and PS2GAN. The proposed CSGAN outperforms all of these methods on the CUHK dataset and exhibits promising, comparable performance on the Facades dataset in terms of both qualitative and quantitative measures. The code is available at https://github.com/KishanKancharagunta/CSGAN.
Tasks Image Generation
Published 2019-01-11
URL http://arxiv.org/abs/1901.03554v1
PDF http://arxiv.org/pdf/1901.03554v1.pdf
PWC https://paperswithcode.com/paper/csgan-cyclic-synthesized-generative
Repo https://github.com/KishanKancharagunta/CSGAN
Framework pytorch
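
The central object in this entry is the CS loss. Below is a minimal PyTorch sketch under one plausible reading of the abstract: an L1 penalty between the cycled image and the synthesized image of the same domain. The generator names `G_AB`/`G_BA` and the weight `lambda_cs` are illustrative assumptions, not the authors' exact formulation.

```python
import torch.nn as nn

l1 = nn.L1Loss()

def cyclic_synthesized_loss(G_AB, G_BA, real_A, real_B, lambda_cs=10.0):
    """Sketch of the CS loss: compare the synthesized image of a domain
    with the cycled image of the same domain (assumed reading)."""
    syn_B = G_AB(real_A)   # A -> B       (synthesized B)
    syn_A = G_BA(real_B)   # B -> A       (synthesized A)
    cyc_A = G_BA(syn_B)    # A -> B -> A  (cycled A)
    cyc_B = G_AB(syn_A)    # B -> A -> B  (cycled B)
    return lambda_cs * (l1(cyc_A, syn_A) + l1(cyc_B, syn_B))
```

In practice this term would be added to the usual adversarial and cycle-consistency losses when training the two generators.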

RVOS: End-to-End Recurrent Network for Video Object Segmentation

Title RVOS: End-to-End Recurrent Network for Video Object Segmentation
Authors Carles Ventura, Miriam Bellver, Andreu Girbau, Amaia Salvador, Ferran Marques, Xavier Giro-i-Nieto
Abstract Multiple-object video object segmentation is a challenging task, especially in the zero-shot case, when no object mask is given at the initial frame and the model has to find the objects to segment along the sequence. We propose a Recurrent network for multiple-object Video Object Segmentation (RVOS) that is fully end-to-end trainable. Our model incorporates recurrence in two domains: (i) the spatial domain, which allows the model to discover the different object instances within a frame, and (ii) the temporal domain, which keeps the segmented objects coherent over time. We train RVOS for zero-shot video object segmentation and are the first to report quantitative results on the DAVIS-2017 and YouTube-VOS benchmarks. We further adapt RVOS to one-shot video object segmentation by using the masks obtained in previous time steps as inputs to the recurrent module. Our model reaches results comparable to state-of-the-art techniques on the YouTube-VOS benchmark and outperforms all previous video object segmentation methods that do not use online learning on the DAVIS-2017 benchmark. Moreover, our model achieves faster inference than previous methods, reaching 44 ms/frame on a P100 GPU.
Tasks Semi-supervised Video Object Segmentation, Unsupervised Video Object Segmentation, Video Object Segmentation
Published 2019-03-13
URL https://arxiv.org/abs/1903.05612v2
PDF https://arxiv.org/pdf/1903.05612v2.pdf
PWC https://paperswithcode.com/paper/rvos-end-to-end-recurrent-network-for-video
Repo https://github.com/imatge-upc/rvos
Framework pytorch
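
The abstract's double recurrence can be illustrated with a toy PyTorch sketch: a recurrent cell whose hidden state is carried (i) across instance slots within a frame (spatial) and (ii) across frames for each slot (temporal). The plain convolution stands in for a ConvLSTM-style decoder; channel sizes, slot count, and the cell itself are assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class RecurrentSegHead(nn.Module):
    """Toy spatial+temporal recurrence over instance slots and frames."""

    def __init__(self, feat_ch=256, hid_ch=128, n_slots=10):
        super().__init__()
        self.hid_ch, self.n_slots = hid_ch, n_slots
        self.cell = nn.Conv2d(feat_ch + 2 * hid_ch, hid_ch, 3, padding=1)
        self.mask_head = nn.Conv2d(hid_ch, 1, 1)

    def forward(self, frame_feats):  # iterable of (B, feat_ch, H, W) frame encodings
        B, _, H, W = frame_feats[0].shape
        zeros = lambda: frame_feats[0].new_zeros(B, self.hid_ch, H, W)
        h_temp = [zeros() for _ in range(self.n_slots)]   # one temporal state per slot
        all_masks = []
        for feat in frame_feats:                          # temporal recurrence over frames
            h_spat = zeros()
            masks = []
            for i in range(self.n_slots):                 # spatial recurrence over instances
                h = torch.tanh(self.cell(torch.cat([feat, h_spat, h_temp[i]], 1)))
                h_spat, h_temp[i] = h, h
                masks.append(torch.sigmoid(self.mask_head(h)))
            all_masks.append(torch.stack(masks, 1))       # (B, n_slots, 1, H, W)
        return all_masks
```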

On a Randomized Multi-Block ADMM for Solving Selected Machine Learning Problems

Title On a Randomized Multi-Block ADMM for Solving Selected Machine Learning Problems
Authors Mingxi Zhu, Kresimir Mihic, Yinyu Ye
Abstract The Alternating Direction Method of Multipliers (ADMM) has gained tremendous attention for solving large-scale machine learning and signal processing problems due to its relative simplicity. However, the two-block structure of the classical ADMM still limits the size of problems that can be solved. When a more-than-two-block structure is forced by variable splitting, the convergence speed slows down greatly in practice. Recently, a randomly assembled cyclic multi-block ADMM (RAC-MBADMM) was developed by the authors for solving general convex and nonconvex quadratic optimization problems, where the number of blocks can be greater than two so that each sub-problem is smaller and can be solved much more efficiently. In this paper, we apply this method to selected machine learning problems posed as convex quadratic optimization, such as Linear Regression, LASSO, Elastic-Net, and SVM. We prove that the algorithm converges linearly in expectation under standard statistical data assumptions. We use our general-purpose solver to conduct multiple numerical tests, solving both synthetic and large-scale benchmark problems. Our results show that RAC-MBADMM can significantly outperform other optimization algorithms/codes in both solution time and quality on these machine learning problems, matching the performance of the best tailored methods such as Glmnet or LIBSVM; in certain problem regimes it even outperforms the tailored methods.
Tasks
Published 2019-07-03
URL https://arxiv.org/abs/1907.01995v2
PDF https://arxiv.org/pdf/1907.01995v2.pdf
PWC https://paperswithcode.com/paper/on-a-randomized-multi-block-admm-for-solving
Repo https://github.com/kmihic/RACQP
Framework none
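
As a concrete illustration of the randomized multi-block idea, here is a hedged NumPy sketch for an equality-constrained convex QP, min ½xᵀHx + cᵀx subject to Ax = b: the blocks are re-assembled in random order each cycle, each block is minimized exactly against the augmented Lagrangian, and the dual variable is updated after a full cycle. Block sizes, the penalty `beta`, and the fixed iteration count are illustrative choices, not the paper's exact scheme.

```python
import numpy as np

def rac_admm_qp(H, c, A, b, n_blocks=4, beta=1.0, iters=200, seed=0):
    """Randomly assembled cyclic multi-block ADMM sketch for a convex QP."""
    rng = np.random.default_rng(seed)
    n = H.shape[0]
    x = np.zeros(n)
    y = np.zeros(A.shape[0])
    blocks = np.array_split(np.arange(n), n_blocks)
    for _ in range(iters):
        rng.shuffle(blocks)                       # re-assemble the block cycle at random
        for idx in blocks:
            rest = np.setdiff1d(np.arange(n), idx)
            Ai = A[:, idx]
            r = A[:, rest] @ x[rest] - b          # constraint residual from other blocks
            # exact block minimizer of the augmented Lagrangian
            lhs = H[np.ix_(idx, idx)] + beta * Ai.T @ Ai
            rhs = -(c[idx] + H[np.ix_(idx, rest)] @ x[rest] + Ai.T @ y + beta * Ai.T @ r)
            x[idx] = np.linalg.solve(lhs, rhs)
        y += beta * (A @ x - b)                   # dual update after a full cycle
    return x, y
```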

Tensor Completion for Weakly-dependent Data on Graph for Metro Passenger Flow Prediction

Title Tensor Completion for Weakly-dependent Data on Graph for Metro Passenger Flow Prediction
Authors Ziyue Li, Nurettin Dorukhan Sergin, Hao Yan, Chen Zhang, Fugee Tsung
Abstract Low-rank tensor decomposition and completion have attracted significant interest from academia given the ubiquity of tensor data. However, low rank is a global property, which is not fulfilled when the data exhibits complex, weak dependencies induced by specific graph structures. One application that motivates this study is spatiotemporal data analysis. As shown in our preliminary study, weak dependencies can worsen low-rank tensor completion performance. In this paper, we propose a novel low-rank CANDECOMP/PARAFAC (CP) tensor decomposition and completion framework that introduces an $L_{1}$-norm penalty and a Graph Laplacian penalty to model the weak dependencies on the graph. We further propose an efficient optimization algorithm based on Block Coordinate Descent. A case study on metro passenger flow data from Hong Kong demonstrates improved performance over regular tensor completion methods.
Tasks
Published 2019-12-11
URL https://arxiv.org/abs/1912.05693v1
PDF https://arxiv.org/pdf/1912.05693v1.pdf
PWC https://paperswithcode.com/paper/tensor-completion-for-weakly-dependent-data
Repo https://github.com/bonaldli/WDGTC
Framework none
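
One plausible way to write the penalized completion objective sketched in the abstract (the notation and the choice of which factor matrix carries the penalties are assumptions, not the paper's exact formulation): $P_{\Omega}$ keeps the observed entries, $[\![\cdot]\!]$ denotes the CP reconstruction from the factor matrices, and $L$ is the graph Laplacian, with $\lambda_1$ and $\lambda_2$ trading off sparsity and graph smoothness.

```latex
\min_{U^{(1)},\,U^{(2)},\,U^{(3)}}\;
\bigl\| P_{\Omega}\bigl(\mathcal{X} - [\![ U^{(1)}, U^{(2)}, U^{(3)} ]\!]\bigr) \bigr\|_F^2
\;+\; \lambda_{1}\,\bigl\| U^{(1)} \bigr\|_{1}
\;+\; \lambda_{2}\,\mathrm{tr}\bigl( U^{(1)\top} L\, U^{(1)} \bigr)
```

Block Coordinate Descent then cycles through the factor matrices, each sub-problem being convex with the others held fixed.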

Domain Generalization Using a Mixture of Multiple Latent Domains

Title Domain Generalization Using a Mixture of Multiple Latent Domains
Authors Toshihiko Matsuura, Tatsuya Harada
Abstract When domains, which represent underlying data distributions, vary between training and testing, deep neural networks suffer a drop in performance. Domain generalization improves performance on unseen target domains by using multiple source domains. Conventional methods assume that the domain of each training sample is known. However, many datasets, such as those collected via web crawling, contain a mixture of multiple latent domains, in which the domain of each sample is unknown. This paper introduces domain generalization from a mixture of multiple latent domains as a novel and more realistic scenario, where we train a domain-generalized model without domain labels. To address this scenario, we propose a method that iteratively divides samples into latent domains via clustering and trains a domain-invariant feature extractor shared among the divided latent domains via adversarial learning. We assume that the latent domain of an image is reflected in its style, and thus use style features for clustering. Using these features, our method successfully discovers latent domains and achieves domain generalization even though domain labels are not given. Experiments show that our method can train a domain-generalized model without domain labels and that it outperforms conventional domain generalization methods, including those that utilize domain labels.
Tasks Domain Generalization
Published 2019-11-18
URL https://arxiv.org/abs/1911.07661v1
PDF https://arxiv.org/pdf/1911.07661v1.pdf
PWC https://paperswithcode.com/paper/domain-generalization-using-a-mixture-of
Repo https://github.com/Emma0118/domain-generalization
Framework pytorch
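
A hedged sketch of the clustering step described above: per-image style statistics (channel-wise mean and standard deviation of early convolutional feature maps, as in style-transfer work) are clustered into pseudo domain labels. The number of latent domains and the exact choice of statistics are assumptions.

```python
import torch
from sklearn.cluster import KMeans

def pseudo_domain_labels(conv_feats, n_domains=3):
    """Cluster per-image style statistics into pseudo domain labels.

    conv_feats: (N, C, H, W) activations from an early layer of the extractor.
    """
    mu = conv_feats.mean(dim=(2, 3))                   # (N, C) channel means
    sigma = conv_feats.std(dim=(2, 3))                 # (N, C) channel stds
    style = torch.cat([mu, sigma], dim=1).cpu().numpy()
    return KMeans(n_clusters=n_domains, n_init=10).fit_predict(style)  # (N,) labels
```

These pseudo labels would then stand in for the missing domain annotations when training the adversarial domain discriminator.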

Graphical-model based estimation and inference for differential privacy

Title Graphical-model based estimation and inference for differential privacy
Authors Ryan McKenna, Daniel Sheldon, Gerome Miklau
Abstract Many privacy mechanisms reveal high-level information about a data distribution through noisy measurements. It is common to use this information to estimate the answers to new queries. In this work, we provide an approach to solve this estimation problem efficiently using graphical models, which is particularly effective when the distribution is high-dimensional but the measurements are over low-dimensional marginals. We show that our approach is far more efficient than existing estimation techniques from the privacy literature and that it can improve the accuracy and scalability of many state-of-the-art mechanisms.
Tasks
Published 2019-01-26
URL http://arxiv.org/abs/1901.09136v1
PDF http://arxiv.org/pdf/1901.09136v1.pdf
PWC https://paperswithcode.com/paper/graphical-model-based-estimation-and
Repo https://github.com/ryan112358/private-pgm
Framework pytorch
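
To make the estimation problem concrete, here is a toy NumPy sketch that fits a distribution over a small two-variable domain to noisy one-way marginals by least squares. The paper's contribution is doing this tractably in high dimensions by restricting the distribution to a graphical model; this dense toy deliberately ignores that, and the squared-error objective and step size are assumptions.

```python
import numpy as np

def estimate_joint(noisy_row_marg, noisy_col_marg, shape=(4, 4), steps=2000, lr=0.5):
    """Fit p = softmax(theta) so its marginals match noisy measurements."""
    theta = np.zeros(shape)
    for _ in range(steps):
        p = np.exp(theta - theta.max())
        p /= p.sum()                                   # current joint distribution
        g = (p.sum(axis=1) - noisy_row_marg)[:, None] \
          + (p.sum(axis=0) - noisy_col_marg)[None, :]  # dL/dp of the squared error
        theta -= lr * p * (g - (p * g).sum())          # chain rule through the softmax
    return p
```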

Estimating Glycemic Impact of Cooking Recipes via Online Crowdsourcing and Machine Learning

Title Estimating Glycemic Impact of Cooking Recipes via Online Crowdsourcing and Machine Learning
Authors Helena Lee, Palakorn Achananuparp, Yue Liu, Ee-Peng Lim, Lav R. Varshney
Abstract Consumption of diets with low glycemic impact is highly recommended for diabetics and pre-diabetics as it helps maintain their blood glucose levels. However, laboratory analysis of dietary glycemic potency is time-consuming and expensive. In this paper, we explore a data-driven approach utilizing online crowdsourcing and machine learning to estimate the glycemic impact of cooking recipes. We show that a commonly used healthiness metric may not always be effective in determining recipes suitable for diabetics, thus emphasizing the importance of the glycemic-impact estimation task. Our best classification model, trained on nutritional and crowdsourced data obtained from Amazon Mechanical Turk (AMT), can accurately identify recipes which are unhealthful for diabetics.
Tasks
Published 2019-09-17
URL https://arxiv.org/abs/1909.07881v1
PDF https://arxiv.org/pdf/1909.07881v1.pdf
PWC https://paperswithcode.com/paper/estimating-glycemic-impact-of-cooking-recipes
Repo https://github.com/LARC-CMU-SMU/dph19-glycemic-impact
Framework none

Temporal Consistency Objectives Regularize the Learning of Disentangled Representations

Title Temporal Consistency Objectives Regularize the Learning of Disentangled Representations
Authors Gabriele Valvano, Agisilaos Chartsias, Andrea Leo, Sotirios A. Tsaftaris
Abstract There has been an increasing focus on learning interpretable feature representations, particularly in applications such as medical image analysis that require explainability while relying less on annotated data (since annotations can be tedious and costly). Here we build on recent innovations in style-content representations to learn anatomy, imaging characteristics (appearance), and temporal correlations. We improve disentanglement by introducing a self-supervised objective of predicting future cardiac phases. We propose a temporal transformer architecture that, given an image and conditioned on a phase difference, predicts a future frame. This forces the anatomical decomposition to be consistent with the temporal cardiac contraction in cine MRI and to carry semantic meaning with less need for annotations. We demonstrate that with this regularization we achieve competitive results and improve semi-supervised segmentation, especially when very few labelled data are available. Specifically, we show Dice increases of up to 19% and 7% over supervised and semi-supervised approaches, respectively, on the ACDC dataset. Code is available at: https://github.com/gvalvano/sdtnet.
Tasks
Published 2019-08-29
URL https://arxiv.org/abs/1908.11330v1
PDF https://arxiv.org/pdf/1908.11330v1.pdf
PWC https://paperswithcode.com/paper/temporal-consistency-objectives-regularize
Repo https://github.com/gvalvano/sdtnet
Framework tf
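
A minimal sketch of the self-supervised objective described above: advance the representation of the current frame by the given phase difference and reconstruct the future frame. The encoder/transformer/decoder split mirrors the abstract, but the interfaces and the L1 reconstruction loss are assumptions.

```python
import torch.nn.functional as F

def future_frame_loss(encoder, transformer, decoder, x_t, x_future, dphase):
    """Self-supervised temporal-consistency regularizer (assumed interfaces)."""
    z_t = encoder(x_t)                  # disentangled representation of frame t
    z_pred = transformer(z_t, dphase)   # advance it by the phase difference
    return F.l1_loss(decoder(z_pred), x_future)
```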

Automatic Fact-guided Sentence Modification

Title Automatic Fact-guided Sentence Modification
Authors Darsh J Shah, Tal Schuster, Regina Barzilay
Abstract Online encyclopedias like Wikipedia contain large amounts of text that need frequent corrections and updates, and new information may contradict existing content. In this paper, we focus on rewriting such dynamically changing articles. This is a challenging constrained generation task, as the output must be consistent with the new information and fit into the rest of the existing document. To this end, we propose a two-step solution: (1) we identify and remove the contradicting components in a target text for a given claim, using a neutralizing stance model; (2) we expand the remaining text to be consistent with the given claim, using a novel two-encoder sequence-to-sequence model with copy attention. Applied to a Wikipedia fact update dataset, our method successfully generates updated sentences for new claims, achieving the highest SARI score. Furthermore, we demonstrate that generating synthetic data through such rewritten sentences can successfully augment the FEVER fact-checking training dataset, leading to a relative error reduction of 13%.
Tasks
Published 2019-09-30
URL https://arxiv.org/abs/1909.13838v2
PDF https://arxiv.org/pdf/1909.13838v2.pdf
PWC https://paperswithcode.com/paper/automatic-fact-guided-sentence-modification
Repo https://github.com/TalSchuster/TokenMasker
Framework pytorch

Single-Network Whole-Body Pose Estimation

Title Single-Network Whole-Body Pose Estimation
Authors Gines Hidalgo, Yaadhav Raaj, Haroon Idrees, Donglai Xiang, Hanbyul Joo, Tomas Simon, Yaser Sheikh
Abstract We present the first single-network approach for 2D whole-body pose estimation, which entails simultaneous localization of body, face, hands, and feet keypoints. Due to the bottom-up formulation, our method maintains constant real-time performance regardless of the number of people in the image. The network is trained in a single stage using multi-task learning, through an improved architecture which can handle scale differences between body/foot and face/hand keypoints. Our approach considerably improves upon OpenPose [Cao et al., 2018], the only work so far capable of whole-body pose estimation, both in terms of speed and global accuracy. Unlike OpenPose, our method does not need to run an additional network for each hand and face candidate, making it substantially faster for multi-person scenarios. This work directly results in a reduction of computational complexity for applications that require 2D whole-body information (e.g., VR/AR, re-targeting). In addition, it yields higher accuracy, especially for occluded, blurry, and low-resolution faces and hands. For code, trained models, and validation benchmarks, visit our project page: https://github.com/CMU-Perceptual-Computing-Lab/openpose_train.
Tasks Multi-Task Learning, Pose Estimation
Published 2019-09-30
URL https://arxiv.org/abs/1909.13423v1
PDF https://arxiv.org/pdf/1909.13423v1.pdf
PWC https://paperswithcode.com/paper/single-network-whole-body-pose-estimation
Repo https://github.com/CMU-Perceptual-Computing-Lab/openpose_train
Framework none

Controlling Neural Level Sets

Title Controlling Neural Level Sets
Authors Matan Atzmon, Niv Haim, Lior Yariv, Ofer Israelov, Haggai Maron, Yaron Lipman
Abstract The level sets of neural networks represent fundamental properties such as decision boundaries of classifiers and are used to model non-linear manifold data such as curves and surfaces. Thus, methods for controlling the neural level sets could find many applications in machine learning. In this paper we present a simple and scalable approach to directly control level sets of a deep neural network. Our method consists of two parts: (i) sampling of the neural level sets, and (ii) relating the samples’ positions to the network parameters. The latter is achieved by a sample network that is constructed by adding a single fixed linear layer to the original network. In turn, the sample network can be used to incorporate the level set samples into a loss function of interest. We have tested our method on three different learning tasks: improving generalization to unseen data, training networks robust to adversarial attacks, and curve and surface reconstruction from point clouds. For surface reconstruction, we produce high fidelity surfaces directly from raw 3D point clouds. When training small to medium networks to be robust to adversarial attacks we obtain robust accuracy comparable to state-of-the-art methods.
Tasks
Published 2019-05-28
URL https://arxiv.org/abs/1905.11911v2
PDF https://arxiv.org/pdf/1905.11911v2.pdf
PWC https://paperswithcode.com/paper/controlling-neural-level-sets
Repo https://github.com/matanatz/ControllingNeuralLevelsets
Framework pytorch
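
The core device in this entry is the sample network. Below is a hedged PyTorch sketch of the underlying first-order construction: given points `p` approximately on the level set f(x)=c, build a differentiable correction that equals `p` at the current parameters but lets gradients with respect to the network parameters move the samples with the level set. The function signature and the level value `c` are assumptions.

```python
import torch

def level_set_sample_network(f, p, c=0.0):
    """First-order sample-network correction for points on the level set f(x)=c.

    f maps (N, d) points to (N,) values; p holds (approximate) level-set samples.
    """
    p = p.detach().requires_grad_(True)
    v = f(p)
    (grad_p,) = torch.autograd.grad(v, p, torch.ones_like(v), create_graph=True)
    n2 = (grad_p ** 2).sum(dim=1, keepdim=True).clamp_min(1e-12)
    # equals p on the level set, but carries d(sample)/d(theta) through f,
    # so a loss on the returned samples trains the network
    return p - (v - c).unsqueeze(1) * grad_p / n2
```

A loss on the returned samples (e.g., pulling them toward data points for surface reconstruction, or pushing them away from training inputs for robustness) then trains f through this correction.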

Label Efficient Semi-Supervised Learning via Graph Filtering

Title Label Efficient Semi-Supervised Learning via Graph Filtering
Authors Qimai Li, Xiao-Ming Wu, Han Liu, Xiaotong Zhang, Zhichao Guan
Abstract Graph-based methods have been demonstrated to be among the most effective approaches for semi-supervised learning, as they can exploit the connectivity patterns between labeled and unlabeled data samples to improve learning performance. However, existing graph-based methods either are limited in their ability to jointly model graph structures and data features, such as the classical label propagation methods, or require a considerable amount of labeled data for training and validation due to high model complexity, such as the recent neural-network-based methods. In this paper, we address label-efficient semi-supervised learning from a graph filtering perspective. Specifically, we propose a graph filtering framework that injects graph similarity into data features by taking them as signals on the graph and applying a low-pass graph filter to extract useful data representations for classification, where label efficiency can be achieved by conveniently adjusting the strength of the graph filter. Interestingly, this framework unifies two seemingly very different methods – label propagation and graph convolutional networks. Revisiting them under the graph filtering framework leads to new insights that improve their modeling capabilities and reduce model complexity. Experiments on semi-supervised classification tasks over four citation networks and one knowledge graph, and on a semi-supervised regression task for zero-shot image recognition, validate our findings and proposals.
Tasks Graph Similarity
Published 2019-01-28
URL https://arxiv.org/abs/1901.09993v3
PDF https://arxiv.org/pdf/1901.09993v3.pdf
PWC https://paperswithcode.com/paper/generalized-label-propagation-methods-for
Repo https://github.com/liqimai/Efficient-SSL
Framework tf
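
The filtering step lends itself to a compact sketch. Below, node features are smoothed with powers of the renormalized adjacency, a standard low-pass graph filter; `k` controls the filter strength and hence label efficiency. The specific filter family used in the paper may differ, so treat this as an assumed instance.

```python
import numpy as np

def low_pass_features(A, X, k=2):
    """Smooth node features X with k applications of the renormalized adjacency."""
    n = A.shape[0]
    A_hat = A + np.eye(n)                              # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    S = d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]  # D^-1/2 (A+I) D^-1/2
    for _ in range(k):                                 # apply the low-pass filter k times
        X = S @ X
    return X
```

The filtered rows for the labeled nodes can then train any simple classifier, e.g. logistic regression, which is where the label efficiency shows up.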

Cross-Lingual Natural Language Generation via Pre-Training

Title Cross-Lingual Natural Language Generation via Pre-Training
Authors Zewen Chi, Li Dong, Furu Wei, Wenhui Wang, Xian-Ling Mao, Heyan Huang
Abstract In this work we focus on transferring supervision signals of natural language generation (NLG) tasks between multiple languages. We propose to pretrain the encoder and the decoder of a sequence-to-sequence model under both monolingual and cross-lingual settings. The pre-training objective encourages the model to represent different languages in the shared space, so that we can conduct zero-shot cross-lingual transfer. After the pre-training procedure, we use monolingual data to fine-tune the pre-trained model on downstream NLG tasks. Then the sequence-to-sequence model trained in a single language can be directly evaluated beyond that language (i.e., accepting multi-lingual input and producing multi-lingual output). Experimental results on question generation and abstractive summarization show that our model outperforms the machine-translation-based pipeline methods for zero-shot cross-lingual generation. Moreover, cross-lingual transfer improves NLG performance of low-resource languages by leveraging rich-resource language data. Our implementation and data are available at https://github.com/CZWin32768/xnlg.
Tasks Abstractive Text Summarization, Cross-Lingual Transfer, Machine Translation, Question Generation, Text Generation
Published 2019-09-23
URL https://arxiv.org/abs/1909.10481v3
PDF https://arxiv.org/pdf/1909.10481v3.pdf
PWC https://paperswithcode.com/paper/190910481
Repo https://github.com/CZWin32768/xnlg
Framework pytorch

Are Powerful Graph Neural Nets Necessary? A Dissection on Graph Classification

Title Are Powerful Graph Neural Nets Necessary? A Dissection on Graph Classification
Authors Ting Chen, Song Bian, Yizhou Sun
Abstract Graph Neural Nets (GNNs) have received increasing attention, partially due to their superior performance in many node and graph classification tasks. However, there is a lack of understanding of what they are learning and how sophisticated the learned graph functions are. In this work, we first propose Graph Feature Network (GFN), a simple lightweight neural net defined on a set of graph-augmented features. We then dissect GNNs on graph classification into two parts: 1) graph filtering, where graph-based neighbor aggregations are performed, and 2) the set function, where a set of hidden node features is composed for prediction. We prove that GFN can be derived by linearizing the graph filtering part of GNNs, and leverage it to test the importance of the two parts separately. Empirically, we evaluate on common graph classification benchmarks. To our surprise, we find that, despite the simplification, GFN matches or exceeds the best accuracies produced by recently proposed GNNs at a fraction of the computation cost. Our results suggest that linear graph filtering with a non-linear set function is powerful enough, and that common graph classification benchmarks seem inadequate for testing advanced GNN variants.
Tasks Graph Classification
Published 2019-05-11
URL https://arxiv.org/abs/1905.04579v2
PDF https://arxiv.org/pdf/1905.04579v2.pdf
PWC https://paperswithcode.com/paper/dissecting-graph-neural-networks-on-graph
Repo https://github.com/chentingpc/gfn
Framework pytorch
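
A hedged NumPy sketch of the graph-augmented features on which GFN is defined: node degree plus multi-scale propagated features [X, SX, ..., S^k X]. A set function (a per-node MLP followed by pooling and a classifier) would consume these; the row normalization and the value of `k` are assumptions.

```python
import numpy as np

def graph_augmented_features(A, X, k=3):
    """Build [degree, X, SX, ..., S^k X] node features from adjacency A
    and node features X (assumed reading of the GFN construction)."""
    deg = A.sum(axis=1, keepdims=True)
    S = A / np.clip(deg, 1, None)          # row-normalized propagation matrix
    feats, cur = [deg, X], X
    for _ in range(k):
        cur = S @ cur                      # one more hop of neighbor averaging
        feats.append(cur)
    return np.concatenate(feats, axis=1)   # (n_nodes, 1 + (k+1) * d)
```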

Recurrent Kernel Networks

Title Recurrent Kernel Networks
Authors Dexiong Chen, Laurent Jacob, Julien Mairal
Abstract Substring kernels are classical tools for representing biological sequences or text. However, when large amounts of annotated data are available, models that allow end-to-end training such as neural networks are often preferred. Links between recurrent neural networks (RNNs) and substring kernels have recently been drawn, by formally showing that RNNs with specific activation functions were points in a reproducing kernel Hilbert space (RKHS). In this paper, we revisit this link by generalizing convolutional kernel networks—originally related to a relaxation of the mismatch kernel—to model gaps in sequences. It results in a new type of recurrent neural network which can be trained end-to-end with backpropagation, or without supervision by using kernel approximation techniques. We experimentally show that our approach is well suited to biological sequences, where it outperforms existing methods for protein classification tasks.
Tasks
Published 2019-06-07
URL https://arxiv.org/abs/1906.03200v2
PDF https://arxiv.org/pdf/1906.03200v2.pdf
PWC https://paperswithcode.com/paper/recurrent-kernel-networks
Repo https://github.com/claying/RKN
Framework pytorch