Paper Group ANR 240
Disentangling Overlapping Beliefs by Structured Matrix Factorization
Title | Disentangling Overlapping Beliefs by Structured Matrix Factorization |
Authors | Chaoqi Yang, Jinyang Li, Ruijie Wang, Shuochao Yao, Huajie Shao, Dongxin Liu, Shengzhong Liu, Tianshi Wang, Tarek F. Abdelzaher |
Abstract | Much work on social media opinion polarization focuses on identifying separate or orthogonal beliefs from media traces, thereby missing points of agreement among different communities. This paper develops a new class of Non-negative Matrix Factorization (NMF) algorithms that allow identification of both agreement and disagreement points when beliefs of different communities partially overlap. Specifically, we propose a novel Belief Structured Matrix Factorization algorithm (BSMF) to identify partially overlapping beliefs in polarized public social media. BSMF is totally unsupervised and considers three types of information: (i) who posted which opinion, (ii) keyword-level message similarity, and (iii) empirically observed social dependency graphs (e.g., retweet graphs), to improve belief separation. In the space of unsupervised belief separation algorithms, the emphasis was mostly given to the problem of identifying disjoint (e.g., conflicting) beliefs. The case when individuals with different beliefs agree on some subset of points was less explored. We observe that social beliefs overlap even in polarized scenarios. Our proposed unsupervised algorithm captures both the latent belief intersections and dissimilarities. We discuss the properties of the algorithm and conduct extensive experiments on both synthetic data and real-world datasets. The results show that our model outperforms all compared baselines by a great margin. |
Tasks | |
Published | 2020-02-13 |
URL | https://arxiv.org/abs/2002.05797v1 |
https://arxiv.org/pdf/2002.05797v1.pdf | |
PWC | https://paperswithcode.com/paper/disentangling-overlapping-beliefs-by |
Repo | |
Framework | |
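The abstract above builds on Non-negative Matrix Factorization. As a rough illustration of that building block only, here is a minimal sketch of plain NMF with the standard Lee-Seung multiplicative updates. It is not BSMF: the structural constraints, keyword similarity, and social graph terms described in the abstract are omitted, and the toy matrix `x` (rows as users, columns as claims, with the middle claim shared by both groups) and all function names are assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(x, w, h):
    """Squared Frobenius norm of x - w @ h."""
    wh = matmul(w, h)
    return sum((x[i][j] - wh[i][j]) ** 2
               for i in range(len(x)) for j in range(len(x[0])))


def nmf(x, rank, iterations=200, seed=0, eps=1e-9):
    """Plain NMF via multiplicative updates (hypothetical helper, not BSMF)."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    # Strictly positive random initialisation keeps every factor non-negative.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    initial_error = frobenius_error(x, w, h)
    for _ in range(iterations):
        # H <- H * (W^T X) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h, initial_error, frobenius_error(x, w, h)


# Toy user-by-claim matrix: two groups that both endorse the third claim.
x = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
w, h, initial_error, final_error = nmf(x, rank=2)
print(f"squared error: {initial_error:.3f} -> {final_error:.6f}")
```

In this toy setting the rows of `h` can be read as claim profiles of two latent groups, and a claim that receives weight in both rows is a candidate point of agreement. How BSMF itself separates overlap from disagreement is described in the paper, not here.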
Latent Opinions Transfer Network for Target-Oriented Opinion Words Extraction
Title | Latent Opinions Transfer Network for Target-Oriented Opinion Words Extraction |
Authors | Zhen Wu, Fei Zhao, Xin-Yu Dai, Shujian Huang, Jiajun Chen |
Abstract | Target-oriented opinion words extraction (TOWE) is a new subtask of ABSA, which aims to extract the corresponding opinion words for a given opinion target in a sentence. Recently, neural network methods have been applied to this task and achieve promising results. However, the difficulty of annotation causes the datasets of TOWE to be insufficient, which heavily limits the performance of neural models. By contrast, abundant review sentiment classification data are easily available at online review sites. These reviews contain substantial latent opinions information and semantic patterns. In this paper, we propose a novel model to transfer these opinions knowledge from resource-rich review sentiment classification datasets to low-resource task TOWE. To address the challenges in the transfer process, we design an effective transformation method to obtain latent opinions, then integrate them into TOWE. Extensive experimental results show that our model achieves better performance compared to other state-of-the-art methods and significantly outperforms the base model without transferring opinions knowledge. Further analysis validates the effectiveness of our model. |
Tasks | Sentiment Analysis |
Published | 2020-01-07 |
URL | https://arxiv.org/abs/2001.01989v1 |
https://arxiv.org/pdf/2001.01989v1.pdf | |
PWC | https://paperswithcode.com/paper/latent-opinions-transfer-network-for-target |
Repo | |
Framework | |
Variational Inference and Bayesian CNNs for Uncertainty Estimation in Multi-Factorial Bone Age Prediction
Title | Variational Inference and Bayesian CNNs for Uncertainty Estimation in Multi-Factorial Bone Age Prediction |
Authors | Stefan Eggenreich, Christian Payer, Martin Urschler, Darko Štern |
Abstract | In addition to its extensive use in clinical medicine, biological age (BA) is used in legal medicine to assess unknown chronological age (CA) in applications where identification documents are not available. Automatic methods for age estimation proposed in the literature predict point estimates, which can be misleading without the quantification of predictive uncertainty. In our multi-factorial age estimation method from MRI data, we used the Variational Inference approach to estimate the uncertainty of a Bayesian CNN model. Distinguishing model uncertainty from data uncertainty, we interpreted data uncertainty as biological variation, i.e. the range of possible CA of subjects having the same BA. |
Tasks | Age Estimation |
Published | 2020-02-25 |
URL | https://arxiv.org/abs/2002.10819v1 |
https://arxiv.org/pdf/2002.10819v1.pdf | |
PWC | https://paperswithcode.com/paper/variational-inference-and-bayesian-cnns-for |
Repo | |
Framework | |
GradMix: Multi-source Transfer across Domains and Tasks
Title | GradMix: Multi-source Transfer across Domains and Tasks |
Authors | Junnan Li, Ziwei Xu, Yongkang Wong, Qi Zhao, Mohan Kankanhalli |
Abstract | The computer vision community is witnessing an unprecedented rate of new tasks being proposed and addressed, thanks to the deep convolutional networks’ capability to find complex mappings from X to Y. The advent of each task often accompanies the release of a large-scale annotated dataset, for supervised training of deep networks. However, it is expensive and time-consuming to manually label a sufficient amount of training data. Therefore, it is important to develop algorithms that can leverage off-the-shelf labeled datasets to learn useful knowledge for the target task. While previous works mostly focus on transfer learning from a single source, we study multi-source transfer across domains and tasks (MS-DTT), in a semi-supervised setting. We propose GradMix, a model-agnostic method applicable to any model trained with a gradient-based learning rule, to transfer knowledge via gradient descent by weighting and mixing the gradients from all sources during training. GradMix follows a meta-learning objective, which assigns layer-wise weights to the source gradients, such that the combined gradient follows the direction that minimizes the loss for a small set of samples from the target dataset. In addition, we propose to adaptively adjust the learning rate for each mini-batch based on its importance to the target task, and a pseudo-labeling method to leverage the unlabeled samples in the target domain. We conduct MS-DTT experiments on two tasks: digit recognition and action recognition, and demonstrate the advantageous performance of the proposed method against multiple baselines. |
Tasks | Meta-Learning, Transfer Learning |
Published | 2020-02-09 |
URL | https://arxiv.org/abs/2002.03264v1 |
https://arxiv.org/pdf/2002.03264v1.pdf | |
PWC | https://paperswithcode.com/paper/gradmix-multi-source-transfer-across-domains |
Repo | |
Framework | |
From Speech-to-Speech Translation to Automatic Dubbing
Title | From Speech-to-Speech Translation to Automatic Dubbing |
Authors | Marcello Federico, Robert Enyedi, Roberto Barra-Chicote, Ritwik Giri, Umut Isik, Arvindh Krishnaswamy, Hassan Sawaf |
Abstract | We present enhancements to a speech-to-speech translation pipeline in order to perform automatic dubbing. Our architecture features neural machine translation generating output of preferred length, prosodic alignment of the translation with the original speech segments, neural text-to-speech with fine tuning of the duration of each utterance, and, finally, audio rendering that enriches text-to-speech output with background noise and reverberation extracted from the original audio. We report on a subjective evaluation of automatic dubbing of excerpts of TED Talks from English into Italian, which measures the perceived naturalness of automatic dubbing and the relative importance of each proposed enhancement. |
Tasks | Machine Translation |
Published | 2020-01-19 |
URL | https://arxiv.org/abs/2001.06785v3 |
https://arxiv.org/pdf/2001.06785v3.pdf | |
PWC | https://paperswithcode.com/paper/from-speech-to-speech-translation-to |
Repo | |
Framework | |
Accurate Optimization of Weighted Nuclear Norm for Non-Rigid Structure from Motion
Title | Accurate Optimization of Weighted Nuclear Norm for Non-Rigid Structure from Motion |
Authors | José Pedro Iglesias, Carl Olsson, Marcus Valtonen Örnhag |
Abstract | Fitting a matrix of a given rank to data in a least squares sense can be done very effectively using 2nd order methods such as Levenberg-Marquardt by explicitly optimizing over a bilinear parameterization of the matrix. In contrast, when applying more general singular value penalties, such as weighted nuclear norm priors, direct optimization over the elements of the matrix is typically used. Due to non-differentiability of the resulting objective function, first order sub-gradient or splitting methods are predominantly used. While these offer rapid iterations, it is well known that they become inefficient near the minimum due to zig-zagging, and in practice one is therefore often forced to settle for an approximate solution. In this paper we show that more accurate results can in many cases be achieved with 2nd order methods. Our main result shows how to construct bilinear formulations, for a general class of regularizers including weighted nuclear norm penalties, that are provably equivalent to the original problems. With these formulations the regularizing function becomes twice differentiable and 2nd order methods can be applied. We show experimentally, on a number of structure from motion problems, that our approach outperforms state-of-the-art methods. |
Tasks | |
Published | 2020-03-23 |
URL | https://arxiv.org/abs/2003.10281v1 |
https://arxiv.org/pdf/2003.10281v1.pdf | |
PWC | https://paperswithcode.com/paper/accurate-optimization-of-weighted-nuclear |
Repo | |
Framework | |
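For context on what a bilinear formulation of a singular value penalty looks like, the following is the well-known variational identity for the plain (unweighted) nuclear norm. It is given here only as background; the paper's contribution concerns a more general class of regularizers, including weighted nuclear norms, and its formulations are not reproduced here. The symbols $B$, $C$, $f$, and $\mu$ are notation assumed for this sketch.

```latex
\|X\|_* \;=\; \min_{B,\,C \,:\, B C^\top = X} \; \tfrac{1}{2}\left( \|B\|_F^2 + \|C\|_F^2 \right)

\min_{X} \; f(X) + \mu \|X\|_*
\quad\Longleftrightarrow\quad
\min_{B,\,C} \; f(B C^\top) + \tfrac{\mu}{2}\left( \|B\|_F^2 + \|C\|_F^2 \right)
```

The equivalence on the second line holds when $B$ and $C$ have at least as many columns as the rank of an optimal $X$. The right-hand objective is smooth in $(B, C)$ whenever $f$ is smooth, which is what makes second-order solvers such as Levenberg-Marquardt applicable.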
Data-Driven Discovery of Coarse-Grained Equations
Title | Data-Driven Discovery of Coarse-Grained Equations |
Authors | Joseph Bakarji, Daniel M. Tartakovsky |
Abstract | We introduce a general method for learning probability density function (PDF) equations from Monte Carlo simulations of partial differential equations with uncertain (random) parameters and forcings. The method relies on sparse linear regression to discover the relevant terms in the PDF equation. Unlike other methods for equation discovery, our approach accounts for salient properties of PDF equations, such as positivity, smoothness and conservation. Our results reveal a promising direction for data-driven discovery of coarse-grained PDEs in general. |
Tasks | |
Published | 2020-01-30 |
URL | https://arxiv.org/abs/2002.00790v3 |
https://arxiv.org/pdf/2002.00790v3.pdf | |
PWC | https://paperswithcode.com/paper/data-driven-discovery-of-coarse-grained |
Repo | |
Framework | |
Overly Optimistic Prediction Results on Imbalanced Data: Flaws and Benefits of Applying Over-sampling
Title | Overly Optimistic Prediction Results on Imbalanced Data: Flaws and Benefits of Applying Over-sampling |
Authors | Gilles Vandewiele, Isabelle Dehaene, György Kovács, Lucas Sterckx, Olivier Janssens, Femke Ongenae, Femke De Backere, Filip De Turck, Kristien Roelens, Johan Decruyenaere, Sofie Van Hoecke, Thomas Demeester |
Abstract | Information extracted from electrohysterography recordings could potentially prove to be an interesting additional source of information to estimate the risk of preterm birth. Recently, a large number of studies have reported near-perfect results to distinguish between recordings of patients that will deliver term or preterm using a public resource, called the Term/Preterm Electrohysterogram database. However, we argue that these results are overly optimistic due to a methodological flaw being made. In this work, we focus on one specific type of methodological flaw: applying over-sampling before partitioning the data into mutually exclusive training and testing sets. We show how this causes the results to be biased using two artificial datasets and reproduce results of studies in which this flaw was identified. Moreover, we evaluate the actual impact of over-sampling on predictive performance, when applied prior to data partitioning, using the same methodologies of related studies, to provide a realistic view of these methodologies’ generalization capabilities. We make our research reproducible by providing all the code under an open license. |
Tasks | |
Published | 2020-01-15 |
URL | https://arxiv.org/abs/2001.06296v1 |
https://arxiv.org/pdf/2001.06296v1.pdf | |
PWC | https://paperswithcode.com/paper/overly-optimistic-prediction-results-on |
Repo | |
Framework | |
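The flaw named in the abstract above, over-sampling before partitioning into training and testing sets, can be made concrete with a small sketch. It uses random duplication of minority samples as a stand-in for whatever over-sampling technique the affected studies used, and a synthetic dataset of `(sample_id, label)` pairs. Both are assumptions for illustration, as are all function names.

```python
import random


def oversample(samples, rng):
    """Randomly duplicate minority-class samples until classes are balanced."""
    majority = [s for s in samples if s[1] == 0]
    minority = [s for s in samples if s[1] == 1]
    if not minority:
        return list(samples)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(samples) + extra


def split(samples, rng, test_fraction=0.25):
    """Shuffle and partition into (train, test)."""
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def leaked(train, test):
    """Count test samples whose original sample also appears in training."""
    train_ids = {sample_id for sample_id, _ in train}
    return sum(1 for sample_id, _ in test if sample_id in train_ids)


rng = random.Random(0)
# 95 majority samples (label 0) and 5 minority samples (label 1).
data = [(i, 0) for i in range(95)] + [(i, 1) for i in range(95, 100)]

# Flawed order: over-sample first, then split.
train_flawed, test_flawed = split(oversample(data, rng), rng)
leak_flawed = leaked(train_flawed, test_flawed)

# Sound order: split first, then over-sample the training set only.
train_sound, test_sound = split(data, rng)
train_sound = oversample(train_sound, rng)
leak_sound = leaked(train_sound, test_sound)

print("test samples with a copy in training (flawed order):", leak_flawed)
print("test samples with a copy in training (sound order): ", leak_sound)
```

With the flawed order, copies of the same minority sample land on both sides of the split, so a model can score well on the test set by memorising them. With the sound order the test set contains only original samples that never appear in training, so the overlap count is zero by construction.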
ELM-based Frame Synchronization in Burst-Mode Communication Systems with Nonlinear Distortion
Title | ELM-based Frame Synchronization in Burst-Mode Communication Systems with Nonlinear Distortion |
Authors | Chaojin Qing, Wang Yu, Bin Cai, Jiafan Wang, Chuan Huang |
Abstract | In burst-mode communication systems, the quality of frame synchronization (FS) at receivers significantly impacts the overall system performance. To guarantee FS, an extreme learning machine (ELM)-based synchronization method is proposed to overcome the nonlinear distortion caused by nonlinear devices or blocks. In the proposed method, a preprocessing step is first performed to capture the coarse features of the synchronization metric (SM) by using empirical knowledge. Then, an ELM-based FS network is employed to reduce the system’s nonlinear distortion and improve SMs. Experimental results indicate that, compared with existing methods, our approach could significantly reduce the error probability of FS while improving the performance in terms of robustness and generalization. |
Tasks | |
Published | 2020-02-14 |
URL | https://arxiv.org/abs/2002.07599v1 |
https://arxiv.org/pdf/2002.07599v1.pdf | |
PWC | https://paperswithcode.com/paper/elm-based-frame-synchronization-in-burst-mode |
Repo | |
Framework | |
Photo-Realistic Video Prediction on Natural Videos of Largely Changing Frames
Title | Photo-Realistic Video Prediction on Natural Videos of Largely Changing Frames |
Authors | Osamu Shouno |
Abstract | Recent advances in deep learning have significantly improved performance of video prediction. However, state-of-the-art methods still suffer from blurriness and distortions in their future predictions, especially when there are large motions between frames. To address these issues, we propose a deep residual network with the hierarchical architecture where each layer makes a prediction of future state at different spatial resolution, and these predictions of different layers are merged via top-down connections to generate future frames. We trained our model with adversarial and perceptual loss functions, and evaluated it on a natural video dataset captured by car-mounted cameras. Our model quantitatively outperforms state-of-the-art baselines in future frame prediction on video sequences of both largely and slightly changing frames. Furthermore, our model generates future frames with finer details and textures that are perceptually more realistic than the baselines, especially under fast camera motions. |
Tasks | Video Prediction |
Published | 2020-03-19 |
URL | https://arxiv.org/abs/2003.08635v1 |
https://arxiv.org/pdf/2003.08635v1.pdf | |
PWC | https://paperswithcode.com/paper/photo-realistic-video-prediction-on-natural |
Repo | |
Framework | |
Learning Attentive Pairwise Interaction for Fine-Grained Classification
Title | Learning Attentive Pairwise Interaction for Fine-Grained Classification |
Authors | Peiqin Zhuang, Yali Wang, Yu Qiao |
Abstract | Fine-grained classification is a challenging problem, due to subtle differences among highly-confused categories. Most approaches address this difficulty by learning discriminative representation of individual input image. On the other hand, humans can effectively identify contrastive clues by comparing image pairs. Inspired by this fact, this paper proposes a simple but effective Attentive Pairwise Interaction Network (API-Net), which can progressively recognize a pair of fine-grained images by interaction. Specifically, API-Net first learns a mutual feature vector to capture semantic differences in the input pair. It then compares this mutual vector with individual vectors to generate gates for each input image. These distinct gate vectors inherit mutual context on semantic differences, which allow API-Net to attentively capture contrastive clues by pairwise interaction between two images. Additionally, we train API-Net in an end-to-end manner with a score ranking regularization, which can further generalize API-Net by taking feature priorities into account. We conduct extensive experiments on five popular benchmarks in fine-grained classification. API-Net outperforms the recent SOTA methods, i.e., CUB-200-2011 (90.0%), Aircraft (93.9%), Stanford Cars (95.3%), Stanford Dogs (90.3%), and NABirds (88.1%). |
Tasks | Fine-Grained Image Classification |
Published | 2020-02-24 |
URL | https://arxiv.org/abs/2002.10191v1 |
https://arxiv.org/pdf/2002.10191v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-attentive-pairwise-interaction-for |
Repo | |
Framework | |
Multiplication fusion of sparse and collaborative-competitive representation for image classification
Title | Multiplication fusion of sparse and collaborative-competitive representation for image classification |
Authors | Zi-Qi Li, Jun Sun, Xiao-Jun Wu, He-Feng Yin |
Abstract | Representation based classification methods have become a hot research topic during the past few years, and the two most prominent approaches are sparse representation based classification (SRC) and collaborative representation based classification (CRC). CRC reveals that it is the collaborative representation rather than the sparsity that makes SRC successful. Nevertheless, the dense representation of CRC may not be discriminative which will degrade its performance for classification tasks. To alleviate this problem to some extent, we propose a new method called sparse and collaborative-competitive representation based classification (SCCRC) for image classification. Firstly, the coefficients of the test sample are obtained by SRC and CCRC, respectively. Then the fused coefficient is derived by multiplying the coefficients of SRC and CCRC. Finally, the test sample is designated to the class that has the minimum residual. Experimental results on several benchmark databases demonstrate the efficacy of our proposed SCCRC. The source code of SCCRC is accessible at https://github.com/li-zi-qi/SCCRC. |
Tasks | Image Classification, Sparse Representation-based Classification |
Published | 2020-01-20 |
URL | https://arxiv.org/abs/2001.07090v1 |
https://arxiv.org/pdf/2001.07090v1.pdf | |
PWC | https://paperswithcode.com/paper/multiplication-fusion-of-sparse-and |
Repo | |
Framework | |
Applying Recent Innovations from NLP to MOOC Student Course Trajectory Modeling
Title | Applying Recent Innovations from NLP to MOOC Student Course Trajectory Modeling |
Authors | Clarence Chen, Zachary Pardos |
Abstract | This paper presents several strategies that can improve neural network-based predictive methods for MOOC student course trajectory modeling, applying multiple ideas previously applied to tackle NLP (Natural Language Processing) tasks. In particular, this paper investigates LSTM networks enhanced with two forms of regularization, along with the more recently introduced Transformer architecture. |
Tasks | |
Published | 2020-01-23 |
URL | https://arxiv.org/abs/2001.08333v1 |
https://arxiv.org/pdf/2001.08333v1.pdf | |
PWC | https://paperswithcode.com/paper/applying-recent-innovations-from-nlp-to-mooc |
Repo | |
Framework | |
Exploring Spatial-Temporal Multi-Frequency Analysis for High-Fidelity and Temporal-Consistency Video Prediction
Title | Exploring Spatial-Temporal Multi-Frequency Analysis for High-Fidelity and Temporal-Consistency Video Prediction |
Authors | Beibei Jin, Yu Hu, Qiankun Tang, Jingyu Niu, Zhiping Shi, Yinhe Han, Xiaowei Li |
Abstract | Video prediction is a pixel-wise dense prediction task to infer future frames based on past frames. Missing appearance details and motion blur are still two major problems for current predictive models, which lead to image distortion and temporal inconsistency. In this paper, we point out the necessity of exploring multi-frequency analysis to deal with the two problems. Inspired by the frequency band decomposition characteristic of Human Vision System (HVS), we propose a video prediction network based on multi-level wavelet analysis to deal with spatial and temporal information in a unified manner. Specifically, the multi-level spatial discrete wavelet transform decomposes each video frame into anisotropic sub-bands with multiple frequencies, helping to enrich structural information and reserve fine details. On the other hand, multi-level temporal discrete wavelet transform which operates on time axis decomposes the frame sequence into sub-band groups of different frequencies to accurately capture multi-frequency motions under a fixed frame rate. Extensive experiments on diverse datasets demonstrate that our model shows significant improvements on fidelity and temporal consistency over state-of-the-art works. |
Tasks | Video Prediction |
Published | 2020-02-23 |
URL | https://arxiv.org/abs/2002.09905v1 |
https://arxiv.org/pdf/2002.09905v1.pdf | |
PWC | https://paperswithcode.com/paper/exploring-spatial-temporal-multi-frequency |
Repo | |
Framework | |
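The temporal decomposition described in the abstract above, splitting a frame sequence into sub-bands of different frequencies with a multi-level discrete wavelet transform, can be illustrated on a single pixel's intensity over time. The sketch below uses the Haar wavelet, which is an assumption: the abstract does not say which wavelet the authors use, and the function names and the toy signal are hypothetical.

```python
import math

SCALE = 1 / math.sqrt(2)


def haar_step(signal):
    """One Haar level: pairwise averages (low band) and differences (high band)."""
    approx = [(signal[i] + signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Invert one Haar level."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * SCALE, (a - d) * SCALE])
    return out


def haar_decompose(signal, levels):
    """Multi-level decomposition; len(signal) must be divisible by 2**levels."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)  # finest (highest-frequency) band first
    return approx, details


def haar_reconstruct(approx, details):
    """Rebuild the signal from the coarsest band outward."""
    for detail in reversed(details):
        approx = haar_inverse_step(approx, detail)
    return approx


# Intensity of one pixel across 8 frames: slow change first, then a fast jump.
pixel_over_time = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 9.0, 9.0]
approx, details = haar_decompose(pixel_over_time, levels=2)
restored = haar_reconstruct(approx, details)

print("coarse band:", approx)
print("detail bands (fine to coarse):", details)
```

Each detail band isolates changes at one temporal scale, and the transform is exactly invertible, so no information is lost by working in the sub-band domain. A spatial transform applies the same step along image rows and columns instead of along the time axis.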
VisuoSpatial Foresight for Multi-Step, Multi-Task Fabric Manipulation
Title | VisuoSpatial Foresight for Multi-Step, Multi-Task Fabric Manipulation |
Authors | Ryan Hoque, Daniel Seita, Ashwin Balakrishna, Aditya Ganapathi, Ajay Kumar Tanwani, Nawid Jamali, Katsu Yamane, Soshi Iba, Ken Goldberg |
Abstract | Robotic fabric manipulation has applications in cloth and cable management, senior care, surgery and more. Existing fabric manipulation techniques, however, are designed for specific tasks, making it difficult to generalize across different but related tasks. We address this problem by extending the recently proposed Visual Foresight framework to learn fabric dynamics, which can be efficiently reused to accomplish a variety of different fabric manipulation tasks with a single goal-conditioned policy. We introduce VisuoSpatial Foresight (VSF), which extends prior work by learning visual dynamics on domain randomized RGB images and depth maps simultaneously and completely in simulation. We experimentally evaluate VSF on multi-step fabric smoothing and folding tasks both in simulation and on the da Vinci Research Kit (dVRK) surgical robot without any demonstrations at train or test time. Furthermore, we find that leveraging depth significantly improves performance for cloth manipulation tasks, and results suggest that leveraging RGBD data for video prediction and planning yields an 80% improvement in fabric folding success rate over pure RGB data. Supplementary material is available at https://sites.google.com/view/fabric-vsf/. |
Tasks | Video Prediction |
Published | 2020-03-19 |
URL | https://arxiv.org/abs/2003.09044v1 |
https://arxiv.org/pdf/2003.09044v1.pdf | |
PWC | https://paperswithcode.com/paper/visuospatial-foresight-for-multi-step-multi |
Repo | |
Framework | |