Paper Group ANR 956
Hyperbolic Deep Learning for Chinese Natural Language Understanding. Domain Confusion with Self Ensembling for Unsupervised Adaptation. Social Media Analysis For Organizations: US Northeastern Public And State Libraries Case Study. RIFT: Multi-modal Image Matching Based on Radiation-invariant Feature Transform. Semi-supervised classification by reaching consensus among modalities. Negative Binomial Matrix Factorization for Recommender Systems. TGANv2: Efficient Training of Large Models for Video Generation with Multiple Subsampling Layers. Fully Automatic Segmentation of Sublingual Veins from Retrained U-Net Model for Few Near Infrared Images. Kinematic Morphing Networks for Manipulation Skill Transfer. 3D Human Pose Estimation with Relational Networks. A Unified Framework for Training Neural Networks. PM-GANs: Discriminative Representation Learning for Action Recognition Using Partial-modalities. Tight Query Complexity Lower Bounds for PCA via Finite Sample Deformed Wigner Law. Scalable Simple Linear Iterative Clustering (SSLIC) Using a Generic and Parallel Approach. Explainable Recommendation via Multi-Task Learning in Opinionated Text Data.
Hyperbolic Deep Learning for Chinese Natural Language Understanding
Title | Hyperbolic Deep Learning for Chinese Natural Language Understanding |
Authors | Marko Valentin Micic, Hugo Chu |
Abstract | Recently hyperbolic geometry has proven to be effective in building embeddings that encode hierarchical and entailment information. This makes it particularly suited to modelling the complex asymmetrical relationships between Chinese characters and words. In this paper we first train a large-scale hyperboloid skip-gram model on a Chinese corpus, then apply the character embeddings to a downstream hyperbolic Transformer model derived from the principles of gyrovector spaces for the Poincaré disk model. In our experiments the character-based Transformer outperformed its word-based Euclidean equivalent. To the best of our knowledge, this is the first time in Chinese NLP that a character-based model has outperformed its word-based counterpart, allowing the circumvention of the challenging and domain-dependent task of Chinese Word Segmentation (CWS). |
Tasks | Chinese Word Segmentation |
Published | 2018-12-11 |
URL | http://arxiv.org/abs/1812.10408v1 |
http://arxiv.org/pdf/1812.10408v1.pdf | |
PWC | https://paperswithcode.com/paper/hyperbolic-deep-learning-for-chinese-natural |
Repo | |
Framework | |
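For readers unfamiliar with the geometry behind this entry, the building blocks of Poincaré-ball models are Möbius addition (the gyrovector-space analogue of vector addition) and the hyperbolic geodesic distance. The sketch below implements these two standard formulas in NumPy for the unit ball; it is a minimal illustration of the geometry, not the authors' hyperboloid skip-gram or Transformer code.

```python
import numpy as np

def mobius_add(x, y):
    """Mobius addition on the Poincare unit ball (curvature c = 1)."""
    xy, x2, y2 = np.dot(x, y), np.dot(x, x), np.dot(y, y)
    num = (1 + 2 * xy + y2) * x + (1 - x2) * y
    den = 1 + 2 * xy + x2 * y2
    return num / den

def poincare_dist(x, y):
    """Geodesic distance between two points strictly inside the unit ball."""
    diff2 = np.dot(x - y, x - y)
    denom = (1 - np.dot(x, x)) * (1 - np.dot(y, y))
    return np.arccosh(1 + 2 * diff2 / denom)

u, v = np.array([0.1, 0.2]), np.array([-0.3, 0.05])
print(mobius_add(u, v), poincare_dist(u, v))
```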
Domain Confusion with Self Ensembling for Unsupervised Adaptation
Title | Domain Confusion with Self Ensembling for Unsupervised Adaptation |
Authors | Jiawei Wang, Zhaoshui He, Chengjian Feng, Zhouping Zhu, Qinzhuang Lin, Jun Lv, Shengli Xie |
Abstract | Data collection and annotation are time-consuming in machine learning, especially for large-scale problems. A common approach to this problem is to transfer knowledge from a related labeled domain to the target one. There are two popular ways to achieve this goal: adversarial learning and self-training. In this article, we first analyze the training instability problem and the mistaken confusion issue in the adversarial learning process. Then, inspired by domain confusion and self-ensembling methods, we propose a combined model that learns a jointly feature- and class-invariant representation, namely Domain Confusion with Self Ensembling (DCSE). Experiments verify that the proposed approach offers better performance than existing methods on a variety of unsupervised domain adaptation benchmarks. |
Tasks | Domain Adaptation, Unsupervised Domain Adaptation |
Published | 2018-10-10 |
URL | http://arxiv.org/abs/1810.04472v1 |
http://arxiv.org/pdf/1810.04472v1.pdf | |
PWC | https://paperswithcode.com/paper/domain-confusion-with-self-ensembling-for |
Repo | |
Framework | |
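The abstract combines two well-known ingredients: an adversarial domain-confusion term and a self-ensembling (EMA teacher) consistency term. The sketch below shows one plausible way to assemble such an objective in PyTorch; the `student`, `teacher`, and `disc` interfaces and the equal loss weighting are assumptions for illustration, not the DCSE implementation.

```python
import torch
import torch.nn.functional as F

def dcse_losses(student, teacher, disc, x_src, y_src, x_tgt):
    """student/teacher return (features, logits); disc maps features to a domain logit."""
    f_src, logits_src = student(x_src)
    f_tgt, logits_tgt = student(x_tgt)

    # Supervised classification loss on the labeled source domain.
    cls = F.cross_entropy(logits_src, y_src)

    # Domain confusion: push the domain discriminator toward maximal uncertainty (0.5)
    # so that the extracted features become domain-invariant.
    d_src, d_tgt = disc(f_src), disc(f_tgt)
    conf = (F.binary_cross_entropy_with_logits(d_src, torch.full_like(d_src, 0.5)) +
            F.binary_cross_entropy_with_logits(d_tgt, torch.full_like(d_tgt, 0.5)))

    # Self-ensembling: consistency between student and EMA-teacher predictions on target data.
    with torch.no_grad():
        _, t_logits = teacher(x_tgt)
    cons = F.mse_loss(F.softmax(logits_tgt, dim=1), F.softmax(t_logits, dim=1))

    return cls + conf + cons

@torch.no_grad()
def ema_update(teacher, student, alpha=0.999):
    # Exponential moving average of the student weights; call after each optimizer step.
    for p_t, p_s in zip(teacher.parameters(), student.parameters()):
        p_t.mul_(alpha).add_(p_s, alpha=1 - alpha)
```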
Social Media Analysis For Organizations: US Northeastern Public And State Libraries Case Study
Title | Social Media Analysis For Organizations: US Northeastern Public And State Libraries Case Study |
Authors | Matthew Collins, Amir Karami |
Abstract | Social networking sites such as Twitter have provided a great opportunity for organizations such as public libraries to disseminate information for public relations purposes. However, there is a need to analyze vast amounts of social media data. This study presents a computational approach to explore the content of tweets posted by nine public libraries in the northeastern United States of America. In December 2017, this study extracted more than 19,000 tweets from the Twitter accounts of seven state libraries and two urban public libraries. Computational methods were applied to collect the tweets and discover meaningful themes. This paper shows how the libraries have used Twitter to represent their services and provides a starting point for different organizations to evaluate the themes of their public tweets. |
Tasks | |
Published | 2018-03-24 |
URL | http://arxiv.org/abs/1803.09133v1 |
http://arxiv.org/pdf/1803.09133v1.pdf | |
PWC | https://paperswithcode.com/paper/social-media-analysis-for-organizations-us |
Repo | |
Framework | |
RIFT: Multi-modal Image Matching Based on Radiation-invariant Feature Transform
Title | RIFT: Multi-modal Image Matching Based on Radiation-invariant Feature Transform |
Authors | Jiayuan Li, Qingwu Hu, Mingyao Ai |
Abstract | Traditional feature matching methods such as the scale-invariant feature transform (SIFT) usually use image intensity or gradient information to detect and describe feature points; however, both intensity and gradient are sensitive to nonlinear radiation distortions (NRD). To solve this problem, this paper proposes a novel feature matching algorithm that is robust to large NRD. The proposed method is called the radiation-invariant feature transform (RIFT). There are three main contributions in RIFT: first, RIFT uses phase congruency (PC) instead of image intensity for feature point detection. RIFT considers both the number and repeatability of feature points, and detects both corner points and edge points on the PC map. Second, RIFT introduces a maximum index map (MIM) for feature description. The MIM is constructed from the log-Gabor convolution sequence and is much more robust to NRD than the traditional gradient map. Thus, RIFT not only largely improves the stability of feature detection, but also overcomes the limitation of gradient information for feature description. Third, RIFT analyzes the inherent influence of rotations on the values of the MIM and achieves rotation invariance. We use six different types of multi-modal image datasets to evaluate RIFT, including optical-optical, infrared-optical, synthetic aperture radar (SAR)-optical, depth-optical, map-optical, and day-night datasets. Experimental results show that RIFT is clearly superior to SIFT and SAR-SIFT. To the best of our knowledge, RIFT is the first feature matching algorithm that can achieve good performance on all the above-mentioned types of multi-modal images. The source code of RIFT and the multi-modal remote sensing image datasets are made public. |
Tasks | |
Published | 2018-04-25 |
URL | http://arxiv.org/abs/1804.09493v1 |
http://arxiv.org/pdf/1804.09493v1.pdf | |
PWC | https://paperswithcode.com/paper/rift-multi-modal-image-matching-based-on |
Repo | |
Framework | |
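The core descriptor idea, the maximum index map (MIM), records which filter in a log-Gabor convolution sequence responds most strongly at each pixel. The sketch below illustrates that idea with scikit-image's ordinary Gabor filters standing in for the paper's log-Gabor bank; it is a simplified illustration, not the RIFT pipeline (no phase-congruency detector and no rotation-invariance handling).

```python
import numpy as np
from skimage import data
from skimage.filters import gabor

def maximum_index_map(image, n_orient=6, frequency=0.15):
    """For each pixel, record which filter orientation responds most strongly."""
    responses = []
    for k in range(n_orient):
        theta = k * np.pi / n_orient
        real, imag = gabor(image, frequency=frequency, theta=theta)
        responses.append(np.hypot(real, imag))   # filter response magnitude per pixel
    stack = np.stack(responses, axis=0)          # (n_orient, H, W)
    return np.argmax(stack, axis=0)              # index of the strongest filter

mim = maximum_index_map(data.camera().astype(float))
print(mim.shape, mim.min(), mim.max())
```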
Semi-supervised classification by reaching consensus among modalities
Title | Semi-supervised classification by reaching consensus among modalities |
Authors | Zining Zhu, Jekaterina Novikova, Frank Rudzicz |
Abstract | Deep learning has demonstrated the ability to learn complex structures, but it can be restricted by the available data. Recently, Consensus Networks (CNs) were proposed to alleviate data sparsity by utilizing features from multiple modalities, but they too have been limited by the size of the labeled data. In this paper, we extend CNs to Transductive Consensus Networks (TCNs), suitable for semi-supervised learning. In TCNs, different modalities of input are compressed into latent representations, which we encourage to become indistinguishable during iterative adversarial training. To understand the two mechanisms of TCNs, consensus and classification, we put forward three variants in ablation studies on these mechanisms. To further investigate TCN models, we treat the latent representations as probability distributions and measure their similarities as negative relative Jensen-Shannon divergences. We show that a consensus state beneficial for classification requires a stable but imperfect similarity between the representations. Overall, TCNs outperform or match the best benchmark algorithms given 20 to 200 labeled samples on the Bank Marketing and DementiaBank datasets. |
Tasks | |
Published | 2018-05-23 |
URL | http://arxiv.org/abs/1805.09366v2 |
http://arxiv.org/pdf/1805.09366v2.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-classification-by-reaching |
Repo | |
Framework | |
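The analysis described above compares latent representations through negative Jensen-Shannon divergences. As a reference point, the sketch below computes the plain JS divergence between two discrete distributions and negates it as a similarity score; the paper's "relative" variant and the way latent representations are turned into distributions are not reproduced here.

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two discrete probability distributions."""
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    kl_pm = np.sum(p * np.log((p + eps) / (m + eps)))
    kl_qm = np.sum(q * np.log((q + eps) / (m + eps)))
    return 0.5 * kl_pm + 0.5 * kl_qm

def similarity(p, q):
    """Negative JS divergence used as a similarity score between representations."""
    return -js_divergence(p, q)

print(similarity(np.array([0.2, 0.5, 0.3]), np.array([0.25, 0.45, 0.3])))
```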
Negative Binomial Matrix Factorization for Recommender Systems
Title | Negative Binomial Matrix Factorization for Recommender Systems |
Authors | Olivier Gouvert, Thomas Oberlin, Cédric Févotte |
Abstract | We introduce negative binomial matrix factorization (NBMF), a matrix factorization technique specially designed for analyzing over-dispersed count data. It can be viewed as an extension of Poisson matrix factorization (PF) perturbed by a multiplicative term which models exposure. This term brings a degree of freedom for controlling the dispersion, making NBMF more robust to outliers. We show that NBMF allows us to skip traditional pre-processing stages, such as binarization, which lead to a loss of information. Two estimation approaches are presented: maximum likelihood and variational Bayes inference. We test our model on a recommendation task and show its ability to predict user tastes with better precision than PF. |
Tasks | Recommendation Systems |
Published | 2018-01-05 |
URL | http://arxiv.org/abs/1801.01708v1 |
http://arxiv.org/pdf/1801.01708v1.pdf | |
PWC | https://paperswithcode.com/paper/negative-binomial-matrix-factorization-for |
Repo | |
Framework | |
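As a concrete anchor for the model class, the sketch below evaluates the log-likelihood of a count matrix under a negative binomial model with mean W @ H and a shared dispersion (shape) parameter, using SciPy's parameterization. It omits the paper's exposure term and both of its estimation procedures (maximum likelihood and variational Bayes); the toy data and the choice r = 1 are only for illustration.

```python
import numpy as np
from scipy.stats import nbinom

def nbmf_loglik(Y, W, H, r=1.0):
    """Negative binomial log-likelihood of counts Y with mean W @ H and shape r."""
    mu = W @ H
    p = r / (r + mu)                 # scipy's success-probability parameterization
    return nbinom.logpmf(Y, r, p).sum()

rng = np.random.default_rng(0)
U, I, K = 20, 30, 5
W, H = rng.gamma(1.0, 1.0, (U, K)), rng.gamma(1.0, 1.0, (K, I))
Y = rng.poisson(W @ H)               # toy counts, just to exercise the function
print(nbmf_loglik(Y, W, H))
```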
TGANv2: Efficient Training of Large Models for Video Generation with Multiple Subsampling Layers
Title | TGANv2: Efficient Training of Large Models for Video Generation with Multiple Subsampling Layers |
Authors | Masaki Saito, Shunta Saito |
Abstract | In this paper, we propose a novel method to efficiently train a Generative Adversarial Network (GAN) on high-dimensional samples. The key idea is to introduce a differentiable subsampling layer which appropriately reduces the dimensionality of intermediate feature maps in the generator during training. In general, generators require large memory and computational costs in the latter stages of the network as the feature maps become larger, even though the latter stages have relatively fewer parameters than the earlier stages. This makes training large models for video generation difficult under limited computational resources. We solve this problem by introducing a method that gradually reduces the dimensionality of feature maps in the generator with multiple subsampling layers. We also propose a network (Temporal GAN v2) with such layers and perform video generation experiments. As a consequence, our model trained on the UCF101 dataset at $192 \times 192$ pixels achieves an Inception Score (IS) of 24.34, which shows a significant improvement over the previous state-of-the-art score of 14.56. |
Tasks | Video Generation |
Published | 2018-11-22 |
URL | http://arxiv.org/abs/1811.09245v1 |
http://arxiv.org/pdf/1811.09245v1.pdf | |
PWC | https://paperswithcode.com/paper/tganv2-efficient-training-of-large-models-for |
Repo | |
Framework | |
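To make the subsampling idea concrete, the sketch below shows a toy PyTorch layer that keeps every stride-th frame of an (N, C, T, H, W) video tensor during training and passes the full tensor through at inference; slicing is differentiable with respect to the retained frames. TGANv2's actual layers subsample at multiple levels of the generator, which this simplified sketch does not reproduce.

```python
import torch
import torch.nn as nn

class FrameSubsample(nn.Module):
    """Keep every `stride`-th frame of a video feature map during training only."""
    def __init__(self, stride=2):
        super().__init__()
        self.stride = stride

    def forward(self, x):                                   # x: (N, C, T, H, W)
        if self.training:
            offset = int(torch.randint(self.stride, (1,)))  # random temporal phase each step
            return x[:, :, offset::self.stride]
        return x                                            # full resolution at test time

x = torch.randn(2, 8, 16, 32, 32, requires_grad=True)
y = FrameSubsample(stride=2)(x)
print(y.shape)   # torch.Size([2, 8, 8, 32, 32])
```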
Fully Automatic Segmentation of Sublingual Veins from Retrained U-Net Model for Few Near Infrared Images
Title | Fully Automatic Segmentation of Sublingual Veins from Retrained U-Net Model for Few Near Infrared Images |
Authors | Tingxiao Yang, Yuichiro Yoshimura, Akira Morita, Takao Namiki, Toshiya Nakaguchi |
Abstract | The sublingual vein is commonly used to diagnose health status, and the width of the main sublingual veins gives information about blood circulation. Therefore, it is necessary to segment the main sublingual veins from the tongue automatically. In general, datasets in the medical field are small, which is a challenge for training deep learning models. In order to train a model with a small dataset, the proposed method for automatically segmenting the sublingual veins is to re-train a U-Net model with different sets of a limited number of labels for the same training images. With pre-knowledge of the segmentation, the loss of the trained model converges more easily. To further improve segmentation performance, a novel data augmentation strategy was utilized: the operation that masks the output of the model with the input was randomly switched on or off in each training step. This approach forces the model to learn contrast invariance and avoids overfitting. Images in the dataset were taken with the developed device using eight near-infrared LEDs. The final segmentation results were evaluated on the validation dataset with the IoU metric. |
Tasks | Data Augmentation |
Published | 2018-12-22 |
URL | http://arxiv.org/abs/1812.09477v1 |
http://arxiv.org/pdf/1812.09477v1.pdf | |
PWC | https://paperswithcode.com/paper/fully-automatic-segmentation-of-sublingual |
Repo | |
Framework | |
Kinematic Morphing Networks for Manipulation Skill Transfer
Title | Kinematic Morphing Networks for Manipulation Skill Transfer |
Authors | Peter Englert, Marc Toussaint |
Abstract | The transfer of a robot skill between different geometric environments is non-trivial since a wide variety of environments exists, sensor observations as well as robot motions are high-dimensional, and the environment might only be partially observed. We consider the problem of extracting a low-dimensional description of the manipulated environment in the form of a kinematic model. This allows us to transfer a skill by defining a policy on a prototype model and morphing the observed environment to this prototype. A deep neural network is used to map depth-image observations of the environment to morphing parameters, which include transformation and configuration parameters of the prototype model. The concatenation property of affine transformations and the ability to convert point clouds to depth images make it possible to apply the network in an iterative manner. The network is trained on data generated in a simulator and on augmented data that is created using network predictions. The algorithm is evaluated on different tasks, where it is shown that iterative predictions lead to higher accuracy than one-step predictions. |
Tasks | |
Published | 2018-03-05 |
URL | http://arxiv.org/abs/1803.01777v1 |
http://arxiv.org/pdf/1803.01777v1.pdf | |
PWC | https://paperswithcode.com/paper/kinematic-morphing-networks-for-manipulation |
Repo | |
Framework | |
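The iterative use of the network relies on two facts stated in the abstract: affine transformations concatenate, and point clouds can be re-rendered as depth images. A schematic loop under those assumptions, with placeholder `net` and `render` callables that are not the paper's code, could look like this:

```python
import numpy as np

def iterative_morphing(net, render, depth_obs, n_iters=3):
    """Predict morphing parameters, re-render under the accumulated transform, refine."""
    T = np.eye(4)                         # accumulated affine transform
    for _ in range(n_iters):
        delta = net(depth_obs)            # predicted 4x4 affine increment
        T = delta @ T                     # concatenation property of affine transforms
        depth_obs = render(T)             # point cloud -> depth image under the new transform
    return T

# Dummy callables just to show the calling convention.
dummy_net = lambda obs: np.eye(4)
dummy_render = lambda T: np.zeros((64, 64))
print(iterative_morphing(dummy_net, dummy_render, np.zeros((64, 64))))
```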
3D Human Pose Estimation with Relational Networks
Title | 3D Human Pose Estimation with Relational Networks |
Authors | Sungheon Park, Nojun Kwak |
Abstract | In this paper, we propose a novel 3D human pose estimation algorithm from a single image based on neural networks. We adopt the structure of relational networks in order to capture the relations among different body parts. In our method, each pair of different body parts generates features, and the average of the features from all the pairs is used for 3D pose estimation. In addition, we propose a dropout method that can be used in relational modules, which inherently imposes robustness to occlusions. The proposed network achieves state-of-the-art performance for 3D pose estimation on the Human3.6M dataset, and it effectively produces plausible results even in the presence of missing joints. |
Tasks | 3D Human Pose Estimation, 3D Pose Estimation, Pose Estimation |
Published | 2018-05-23 |
URL | http://arxiv.org/abs/1805.08961v2 |
http://arxiv.org/pdf/1805.08961v2.pdf | |
PWC | https://paperswithcode.com/paper/3d-human-pose-estimation-with-relational |
Repo | |
Framework | |
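The relational module described above builds a feature for every pair of body parts and averages the pair features. A minimal PyTorch sketch of that mechanism, with illustrative layer sizes and without the proposed relational dropout, is shown below.

```python
import torch
import torch.nn as nn
from itertools import combinations

class PairwiseRelation(nn.Module):
    """Every pair of body-part features goes through a shared MLP; pair outputs are averaged."""
    def __init__(self, part_dim=64, hidden=128, out_dim=128):
        super().__init__()
        self.g = nn.Sequential(nn.Linear(2 * part_dim, hidden), nn.ReLU(),
                               nn.Linear(hidden, out_dim))

    def forward(self, parts):              # parts: (batch, n_parts, part_dim)
        feats = [self.g(torch.cat([parts[:, i], parts[:, j]], dim=1))
                 for i, j in combinations(range(parts.size(1)), 2)]
        return torch.stack(feats, dim=0).mean(dim=0)

x = torch.randn(4, 16, 64)                 # 16 body parts, 64-d features each
print(PairwiseRelation()(x).shape)          # torch.Size([4, 128])
```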
A Unified Framework for Training Neural Networks
Title | A Unified Framework for Training Neural Networks |
Authors | Hadi Ghauch, Hossein Shokri-Ghadikolaei, Carlo Fischione, Mikael Skoglund |
Abstract | The lack of mathematical tractability of Deep Neural Networks (DNNs) has hindered progress towards having a unified convergence analysis of training algorithms in the general setting. We propose a unified optimization framework for training different types of DNNs, and establish its convergence for arbitrary loss, activation, and regularization functions, assumed to be smooth. We show that the framework generalizes well-known first- and second-order training methods, and thus allows us to show the convergence of these methods for various DNN architectures and learning tasks, as a special case of our approach. We discuss some of its applications in training various DNN architectures (e.g., feed-forward, convolutional, linear networks) on regression and classification tasks. |
Tasks | |
Published | 2018-05-23 |
URL | http://arxiv.org/abs/1805.09214v1 |
http://arxiv.org/pdf/1805.09214v1.pdf | |
PWC | https://paperswithcode.com/paper/a-unified-framework-for-training-neural |
Repo | |
Framework | |
PM-GANs: Discriminative Representation Learning for Action Recognition Using Partial-modalities
Title | PM-GANs: Discriminative Representation Learning for Action Recognition Using Partial-modalities |
Authors | Lan Wang, Chenqiang Gao, Luyu Yang, Yue Zhao, Wangmeng Zuo, Deyu Meng |
Abstract | Data of different modalities generally convey complementary but heterogeneous information, and a more discriminative representation is often preferred by combining multiple data modalities such as RGB and infrared features. However, in reality, obtaining both data channels is challenging due to many limitations. For example, RGB surveillance cameras are often restricted from private spaces, which conflicts with the need for abnormal activity detection for personal security. As a result, using partial data channels to build a full representation of multi-modalities is clearly desired. In this paper, we propose novel Partial-modal Generative Adversarial Networks (PM-GANs) that learn a full-modal representation using data from only partial modalities. The full representation is achieved by a generated representation in place of the missing data channel. Extensive experiments are conducted to verify the performance of our proposed method on action recognition, compared with four state-of-the-art methods. Meanwhile, a new Infrared-Visible dataset for action recognition is introduced, which will be the first publicly available action dataset that contains paired infrared and visible spectrum data. |
Tasks | Action Detection, Activity Detection, Representation Learning, Temporal Action Localization |
Published | 2018-04-17 |
URL | http://arxiv.org/abs/1804.06248v1 |
http://arxiv.org/pdf/1804.06248v1.pdf | |
PWC | https://paperswithcode.com/paper/pm-gans-discriminative-representation |
Repo | |
Framework | |
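The key mechanism is a generator that produces a stand-in representation for the missing data channel, trained adversarially against real features of that channel. The sketch below shows that idea at the feature level with made-up dimensions and layer sizes; it is not the PM-GANs architecture or training schedule.

```python
import torch
import torch.nn as nn

FEAT = 512  # illustrative feature dimensionality

gen = nn.Sequential(nn.Linear(FEAT, 1024), nn.ReLU(), nn.Linear(1024, FEAT))
disc = nn.Sequential(nn.Linear(FEAT, 256), nn.ReLU(), nn.Linear(256, 1))

vis_feat = torch.randn(8, FEAT)   # features from the available (visible) modality
ir_feat = torch.randn(8, FEAT)    # real infrared features, seen only at training time

fake_ir = gen(vis_feat)           # generated stand-in for the missing modality
d_real, d_fake = disc(ir_feat), disc(fake_ir)

bce = nn.BCEWithLogitsLoss()
d_loss = bce(d_real, torch.ones_like(d_real)) + bce(d_fake.detach(), torch.zeros_like(d_fake))
g_loss = bce(d_fake, torch.ones_like(d_fake))
print(float(d_loss), float(g_loss))
```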
Tight Query Complexity Lower Bounds for PCA via Finite Sample Deformed Wigner Law
Title | Tight Query Complexity Lower Bounds for PCA via Finite Sample Deformed Wigner Law |
Authors | Max Simchowitz, Ahmed El Alaoui, Benjamin Recht |
Abstract | We prove a \emph{query complexity} lower bound for approximating the top $r$-dimensional eigenspace of a matrix. We consider an oracle model where, given a symmetric matrix $\mathbf{M} \in \mathbb{R}^{d \times d}$, an algorithm $\mathsf{Alg}$ is allowed to make $\mathsf{T}$ exact queries of the form $\mathsf{w}^{(i)} = \mathbf{M} \mathsf{v}^{(i)}$ for $i \in \{1,\dots,\mathsf{T}\}$, where $\mathsf{v}^{(i)}$ is drawn from a distribution which depends arbitrarily on the past queries and measurements $\{\mathsf{v}^{(j)},\mathsf{w}^{(j)}\}_{1 \le j \le i-1}$. We show that for every $\mathtt{gap} \in (0,1/2]$, there exists a distribution over matrices $\mathbf{M}$ for which 1) $\mathrm{gap}_r(\mathbf{M}) = \Omega(\mathtt{gap})$ (where $\mathrm{gap}_r(\mathbf{M})$ is the normalized gap between the $r$-th and $(r+1)$-st largest-magnitude eigenvalues of $\mathbf{M}$), and 2) any algorithm $\mathsf{Alg}$ which takes fewer than $\mathrm{const} \times \frac{r \log d}{\sqrt{\mathtt{gap}}}$ queries fails (with overwhelming probability) to identify a matrix $\widehat{\mathsf{V}} \in \mathbb{R}^{d \times r}$ with orthonormal columns for which $\langle \widehat{\mathsf{V}}, \mathbf{M} \widehat{\mathsf{V}}\rangle \ge (1 - \mathrm{const} \times \mathtt{gap})\sum_{i=1}^r \lambda_i(\mathbf{M})$. Our bound requires only that $d$ is a small polynomial in $1/\mathtt{gap}$ and $r$, and matches the upper bounds of Musco and Musco ‘15. Moreover, it establishes a strict separation between convex optimization and \emph{randomized}, “strict-saddle” non-convex optimization, of which PCA is a canonical example: in the former, first-order methods can have dimension-free iteration complexity, whereas in PCA, the iteration complexity of gradient-based methods must necessarily grow with the dimension. |
Tasks | |
Published | 2018-04-04 |
URL | http://arxiv.org/abs/1804.01221v1 |
http://arxiv.org/pdf/1804.01221v1.pdf | |
PWC | https://paperswithcode.com/paper/tight-query-complexity-lower-bounds-for-pca |
Repo | |
Framework | |
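The oracle model in the abstract charges one query per exact matrix-vector product. The sketch below wraps a symmetric matrix in a query-counting oracle and runs a block power method, the kind of algorithm whose query count the lower bound constrains; it illustrates only the access model, not the proof or its hard instance.

```python
import numpy as np

class MatvecOracle:
    """Counts exact matrix-vector queries w = M v, the access model in the abstract."""
    def __init__(self, M):
        self.M, self.queries = M, 0

    def __call__(self, V):
        self.queries += V.shape[1]          # one query per column of V
        return self.M @ V

def block_power(oracle, d, r, n_iters):
    """Simple block power method for the top-r eigenspace."""
    V = np.linalg.qr(np.random.randn(d, r))[0]
    for _ in range(n_iters):
        V = np.linalg.qr(oracle(V))[0]
    return V

d, r = 200, 3
M = np.random.randn(d, d); M = (M + M.T) / 2
oracle = MatvecOracle(M)
V = block_power(oracle, d, r, n_iters=30)
print(oracle.queries, np.trace(V.T @ M @ V))   # queries used, captured "energy"
```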
Scalable Simple Linear Iterative Clustering (SSLIC) Using a Generic and Parallel Approach
Title | Scalable Simple Linear Iterative Clustering (SSLIC) Using a Generic and Parallel Approach |
Authors | Bradley C. Lowekamp, David T. Chen, Ziv Yaniv, Terry S. Yoo |
Abstract | Superpixel algorithms have proven to be a useful initial step for segmentation and subsequent processing of images, reducing computational complexity by replacing the use of expensive per-pixel primitives with a higher-level abstraction, superpixels. They have been successfully applied both in traditional image analysis and in deep learning based approaches. In this work, we present an implementation of the simple linear iterative clustering (SLIC) superpixel algorithm that has been generalized for n-dimensional scalar and multi-channel images. Additionally, the standard iterative implementation is replaced by a parallel, multi-threaded one. We describe the implementation details and analyze its scalability using a strong scaling formulation. Quantitative evaluation is performed using a 3D image, the Visible Human cryosection dataset, and a 2D image from the same dataset. Results show good scalability, with runtime gains even when the number of threads exceeds the physical number of available cores (hyperthreading). |
Tasks | |
Published | 2018-06-22 |
URL | http://arxiv.org/abs/1806.08741v2 |
http://arxiv.org/pdf/1806.08741v2.pdf | |
PWC | https://paperswithcode.com/paper/scalable-simple-linear-iterative-clustering |
Repo | |
Framework | |
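The paper's contribution is a generalized, parallel SLIC implementation; as a quick reminder of what the superpixel abstraction produces, the snippet below runs scikit-image's standard single-threaded `slic` on a 2D test image. The parameter values are arbitrary and unrelated to the paper's experiments.

```python
from skimage import data, segmentation

image = data.astronaut()                   # 2D RGB test image
labels = segmentation.slic(image, n_segments=400, compactness=10, start_label=1)
print(labels.shape, labels.max())          # one superpixel id per pixel
```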
Explainable Recommendation via Multi-Task Learning in Opinionated Text Data
Title | Explainable Recommendation via Multi-Task Learning in Opinionated Text Data |
Authors | Nan Wang, Hongning Wang, Yiling Jia, Yue Yin |
Abstract | Explaining automatically generated recommendations allows users to make more informed and accurate decisions about which results to utilize, and therefore improves their satisfaction. In this work, we develop a multi-task learning solution for explainable recommendation. Two companion learning tasks, user preference modeling for recommendation and opinionated content modeling for explanation, are integrated via a joint tensor factorization. As a result, the algorithm predicts not only a user’s preference over a list of items, i.e., recommendation, but also how the user would appreciate a particular item at the feature level, i.e., opinionated textual explanation. Extensive experiments on two large collections of Amazon and Yelp reviews confirm the effectiveness of our solution in both recommendation and explanation tasks, compared with several existing recommendation algorithms. Our extensive user study clearly demonstrates the practical value of the explainable recommendations generated by our algorithm. |
Tasks | Multi-Task Learning |
Published | 2018-06-10 |
URL | http://arxiv.org/abs/1806.03568v1 |
http://arxiv.org/pdf/1806.03568v1.pdf | |
PWC | https://paperswithcode.com/paper/explainable-recommendation-via-multi-task |
Repo | |
Framework | |
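The paper integrates its two learning tasks via a joint tensor factorization. The sketch below shows only the basic building block, a plain CP (PARAFAC) decomposition of a toy tensor using TensorLy; the coupling between the preference and opinion tasks is not reproduced, and the user × item × opinion-feature layout here is an assumption for illustration.

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

rng = np.random.default_rng(0)
X = tl.tensor(rng.random((30, 50, 10)))        # toy users x items x opinion features
cp = parafac(X, rank=5, n_iter_max=100)        # rank-5 CP decomposition
U, V, F = cp.factors                           # user, item and feature factor matrices
print(U.shape, V.shape, F.shape)
```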