Paper Group ANR 1212
Remembering Winter Was Coming: Character-Oriented Video Summaries of TV Series
Title | Remembering Winter Was Coming: Character-Oriented Video Summaries of TV Series |
Authors | Xavier Bost, Serigne Gueye, Vincent Labatut, Martha Larson, Georges Linarès, Damien Malinas, Raphaël Roth |
Abstract | Today’s popular TV series tend to develop continuous, complex plots spanning several seasons, but are often viewed in controlled and discontinuous conditions. Consequently, most viewers need to be re-immersed in the story before watching a new season. Although discussions with friends and family can help, we observe that most viewers make extensive use of summaries to re-engage with the plot. Automatic generation of video summaries of TV series’ complex stories requires, first, modeling the dynamics of the plot and, second, extracting relevant sequences. In this paper, we tackle plot modeling by considering the social network of interactions between the characters involved in the narrative: substantial, durable changes in a major character’s social environment suggest a new development relevant for the summary. Once identified, these major stages in each character’s storyline can be used as a basis for completing the summary with related sequences. Our algorithm combines such social network analysis with filmmaking grammar to automatically generate character-oriented video summaries of TV series from partially annotated data. We carry out evaluation with a user study in a real-world scenario: a large sample of viewers were asked to rank video summaries centered on five characters of the popular TV series Game of Thrones, a few weeks before the new, sixth season was released. Our results reveal the ability of character-oriented summaries to re-engage viewers in television series and confirm the contributions of modeling the plot content and exploiting stylistic patterns to identify salient sequences. |
Tasks | |
Published | 2019-09-05 |
URL | https://arxiv.org/abs/1909.02423v3 |
https://arxiv.org/pdf/1909.02423v3.pdf | |
PWC | https://paperswithcode.com/paper/remembering-winter-was-coming |
Repo | |
Framework | |
Conditional Denoising of Remote Sensing Imagery Using Cycle-Consistent Deep Generative Models
Title | Conditional Denoising of Remote Sensing Imagery Using Cycle-Consistent Deep Generative Models |
Authors | Michael Zotov, Jevgenij Gamper |
Abstract | The potential of using remote sensing imagery for environmental modelling and for providing real time support to humanitarian operations such as hurricane relief efforts is well established. These applications are substantially affected by missing data due to non-structural noise such as clouds, shadows and other atmospheric effects. In this work we probe the potential of applying a cycle-consistent latent variable deep generative model (DGM) for denoising cloudy Sentinel-2 observations conditioned on the information in cloud penetrating bands. We adapt the recently proposed Fréchet Distance metric to remote sensing images for evaluating performance of the generator, demonstrate the potential of DGMs for conditional denoising, and discuss future directions as well as the limitations of DGMs in Earth science and humanitarian applications. |
Tasks | Denoising |
Published | 2019-10-31 |
URL | https://arxiv.org/abs/1910.14567v1 |
https://arxiv.org/pdf/1910.14567v1.pdf | |
PWC | https://paperswithcode.com/paper/conditional-denoising-of-remote-sensing |
Repo | |
Framework | |
Consistent Classification with Generalized Metrics
Title | Consistent Classification with Generalized Metrics |
Authors | Xiaoyan Wang, Ran Li, Bowei Yan, Oluwasanmi Koyejo |
Abstract | We propose a framework for constructing and analyzing multiclass and multioutput classification metrics, i.e., involving multiple, possibly correlated multiclass labels. Our analysis reveals novel insights on the geometry of feasible confusion tensors – including necessary and sufficient conditions for the equivalence between optimizing an arbitrary non-decomposable metric and learning a weighted classifier. Further, we analyze averaging methodologies commonly used to compute multioutput metrics and characterize the corresponding Bayes optimal classifiers. We show that the plug-in estimator based on this characterization is consistent and is easily implemented as a post-processing rule. Empirical results on synthetic and benchmark datasets support the theoretical findings. |
Tasks | |
Published | 2019-08-24 |
URL | https://arxiv.org/abs/1908.09057v1 |
https://arxiv.org/pdf/1908.09057v1.pdf | |
PWC | https://paperswithcode.com/paper/consistent-classification-with-generalized |
Repo | |
Framework | |
On the Dimensionality of Embeddings for Sparse Features and Data
Title | On the Dimensionality of Embeddings for Sparse Features and Data |
Authors | Maxim Naumov |
Abstract | In this note we discuss a common misconception, namely that embeddings are always used to reduce the dimensionality of the item space. We show that when we measure dimensionality in terms of information entropy, the embedding of sparse probability distributions, which can be used to represent sparse features or data, may or may not reduce the dimensionality of the item space. However, the embeddings do provide a different and often more meaningful representation of the items for a particular task at hand. Also, we give upper bounds and more precise guidelines for choosing the embedding dimension. |
Tasks | |
Published | 2019-01-07 |
URL | http://arxiv.org/abs/1901.02103v1 |
http://arxiv.org/pdf/1901.02103v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-dimensionality-of-embeddings-for |
Repo | |
Framework | |
Model Compression with Two-stage Multi-teacher Knowledge Distillation for Web Question Answering System
Title | Model Compression with Two-stage Multi-teacher Knowledge Distillation for Web Question Answering System |
Authors | Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang |
Abstract | Deep pre-training and fine-tuning models (such as BERT and OpenAI GPT) have demonstrated excellent results in question answering areas. However, due to the sheer amount of model parameters, the inference speed of these models is very slow. How to apply these complex models to real business scenarios becomes a challenging but practical problem. Previous model compression methods usually suffer from information loss during the model compression procedure, leading to inferior models compared with the original one. To tackle this challenge, we propose a Two-stage Multi-teacher Knowledge Distillation (TMKD for short) method for web Question Answering system. We first develop a general Q&A distillation task for student model pre-training, and further fine-tune this pre-trained student model with multi-teacher knowledge distillation on downstream tasks (like Web Q&A task, MNLI, SNLI, RTE tasks from GLUE), which effectively reduces the overfitting bias in individual teacher models, and transfers more general knowledge to the student model. The experiment results show that our method can significantly outperform the baseline methods and even achieve comparable results with the original teacher models, along with substantial speedup of model inference. |
Tasks | Model Compression, Question Answering |
Published | 2019-10-18 |
URL | https://arxiv.org/abs/1910.08381v1 |
https://arxiv.org/pdf/1910.08381v1.pdf | |
PWC | https://paperswithcode.com/paper/model-compression-with-two-stage-multi |
Repo | |
Framework | |
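The multi-teacher distillation idea described in the abstract above can be illustrated with a minimal sketch. It assumes one common formulation: the student is trained on the gold label plus the averaged, temperature-softened outputs of several teachers. The averaging scheme, the function names, the `temperature` value, and the mixing weight `alpha` are all assumptions made for illustration and may differ from the TMKD method itself.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, predicted_probs):
    """Cross-entropy H(target, predicted); terms with zero target mass are skipped."""
    return -sum(t * math.log(p)
                for t, p in zip(target_probs, predicted_probs) if t > 0)

def multi_teacher_distillation_loss(student_logits, teacher_logits_list,
                                    gold_index, temperature=2.0, alpha=0.5):
    """Hypothetical loss: alpha * hard-label loss + (1 - alpha) * soft-label loss.

    The soft target is the plain average of the teachers' softened
    distributions (an assumption, not necessarily the paper's choice).
    """
    n_classes = len(student_logits)
    teacher_probs = [softmax(t, temperature) for t in teacher_logits_list]
    avg_teacher = [sum(p[k] for p in teacher_probs) / len(teacher_probs)
                   for k in range(n_classes)]
    soft_loss = cross_entropy(avg_teacher, softmax(student_logits, temperature))
    one_hot = [1.0 if k == gold_index else 0.0 for k in range(n_classes)]
    hard_loss = cross_entropy(one_hot, softmax(student_logits))
    return alpha * hard_loss + (1 - alpha) * soft_loss

# Two hypothetical teachers that both favour class 0.
teachers = [[3.0, 0.0], [2.0, 0.5]]
loss_agreeing = multi_teacher_distillation_loss([2.0, 0.0], teachers, gold_index=0)
loss_disagreeing = multi_teacher_distillation_loss([0.0, 2.0], teachers, gold_index=0)
```

A student whose logits agree with the teachers and the gold label receives a lower loss than one that disagrees, which is the signal a distillation objective of this kind provides during training.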
Feature Pyramid Hashing
Title | Feature Pyramid Hashing |
Authors | Yifan Yang, Libing Geng, Hanjiang Lai, Yan Pan, Jian Yin |
Abstract | In recent years, deep-network-based hashing has become a leading approach for large-scale image retrieval. Most deep hashing approaches use the high layer to extract powerful semantic representations. However, these methods have limited ability for fine-grained image retrieval because the semantic features extracted from the high layer have difficulty capturing subtle differences. To this end, we propose a novel two-pyramid hashing architecture to learn both the semantic information and the subtle appearance details for fine-grained image search. Inspired by the feature pyramids of convolutional neural networks, a vertical pyramid is proposed to capture the high-layer features, and a horizontal pyramid combines multiple low-layer features with structural information to capture the subtle differences. To fuse the low-level features, a novel combination strategy, called consensus fusion, is proposed to capture all subtle information from several low layers for finer retrieval. Extensive evaluation on two fine-grained datasets, CUB-200-2011 and Stanford Dogs, demonstrates that the proposed method achieves significant performance improvements over state-of-the-art baselines. |
Tasks | Image Retrieval |
Published | 2019-04-04 |
URL | http://arxiv.org/abs/1904.02325v1 |
http://arxiv.org/pdf/1904.02325v1.pdf | |
PWC | https://paperswithcode.com/paper/feature-pyramid-hashing |
Repo | |
Framework | |
Learning Mutually Local-global U-nets For High-resolution Retinal Lesion Segmentation in Fundus Images
Title | Learning Mutually Local-global U-nets For High-resolution Retinal Lesion Segmentation in Fundus Images |
Authors | Zizheng Yan, Xiaoguang Han, Changmiao Wang, Yuda Qiu, Zixiang Xiong, Shuguang Cui |
Abstract | Diabetic retinopathy is the most important complication of diabetes. Early diagnosis of retinal lesions helps to avoid visual loss or blindness. Because fundus images are high-resolution and lesion regions are small, applying existing methods, such as U-Nets, to perform segmentation on fundus photography is very challenging. Although downsampling the input images could simplify the problem, it loses detailed information. Conducting patch-level analysis helps reach fine-scale segmentation, yet usually leads to misinterpretation because context information is lacking. In this paper, we propose an efficient network that combines the two, not only being aware of local details but also making full use of contextual information. This is implemented by integrating the decoder parts of a global-level U-net and a patch-level one. The two streams are jointly optimized, ensuring that they are enhanced mutually. Experimental results demonstrate that our new framework significantly outperforms existing patch-based and global-based methods, especially when the lesion regions are scattered and small-scale. |
Tasks | Lesion Segmentation |
Published | 2019-01-18 |
URL | http://arxiv.org/abs/1901.06047v1 |
http://arxiv.org/pdf/1901.06047v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-mutually-local-global-u-nets-for |
Repo | |
Framework | |
Fine-Grained Entity Typing for Domain Independent Entity Linking
Title | Fine-Grained Entity Typing for Domain Independent Entity Linking |
Authors | Yasumasa Onoe, Greg Durrett |
Abstract | Neural entity linking models are very powerful, but run the risk of overfitting to the domain they are trained in. For this problem, a domain is characterized not just by genre of text but even by factors as specific as the particular distribution of entities, as neural models tend to overfit by memorizing properties of frequent entities in a dataset. We tackle the problem of building robust entity linking models that generalize effectively and do not rely on labeled entity linking data with a specific entity distribution. Rather than predicting entities directly, our approach models fine-grained entity properties, which can help disambiguate between even closely related entities. We derive a large inventory of types (tens of thousands) from Wikipedia categories, and use hyperlinked mentions in Wikipedia to distantly label data and train an entity typing model. At test time, we classify a mention with this typing model and use soft type predictions to link the mention to the most similar candidate entity. We evaluate our entity linking system on the CoNLL-YAGO dataset (Hoffart et al., 2011) and show that our approach outperforms prior domain-independent entity linking systems. We also test our approach in a harder setting derived from the WikilinksNED dataset (Eshel et al., 2017) where all the mention-entity pairs are unseen during test time. Results indicate that our approach generalizes better than a state-of-the-art neural model on the dataset. |
Tasks | Entity Linking, Entity Typing |
Published | 2019-09-12 |
URL | https://arxiv.org/abs/1909.05780v2 |
https://arxiv.org/pdf/1909.05780v2.pdf | |
PWC | https://paperswithcode.com/paper/fine-grained-entity-typing-for-domain |
Repo | |
Framework | |
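The linking step described in the abstract above (classify the mention into fine-grained types, then pick the most similar candidate) can be sketched as follows. The scoring rule used here, the sum of predicted type probabilities over each candidate's type set, is an assumption made for illustration, and the type names, probabilities, and candidate entities are hypothetical rather than taken from the paper.

```python
def link_mention(type_probs, candidates):
    """Return the candidate entity whose types receive the highest total
    predicted probability.

    type_probs: dict mapping type name -> predicted probability for the mention
    candidates: dict mapping entity name -> set of type names for that entity
    """
    def score(entity_types):
        # Types the model did not score contribute nothing.
        return sum(type_probs.get(t, 0.0) for t in entity_types)

    return max(candidates, key=lambda name: score(candidates[name]))

# Hypothetical soft type predictions for the mention "Washington"
# appearing in a sports article.
type_probs = {"person": 0.10, "city": 0.15, "sports_team": 0.85, "politician": 0.05}

# Hypothetical candidate entities with their type sets.
candidates = {
    "George Washington": {"person", "politician"},
    "Washington, D.C.": {"city"},
    "Washington Capitals": {"sports_team"},
}

best = link_mention(type_probs, candidates)
```

Because no entity-specific parameters are learned in this step, the same typing model can in principle be reused on a new entity distribution. Note that a plain sum favours candidates with many types, so a real system would likely need some form of normalisation.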
SuperChat: Dialogue Generation by Transfer Learning from Vision to Language using Two-dimensional Word Embedding and Pretrained ImageNet CNN Models
Title | SuperChat: Dialogue Generation by Transfer Learning from Vision to Language using Two-dimensional Word Embedding and Pretrained ImageNet CNN Models |
Authors | Baohua Sun, Lin Yang, Michael Lin, Charles Young, Jason Dong, Wenhan Zhang, Patrick Dong |
Abstract | Recent work on the Super Characters method using two-dimensional word embedding achieved state-of-the-art results in text classification tasks, showcasing the promise of this new approach. This paper borrows the idea of the Super Characters method and two-dimensional embedding, and proposes a method of generating conversational responses for open-domain dialogues. The experimental results on a public dataset show that the proposed SuperChat method generates high-quality responses. An interactive demo is ready to show at the workshop. |
Tasks | Dialogue Generation, Text Classification, Transfer Learning |
Published | 2019-05-07 |
URL | https://arxiv.org/abs/1905.05698v2 |
https://arxiv.org/pdf/1905.05698v2.pdf | |
PWC | https://paperswithcode.com/paper/190505698 |
Repo | |
Framework | |
Learning Compositional Koopman Operators for Model-Based Control
Title | Learning Compositional Koopman Operators for Model-Based Control |
Authors | Yunzhu Li, Hao He, Jiajun Wu, Dina Katabi, Antonio Torralba |
Abstract | Finding an embedding space for a linear approximation of a nonlinear dynamical system enables efficient system identification and control synthesis. The Koopman operator theory lays the foundation for identifying the nonlinear-to-linear coordinate transformations with data-driven methods. Recently, researchers have proposed to use deep neural networks as a more expressive class of basis functions for calculating the Koopman operators. These approaches, however, assume a fixed dimensional state space; they are therefore not applicable to scenarios with a variable number of objects. In this paper, we propose to learn compositional Koopman operators, using graph neural networks to encode the state into object-centric embeddings and using a block-wise linear transition matrix to regularize the shared structure across objects. The learned dynamics can quickly adapt to new environments of unknown physical parameters and produce control signals to achieve a specified goal. Our experiments on manipulating ropes and controlling soft robots show that the proposed method has better efficiency and generalization ability than existing baselines. |
Tasks | |
Published | 2019-10-18 |
URL | https://arxiv.org/abs/1910.08264v1 |
https://arxiv.org/pdf/1910.08264v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-compositional-koopman-operators-for |
Repo | |
Framework | |
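The core idea behind the abstract above, finding an embedding in which nonlinear dynamics become linear, can be shown on a toy system. The sketch below uses a hand-picked two-state system and a hand-picked lifting to three observables. It does not use graph neural networks or any learned or compositional structure, and the parameter values `lam`, `mu`, and `c` are arbitrary choices for illustration.

```python
def nonlinear_step(x1, x2, lam=0.9, mu=0.5, c=0.3):
    """One step of a toy nonlinear system (x2 depends on x1 squared)."""
    return lam * x1, mu * x2 + c * x1 ** 2

def lift(x1, x2):
    """Embed the state into observables in which the dynamics are linear."""
    return [x1, x2, x1 ** 2]

def koopman_matrix(lam=0.9, mu=0.5, c=0.3):
    """Linear transition matrix acting on the lifted state [x1, x2, x1^2]."""
    return [[lam, 0.0, 0.0],
            [0.0, mu, c],
            [0.0, 0.0, lam ** 2]]

def linear_step(K, y):
    """Matrix-vector product K @ y written out for a 3x3 matrix."""
    return [sum(K[i][j] * y[j] for j in range(3)) for i in range(3)]

def rollout(x1, x2, steps):
    """Simulate both the nonlinear system and its lifted linear counterpart."""
    K = koopman_matrix()
    y = lift(x1, x2)
    states, lifted = [], []
    for _ in range(steps):
        x1, x2 = nonlinear_step(x1, x2)
        y = linear_step(K, y)
        states.append((x1, x2))
        lifted.append((y[0], y[1]))  # first two observables recover the state
    return states, lifted

states, lifted = rollout(1.0, -0.5, steps=20)
```

For this system the lifting is exact, so the linear rollout reproduces the nonlinear trajectory up to floating-point error. In general no finite lifting is known in advance, which is why the embedding is typically learned from data.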
Learning from satisfying assignments under continuous distributions
Title | Learning from satisfying assignments under continuous distributions |
Authors | Clément L. Canonne, Anindya De, Rocco A. Servedio |
Abstract | What kinds of functions are learnable from their satisfying assignments? Motivated by this simple question, we extend the framework of De, Diakonikolas, and Servedio [DDS15], which studied the learnability of probability distributions over $\{0,1\}^n$ defined by the set of satisfying assignments to “low-complexity” Boolean functions, to Boolean-valued functions defined over continuous domains. In our learning scenario there is a known “background distribution” $\mathcal{D}$ over $\mathbb{R}^n$ (such as a known normal distribution or a known log-concave distribution) and the learner is given i.i.d. samples drawn from a target distribution $\mathcal{D}_f$, where $\mathcal{D}_f$ is $\mathcal{D}$ restricted to the satisfying assignments of an unknown low-complexity Boolean-valued function $f$. The problem is to learn an approximation $\mathcal{D}'$ of the target distribution $\mathcal{D}_f$ which has small error as measured in total variation distance. We give a range of efficient algorithms and hardness results for this problem, focusing on the case when $f$ is a low-degree polynomial threshold function (PTF). When the background distribution $\mathcal{D}$ is log-concave, we show that this learning problem is efficiently solvable for degree-1 PTFs (i.e., linear threshold functions) but not for degree-2 PTFs. In contrast, when $\mathcal{D}$ is a normal distribution, we show that this learning problem is efficiently solvable for degree-2 PTFs but not for degree-4 PTFs. Our hardness results rely on standard assumptions about secure signature schemes. |
Tasks | |
Published | 2019-07-02 |
URL | https://arxiv.org/abs/1907.01619v1 |
https://arxiv.org/pdf/1907.01619v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-from-satisfying-assignments-under |
Repo | |
Framework | |
A frequency-domain analysis of inexact gradient descent
Title | A frequency-domain analysis of inexact gradient descent |
Authors | Oran Gannot |
Abstract | We study robustness properties of inexact gradient descent for strongly convex functions, as well as for the larger class of functions with sector-bounded gradients, under a relative error model. Proofs of the corresponding convergence rates are based on frequency-domain criteria for the stability of nonlinear systems perturbed by additive noise. |
Tasks | |
Published | 2019-12-31 |
URL | https://arxiv.org/abs/1912.13494v1 |
https://arxiv.org/pdf/1912.13494v1.pdf | |
PWC | https://paperswithcode.com/paper/a-frequency-domain-analysis-of-inexact |
Repo | |
Framework | |
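The relative error model mentioned in the abstract above can be illustrated numerically. The sketch below assumes the usual reading of that model: the gradient oracle returns the true gradient plus an error whose norm is at most `delta` times the norm of the true gradient. The quadratic test function, the deterministic rotation used to build the error, the step size `1 / L`, and the value of `delta` are all illustrative choices and are not taken from the paper, whose analysis is carried out in the frequency domain.

```python
import math

def norm(v):
    """Euclidean norm of a vector given as a list."""
    return math.sqrt(sum(vi * vi for vi in v))

def gradient(x, curvatures):
    """Exact gradient of f(x) = 0.5 * sum(a_i * x_i^2)."""
    return [a * xi for a, xi in zip(curvatures, x)]

def inexact_gradient(x, curvatures, delta):
    """Exact gradient plus an error of norm exactly delta * ||gradient||.

    The error is the gradient rotated by 90 degrees and scaled by delta,
    a deterministic stand-in for an adversarial or noisy oracle (2-D only).
    """
    g = gradient(x, curvatures)
    error = [-delta * g[1], delta * g[0]]
    return [gi + ei for gi, ei in zip(g, error)]

def inexact_gradient_descent(x0, curvatures, delta, steps):
    """Gradient descent with step size 1 / L using the inexact oracle."""
    step_size = 1.0 / max(curvatures)  # L is the largest curvature
    x = list(x0)
    for _ in range(steps):
        g = inexact_gradient(x, curvatures, delta)
        x = [xi - step_size * gi for xi, gi in zip(x, g)]
    return x

# Strongly convex quadratic with curvatures 1 and 10, 20% relative error.
x_final = inexact_gradient_descent([1.0, 1.0], curvatures=[1.0, 10.0],
                                   delta=0.2, steps=200)
```

On this example the iterates still approach the minimizer at the origin despite the perturbed gradients. Larger values of `delta` slow convergence and can eventually break it.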
Latent Space Factorisation and Manipulation via Matrix Subspace Projection
Title | Latent Space Factorisation and Manipulation via Matrix Subspace Projection |
Authors | Xiao Li, Chenghua Lin, Chaozheng Wang, Frank Guerin |
Abstract | This paper proposes a novel method for factorising the information in the latent space of an autoencoder (AE), to improve the interpretability of the latent space and facilitate controlled generation. When trained on a dataset with labelled attributes we can produce a latent vector which separates information encoding the attributes from other characteristic information, and also disentangles the attribute information. This then allows us to manipulate each attribute of the latent representation individually without affecting others. Our method, matrix subspace projection, is simpler than the state of the art adversarial network approaches to latent space factorisation. We demonstrate the utility of the method for attribute manipulation tasks on the CelebA image dataset and the E2E text corpus. |
Tasks | |
Published | 2019-07-26 |
URL | https://arxiv.org/abs/1907.12385v1 |
https://arxiv.org/pdf/1907.12385v1.pdf | |
PWC | https://paperswithcode.com/paper/latent-space-factorisation-and-manipulation |
Repo | |
Framework | |
The Topology of Mutated Driver Pathways
Title | The Topology of Mutated Driver Pathways |
Authors | Raouf Dridi, Hedayat Alghassi, Maen Obeidat, Sridhar Tayur |
Abstract | Much progress has been made, and continues to be made, towards identifying candidate mutated driver pathways in cancer. However, no systematic approach to understanding how candidate pathways relate to each other for a given cancer (such as Acute myeloid leukemia), and how one type of cancer may be similar or different from another with regard to their respective pathways (Acute myeloid leukemia vs. Glioblastoma multiforme for instance), has emerged thus far. Our work attempts to contribute to the understanding of the space of pathways through a novel topological framework. We illustrate our approach, using mutation data (obtained from TCGA) of two types of tumors: Acute myeloid leukemia (AML) and Glioblastoma multiforme (GBM). We find that the space of pathways for AML is homotopy equivalent to a sphere, while that of GBM is equivalent to a genus-2 surface. We hope to trigger new types of questions (i.e., allow for novel kinds of hypotheses) towards a more comprehensive grasp of cancer. |
Tasks | |
Published | 2019-11-30 |
URL | https://arxiv.org/abs/1912.00108v1 |
https://arxiv.org/pdf/1912.00108v1.pdf | |
PWC | https://paperswithcode.com/paper/the-topology-of-mutated-driver-pathways |
Repo | |
Framework | |
A Novel Modeling Approach for All-Dielectric Metasurfaces Using Deep Neural Networks
Title | A Novel Modeling Approach for All-Dielectric Metasurfaces Using Deep Neural Networks |
Authors | Sensong An, Clayton Fowler, Bowen Zheng, Mikhail Y. Shalaginov, Hong Tang, Hang Li, Li Zhou, Jun Ding, Anuradha Murthy Agarwal, Clara Rivero-Baleine, Kathleen A. Richardson, Tian Gu, Juejun Hu, Hualiang Zhang |
Abstract | Metasurfaces have become a promising means for manipulating optical wavefronts in flat and high-performance optical devices. Conventional metasurface device design relies on trial-and-error methods to obtain target electromagnetic (EM) response, an approach that demands significant efforts to investigate the enormous number of possible meta-atom structures. In this paper, a deep neural network approach is introduced that significantly improves on both speed and accuracy compared to techniques currently used to assemble metasurface-based devices. Our neural network approach overcomes three key challenges that have limited previous neural-network-based design schemes: input/output vector dimensional mismatch, accurate EM-wave phase prediction, as well as adaptation to 3-D dielectric structures, and can be generically applied to a wide variety of metasurface device designs across the entire electromagnetic spectrum. Using this new methodology, examples of neural networks capable of producing on-demand designs for meta-atoms, metasurface filters, and phase-change reconfigurable metasurfaces are demonstrated. |
Tasks | |
Published | 2019-06-08 |
URL | https://arxiv.org/abs/1906.03387v1 |
https://arxiv.org/pdf/1906.03387v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-modeling-approach-for-all-dielectric |
Repo | |
Framework | |