Paper Group AWR 177
Embedding and learning with signatures
Title | Embedding and learning with signatures |
Authors | Adeline Fermanian |
Abstract | Sequential and temporal data arise in many fields of research, such as quantitative finance, medicine, or computer vision. The present article is concerned with a novel approach for sequential learning, called the signature method, and rooted in rough path theory. Its basic principle is to represent multidimensional paths by a graded feature set of their iterated integrals, called the signature. This approach relies critically on an embedding principle, which consists in representing discretely sampled data as paths, i.e., functions from $[0,1]$ to $\mathbb{R}^d$. After a survey of machine learning methodologies for signatures, we investigate the influence of embeddings on prediction accuracy with an in-depth study of three recent and challenging datasets. We show that a specific embedding, called lead-lag, is systematically better, whatever the dataset or algorithm used. Moreover, we emphasize through an empirical study that computing signatures over the whole path domain does not lead to a loss of local information. We conclude that, with a good embedding, the signature combined with a simple algorithm achieves results competitive with state-of-the-art, domain-specific approaches. |
Tasks | |
Published | 2019-11-29 |
URL | https://arxiv.org/abs/1911.13211v1 |
https://arxiv.org/pdf/1911.13211v1.pdf | |
PWC | https://paperswithcode.com/paper/embedding-and-learning-with-signatures |
Repo | https://github.com/afermanian/embedding_with_signatures |
Framework | none |
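For concreteness, here is a minimal sketch of the lead-lag embedding and of a truncated signature computation, using numpy and the iisignature package; the toy path and the truncation depth are illustrative assumptions, not settings taken from the paper:

```python
import numpy as np
import iisignature  # pip install iisignature

def lead_lag(x):
    """Lead-lag embedding: map a 1-D sequence (x_1, ..., x_n) to a 2-D path
    whose coordinates are a 'lead' copy and a 'lag' copy of the sequence."""
    lead, lag = [], []
    for i in range(len(x) - 1):
        lead += [x[i], x[i + 1]]   # the lead coordinate moves first...
        lag  += [x[i], x[i]]       # ...while the lag coordinate waits
    lead.append(x[-1]); lag.append(x[-1])
    return np.column_stack([lead, lag])

# Toy sequence; in practice this would be one channel of a sampled time series.
x = np.sin(np.linspace(0, 2 * np.pi, 50))
path = lead_lag(x)                       # shape (2*len(x) - 1, 2)
depth = 3
features = iisignature.sig(path, depth)  # iterated integrals up to level 3
print(features.shape)                    # (2 + 2**2 + 2**3,) = (14,) for a 2-D path
```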
Modelling conditional probabilities with Riemann-Theta Boltzmann Machines
Title | Modelling conditional probabilities with Riemann-Theta Boltzmann Machines |
Authors | Stefano Carrazza, Daniel Krefl, Andrea Papaluca |
Abstract | The probability density function for the visible sector of a Riemann-Theta Boltzmann machine can be taken conditional on a subset of the visible units. We derive that the corresponding conditional density function is given by a reparameterization of the Riemann-Theta Boltzmann machine modelling the original probability density function. Therefore the conditional densities can be directly inferred from the Riemann-Theta Boltzmann machine. |
Tasks | |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.11313v1 |
https://arxiv.org/pdf/1905.11313v1.pdf | |
PWC | https://paperswithcode.com/paper/modelling-conditional-probabilities-with |
Repo | https://github.com/BrunoLiegiBastonLiegi/RTBM-conditional |
Framework | none |
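The result stated above mirrors the familiar multivariate Gaussian case, where conditioning on a subset of coordinates again yields a member of the same family with reparameterized mean and covariance. The sketch below shows only that Gaussian analogue; it does not implement the Riemann-Theta density itself:

```python
import numpy as np

def gaussian_conditional(mu, Sigma, idx_cond, x_cond):
    """Condition a multivariate Gaussian on a subset of its coordinates.
    The conditional is again Gaussian, i.e. a reparameterization of the same
    family, analogous to the RTBM result stated in the abstract."""
    idx_free = np.setdiff1d(np.arange(len(mu)), idx_cond)
    S_ff = Sigma[np.ix_(idx_free, idx_free)]
    S_fc = Sigma[np.ix_(idx_free, idx_cond)]
    S_cc = Sigma[np.ix_(idx_cond, idx_cond)]
    K = S_fc @ np.linalg.inv(S_cc)
    mu_cond = mu[idx_free] + K @ (x_cond - mu[idx_cond])
    Sigma_cond = S_ff - K @ S_fc.T
    return mu_cond, Sigma_cond

mu = np.array([0.0, 1.0, -1.0])
Sigma = np.array([[2.0, 0.5, 0.2], [0.5, 1.0, 0.3], [0.2, 0.3, 1.5]])
print(gaussian_conditional(mu, Sigma, idx_cond=np.array([2]), x_cond=np.array([0.5])))
```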
Machine Learning Prediction of Accurate Atomization Energies of Organic Molecules from Low-Fidelity Quantum Chemical Calculations
Title | Machine Learning Prediction of Accurate Atomization Energies of Organic Molecules from Low-Fidelity Quantum Chemical Calculations |
Authors | Logan Ward, Ben Blaiszik, Ian Foster, Rajeev S. Assary, Badri Narayanan, Larry Curtiss |
Abstract | Recent studies illustrate how machine learning (ML) can be used to bypass a core challenge of molecular modeling: the tradeoff between accuracy and computational cost. Here, we assess multiple ML approaches for predicting the atomization energy of organic molecules. Our resulting models learn the difference between low-fidelity, B3LYP, and high-accuracy, G4MP2, atomization energies, and predict the G4MP2 atomization energy to 0.005 eV (mean absolute error) for molecules with less than 9 heavy atoms and 0.012 eV for a small set of molecules with between 10 and 14 heavy atoms. Our two best models, which have different accuracy/speed tradeoffs, enable the efficient prediction of G4MP2-level energies for large molecules and are available through a simple web interface. |
Tasks | |
Published | 2019-06-07 |
URL | https://arxiv.org/abs/1906.03233v1 |
https://arxiv.org/pdf/1906.03233v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-prediction-of-accurate |
Repo | https://github.com/globus-labs/g4mp2-atomization-energy |
Framework | none |
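A minimal sketch of the delta-learning setup described above (regressing the G4MP2 minus B3LYP correction and adding it back to the cheap calculation); the random features, the kernel ridge regressor, and all hyperparameters are illustrative assumptions:

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import train_test_split

# Assumed inputs: molecular descriptors X (e.g. a fingerprint-like representation)
# plus low-fidelity B3LYP and high-accuracy G4MP2 atomization energies in eV.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 64))
e_b3lyp = rng.normal(size=500)
e_g4mp2 = e_b3lyp + 0.05 * rng.normal(size=500)   # dummy data, for shapes only

# Learn the *difference* between fidelities rather than the total energy.
delta = e_g4mp2 - e_b3lyp
X_tr, X_te, d_tr, d_te, b_tr, b_te, g_tr, g_te = train_test_split(
    X, delta, e_b3lyp, e_g4mp2, random_state=0)

model = KernelRidge(kernel="rbf", alpha=1e-3, gamma=1e-2)
model.fit(X_tr, d_tr)

# Prediction at G4MP2 level = cheap B3LYP value + learned correction.
g_pred = b_te + model.predict(X_te)
print("MAE (eV):", np.abs(g_pred - g_te).mean())
```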
A Data Set of Internet Claims and Comparison of their Sentiments with Credibility
Title | A Data Set of Internet Claims and Comparison of their Sentiments with Credibility |
Authors | Amey Parundekar, Susan Elias, Ashwin Ashok |
Abstract | In this modern era, communication has become faster and easier. This means fallacious information can spread as fast as reality. Considering the damage that fake news inflicts on the psychology of people, and the fact that such news proliferates faster than truth, we need to study the phenomenon that helps fake news spread. An unbiased data set that depends on reality for rating news is necessary to construct predictive models for its classification. This paper describes the methodology to create such a data set. We collect our data from snopes.com, a fact-checking organization. Furthermore, we intend to create this data set not only for classification of the news but also to find patterns that reveal the intent behind misinformation. We also formally define an Internet Claim, its credibility, and the sentiment behind such a claim. We then examine the relationship between the sentiment of a claim and its credibility. This relationship sheds light on the bigger picture behind the propagation of misinformation. Finally, we pave the way for further research that uses the described methodology to build the data set, applies predictive modeling, and draws on the psychology of readers to understand why fake news spreads much faster than reality. |
Tasks | |
Published | 2019-11-22 |
URL | https://arxiv.org/abs/1911.10130v1 |
https://arxiv.org/pdf/1911.10130v1.pdf | |
PWC | https://paperswithcode.com/paper/a-data-set-of-internet-claims-and-comparison |
Repo | https://github.com/the-lost-explorer/iClaimNet |
Framework | none |
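A small sketch of how the sentiment/credibility comparison could be run on such a data set; the column names, the example claims, and the choice of the VADER sentiment analyzer are assumptions, not details of the released data set:

```python
import pandas as pd
from scipy.stats import pointbiserialr
from nltk.sentiment.vader import SentimentIntensityAnalyzer  # requires nltk.download("vader_lexicon")

# Assumed layout: one claim per row, with its text and a binary credibility
# label (1 = fact-checked as true, 0 = false); column names are illustrative.
claims = pd.DataFrame({
    "claim": [
        "A new law bans all bicycles nationwide.",
        "Drinking water helps regulate body temperature.",
        "Celebrity X secretly funds every major election.",
        "The city opened a new public library last month.",
    ],
    "credible": [0, 1, 0, 1],
})

sia = SentimentIntensityAnalyzer()
claims["sentiment"] = claims["claim"].map(lambda t: sia.polarity_scores(t)["compound"])

# Point-biserial correlation between the continuous sentiment score and the
# binary credibility label -- one way to quantify the relationship studied above.
r, p = pointbiserialr(claims["credible"], claims["sentiment"])
print(claims)
print(f"correlation = {r:.3f} (p = {p:.3f})")
```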
deepsing: Generating Sentiment-aware Visual Stories using Cross-modal Music Translation
Title | deepsing: Generating Sentiment-aware Visual Stories using Cross-modal Music Translation |
Authors | Nikolaos Passalis, Stavros Doropoulos |
Abstract | In this paper we propose a deep learning method for performing attribute-based music-to-image translation. The proposed method is applied for synthesizing visual stories according to the sentiment expressed by songs. The generated images aim to induce in viewers the same feelings as the original song, reinforcing the primary aim of music, i.e., communicating feelings. The process of music-to-image translation poses unique challenges, mainly due to the unstable mapping between the different modalities involved in this process. In this paper, we employ a trainable cross-modal translation method to overcome this limitation, leading to the first, to the best of our knowledge, deep learning method for generating sentiment-aware visual stories. Various aspects of the proposed method are extensively evaluated and discussed using different songs. |
Tasks | |
Published | 2019-12-11 |
URL | https://arxiv.org/abs/1912.05654v1 |
https://arxiv.org/pdf/1912.05654v1.pdf | |
PWC | https://paperswithcode.com/paper/deepsing-generating-sentiment-aware-visual |
Repo | https://github.com/deepsing-ai/deepsing |
Framework | pytorch |
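A heavily simplified, hypothetical sketch of the kind of pipeline the abstract describes (audio features mapped to a valence/arousal estimate, then to a discrete class that conditions image synthesis); the module sizes and binning scheme are placeholders and not the architecture used by deepsing:

```python
import torch
import torch.nn as nn

# Hypothetical, simplified pipeline: mel-spectrogram -> valence/arousal estimate ->
# discrete sentiment class that a pretrained class-conditional image generator
# could consume. Module sizes and the binning scheme are placeholders.
class SentimentRegressor(nn.Module):
    def __init__(self, n_mels=128, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(n_mels, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)          # (valence, arousal)

    def forward(self, mel):                       # mel: (batch, time, n_mels)
        _, h = self.rnn(mel)
        return torch.tanh(self.head(h[-1]))       # values in [-1, 1]

def sentiment_to_class(valence_arousal):
    """Quantize the 2-D sentiment estimate of a song segment into one of 16
    classes (4 valence bins x 4 arousal bins) for conditional image synthesis."""
    v, a = valence_arousal.unbind(-1)
    v_bin = ((v + 1) / 2 * 3.999).long()
    a_bin = ((a + 1) / 2 * 3.999).long()
    return v_bin * 4 + a_bin

mel = torch.randn(1, 200, 128)                    # one 200-frame mel-spectrogram segment
segment_class = sentiment_to_class(SentimentRegressor()(mel))
print(segment_class)                              # class id used to condition the generator
```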
Phenotyping of Clinical Notes with Improved Document Classification Models Using Contextualized Neural Language Models
Title | Phenotyping of Clinical Notes with Improved Document Classification Models Using Contextualized Neural Language Models |
Authors | Andriy Mulyar, Elliot Schumacher, Masoud Rouhizadeh, Mark Dredze |
Abstract | Clinical notes contain an extensive record of a patient’s health status, such as smoking status or the presence of heart conditions. However, this detail is not replicated within the structured data of electronic health systems. Phenotyping, the extraction of patient conditions from free clinical text, is a critical task which supports a variety of downstream applications such as decision support and secondary use of medical records. Previous work has resulted in systems which are high performing but require hand engineering, often of rules. Recent work in pretrained contextualized language models has enabled advances in representing text for a variety of tasks. We therefore explore several architectures for modeling phenotyping that rely solely on BERT representations of the clinical note, removing the need for manual engineering. We find these architectures are competitive with or outperform existing state-of-the-art methods on two phenotyping tasks. |
Tasks | Document Classification |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1910.13664v1 |
https://arxiv.org/pdf/1910.13664v1.pdf | |
PWC | https://paperswithcode.com/paper/phenotyping-of-clinical-notes-with-improved |
Repo | https://github.com/AndriyMulyar/bert_document_classification |
Framework | pytorch |
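One simple architecture in the spirit of the paper, sketched with the Hugging Face transformers API: encode a long clinical note as overlapping BERT chunks, pool the [CLS] vectors, and classify with a linear layer. The model name, chunking scheme, and pooling choice are assumptions rather than the exact configurations evaluated:

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
classifier = torch.nn.Linear(encoder.config.hidden_size, 2)  # phenotype present / absent

def classify_note(note: str, max_len=512, stride=256):
    # Split the (potentially very long) note into overlapping token chunks.
    # For simplicity, chunks after the first do not get their own [CLS]/[SEP].
    ids = tokenizer(note, return_tensors="pt")["input_ids"][0]
    chunks = [ids[i:i + max_len] for i in range(0, len(ids), stride)]
    cls_vecs = []
    with torch.no_grad():
        for chunk in chunks:
            out = encoder(chunk.unsqueeze(0))
            cls_vecs.append(out.last_hidden_state[:, 0])   # per-chunk [CLS] embedding
    doc_vec = torch.cat(cls_vecs).mean(dim=0)              # pooled document representation
    return classifier(doc_vec).softmax(-1)

print(classify_note("Patient is a 54 year old male with a 30 pack-year smoking history."))
```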
Chemical-protein Interaction Extraction via Gaussian Probability Distribution and External Biomedical Knowledge
Title | Chemical-protein Interaction Extraction via Gaussian Probability Distribution and External Biomedical Knowledge |
Authors | Cong Sun, Zhihao Yang, Leilei Su, Lei Wang, Yin Zhang, Hongfei Lin, Jian Wang |
Abstract | The biomedical literature contains a wealth of chemical-protein interactions (CPIs). Automatically extracting CPIs described in biomedical literature is essential for drug discovery, precision medicine, as well as basic biomedical research. However, the existing methods do not consider the impact of overlapping relations on CPI extraction. This leads to the extraction of sentences with overlapping relations becoming the bottleneck of CPI extraction. In this paper, we propose a novel neural network-based approach to improve the CPI extraction performance of sentences with overlapping relations. Specifically, the approach first employs BERT to generate high-quality contextual representations of the title sequence, instance sequence, and knowledge sequence. Then, the Gaussian probability distribution is introduced to capture the local structure of the instance. Meanwhile, the attention mechanism is applied to fuse the title information and biomedical knowledge, respectively. Finally, the related representations are concatenated and fed into the softmax function to extract CPIs. We evaluate our proposed model on the CHEMPROT corpus. Our proposed model is superior in performance as compared with other state-of-the-art models. The experimental results show that the Gaussian probability distribution and external knowledge are complementary to each other. Integrating them can effectively improve the CPI extraction performance. Furthermore, the Gaussian probability distribution can significantly improve the extraction performance of sentences with overlapping relations in biomedical relation extraction tasks. Data and code are available at https://github.com/CongSun-dlut/CPI_extraction. |
Tasks | Drug Discovery, Relation Extraction |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09487v1 |
https://arxiv.org/pdf/1911.09487v1.pdf | |
PWC | https://paperswithcode.com/paper/chemical-protein-interaction-extraction-via |
Repo | https://github.com/CongSun-dlut/CPI_extraction |
Framework | pytorch |
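A small sketch of the Gaussian-probability idea mentioned above: weight each token by a Gaussian of its distance to the entity mentions so that the instance representation focuses on local structure. The value of sigma and the pooling step are assumptions; the paper's full model additionally fuses title information and external knowledge through attention:

```python
import numpy as np

def gaussian_position_weights(seq_len, entity_positions, sigma=2.0):
    """Weight each token by a Gaussian of its distance to the nearest entity
    mention, so tokens close to the chemical/protein pair dominate the
    instance representation (sigma is an assumed hyperparameter)."""
    pos = np.arange(seq_len)[:, None]                        # (seq_len, 1)
    dist = np.abs(pos - np.asarray(entity_positions)[None])  # distance to each entity
    nearest = dist.min(axis=1)
    w = np.exp(-nearest ** 2 / (2 * sigma ** 2))
    return w / w.sum()

# token_states would come from BERT: (seq_len, hidden); random placeholders here.
token_states = np.random.randn(30, 768)
weights = gaussian_position_weights(30, entity_positions=[5, 21])
instance_vec = weights @ token_states   # locally focused instance representation
print(instance_vec.shape)
```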
DABNet: Depth-wise Asymmetric Bottleneck for Real-time Semantic Segmentation
Title | DABNet: Depth-wise Asymmetric Bottleneck for Real-time Semantic Segmentation |
Authors | Gen Li, Inyoung Yun, Jonghyun Kim, Joongkyu Kim |
Abstract | As a pixel-level prediction task, semantic segmentation requires a large computational cost and an enormous number of parameters to obtain high performance. Recently, due to the increasing demand for autonomous systems and robots, it is important to strike a tradeoff between accuracy and inference speed. In this paper, we propose a novel Depthwise Asymmetric Bottleneck (DAB) module to address this dilemma, which efficiently adopts depth-wise asymmetric convolution and dilated convolution to build a bottleneck structure. Based on the DAB module, we design a Depth-wise Asymmetric Bottleneck Network (DABNet) especially for real-time semantic segmentation, which creates sufficient receptive field and densely utilizes the contextual information. Experiments on Cityscapes and CamVid datasets demonstrate that the proposed DABNet achieves a balance between speed and precision. Specifically, without any pretrained model and postprocessing, it achieves 70.1% Mean IoU on the Cityscapes test dataset with only 0.76 million parameters and a speed of 104 FPS on a single GTX 1080Ti card. |
Tasks | Real-Time Semantic Segmentation, Semantic Segmentation |
Published | 2019-07-26 |
URL | https://arxiv.org/abs/1907.11357v2 |
https://arxiv.org/pdf/1907.11357v2.pdf | |
PWC | https://paperswithcode.com/paper/dabnet-depth-wise-asymmetric-bottleneck-for |
Repo | https://github.com/Reagan1311/DABNet |
Framework | pytorch |
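An approximate PyTorch rendering of the Depth-wise Asymmetric Bottleneck module as described (channel reduction, a depth-wise asymmetric 3x1/1x3 branch plus a dilated counterpart, fusion, and a residual connection); layer ordering, normalization, and channel counts may differ from the official implementation:

```python
import torch
import torch.nn as nn

class DABModule(nn.Module):
    """Approximate Depth-wise Asymmetric Bottleneck: reduce channels, run a
    depth-wise asymmetric (3x1 / 1x3) branch and a dilated depth-wise asymmetric
    branch, fuse them, restore channels, and add the residual."""
    def __init__(self, ch, dilation=2):
        super().__init__()
        half = ch // 2
        self.reduce = nn.Sequential(
            nn.Conv2d(ch, half, 3, padding=1), nn.BatchNorm2d(half), nn.PReLU(half))
        # depth-wise asymmetric branch
        self.branch1 = nn.Sequential(
            nn.Conv2d(half, half, (3, 1), padding=(1, 0), groups=half),
            nn.Conv2d(half, half, (1, 3), padding=(0, 1), groups=half))
        # depth-wise asymmetric *dilated* branch (enlarges the receptive field)
        self.branch2 = nn.Sequential(
            nn.Conv2d(half, half, (3, 1), padding=(dilation, 0), dilation=(dilation, 1), groups=half),
            nn.Conv2d(half, half, (1, 3), padding=(0, dilation), dilation=(1, dilation), groups=half))
        self.restore = nn.Sequential(
            nn.BatchNorm2d(half), nn.PReLU(half), nn.Conv2d(half, ch, 1))

    def forward(self, x):
        y = self.reduce(x)
        y = self.branch1(y) + self.branch2(y)
        return x + self.restore(y)

print(DABModule(64)(torch.randn(1, 64, 32, 32)).shape)  # torch.Size([1, 64, 32, 32])
```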
ASV: Accelerated Stereo Vision System
Title | ASV: Accelerated Stereo Vision System |
Authors | Yu Feng, Paul Whatmough, Yuhao Zhu |
Abstract | Estimating depth from stereo vision cameras, i.e., “depth from stereo”, is critical to emerging intelligent applications deployed in energy- and performance-constrained devices, such as augmented reality headsets and mobile autonomous robots. While existing stereo vision systems make trade-offs between accuracy, performance and energy-efficiency, we describe ASV, an accelerated stereo vision system that simultaneously improves both performance and energy-efficiency while achieving high accuracy. The key to ASV is to exploit unique characteristics inherent to stereo vision, and apply stereo-specific optimizations, both algorithmically and computationally. We make two contributions. Firstly, we propose a new stereo algorithm, invariant-based stereo matching (ISM), that achieves significant speedup while retaining high accuracy. The algorithm combines classic “hand-crafted” stereo algorithms with recent developments in Deep Neural Networks (DNNs), by leveraging the correspondence invariant unique to stereo vision systems. Secondly, we observe that the bottleneck of the ISM algorithm is the DNN inference, and in particular the deconvolution operations that introduce massive compute-inefficiencies. We propose a set of software optimizations that mitigate these inefficiencies. We show that with less than 0.5% hardware area overhead, these algorithmic and computational optimizations can be effectively integrated within a conventional DNN accelerator. Overall, ASV achieves 5x speedup and 85% energy saving with 0.02% accuracy loss compared to today’s DNN-based stereo vision systems. |
Tasks | Stereo Matching |
Published | 2019-11-15 |
URL | https://arxiv.org/abs/1911.07919v1 |
https://arxiv.org/pdf/1911.07919v1.pdf | |
PWC | https://paperswithcode.com/paper/asv-accelerated-stereo-vision-system |
Repo | https://github.com/horizon-research/ism-algorithm |
Framework | pytorch |
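To illustrate one source of the deconvolution inefficiency the abstract points to, the sketch below shows that a strided transposed convolution is equivalent to inserting zeros into the input and running an ordinary convolution with a flipped kernel, so most multiply-accumulates hit zeros. This is background intuition only, not the paper's optimization or the ISM algorithm:

```python
import torch
import torch.nn.functional as F

# A stride-2 deconvolution (transposed convolution) equals zero-insertion between
# input pixels followed by an ordinary convolution with a flipped kernel -- which
# is where the wasted multiply-by-zero work comes from. Single-channel toy example.
x = torch.randn(1, 1, 4, 4)
w = torch.randn(1, 1, 3, 3)
stride, k = 2, 3

native = F.conv_transpose2d(x, w, stride=stride)

# Zero-insertion: (stride - 1) zeros between elements, then full padding of k - 1.
up = torch.zeros(1, 1, (4 - 1) * stride + 1, (4 - 1) * stride + 1)
up[..., ::stride, ::stride] = x
equivalent = F.conv2d(F.pad(up, (k - 1,) * 4), w.flip(-1, -2))

print(torch.allclose(native, equivalent, atol=1e-5))   # True
zeros = (up == 0).float().mean().item()
print(f"{zeros:.0%} of the upsampled input is zeros")   # most MACs are wasted
```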
Few-Shot Adversarial Learning of Realistic Neural Talking Head Models
Title | Few-Shot Adversarial Learning of Realistic Neural Talking Head Models |
Authors | Egor Zakharov, Aliaksandra Shysheya, Egor Burkov, Victor Lempitsky |
Abstract | Several recent works have shown how highly realistic human head images can be obtained by training convolutional neural networks to generate them. In order to create a personalized talking head model, these works require training on a large dataset of images of a single person. However, in many practical scenarios, such personalized talking head models need to be learned from a few image views of a person, potentially even a single image. Here, we present a system with such few-shot capability. It performs lengthy meta-learning on a large dataset of videos, and after that is able to frame few- and one-shot learning of neural talking head models of previously unseen people as adversarial training problems with high capacity generators and discriminators. Crucially, the system is able to initialize the parameters of both the generator and the discriminator in a person-specific way, so that training can be based on just a few images and done quickly, despite the need to tune tens of millions of parameters. We show that such an approach is able to learn highly realistic and personalized talking head models of new people and even portrait paintings. |
Tasks | Meta-Learning, One-Shot Learning, Talking Head Generation |
Published | 2019-05-20 |
URL | https://arxiv.org/abs/1905.08233v2 |
https://arxiv.org/pdf/1905.08233v2.pdf | |
PWC | https://paperswithcode.com/paper/few-shot-adversarial-learning-of-realistic |
Repo | https://github.com/hsandmann/espm.ml.2019.1 |
Framework | none |
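A bare-bones sketch of the few-shot adversarial fine-tuning stage only: a generator and a discriminator (person-specifically initialized from meta-learned weights in the paper, randomly here) are adversarially fine-tuned on a handful of frames with hinge losses. The tiny MLPs, image size, and loss weights are placeholders, not the paper's architecture:

```python
import torch
import torch.nn as nn

# Placeholder networks; in the paper these are a landmark-conditioned generator
# and a projection discriminator initialized in a person-specific way.
G = nn.Sequential(nn.Linear(68 * 2, 256), nn.ReLU(), nn.Linear(256, 3 * 32 * 32))
D = nn.Sequential(nn.Linear(3 * 32 * 32 + 68 * 2, 256), nn.ReLU(), nn.Linear(256, 1))
# ... load the meta-learned / person-specific initializations into G and D here ...

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

landmarks = torch.randn(8, 68 * 2)      # K = 8 few-shot frames: facial landmarks...
frames = torch.randn(8, 3 * 32 * 32)    # ...and the matching ground-truth images

for step in range(40):                  # short fine-tuning loop, as in the few-shot setting
    fake = G(landmarks)
    d_loss = torch.relu(1 - D(torch.cat([frames, landmarks], 1))).mean() + \
             torch.relu(1 + D(torch.cat([fake.detach(), landmarks], 1))).mean()
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    fake = G(landmarks)
    g_loss = -D(torch.cat([fake, landmarks], 1)).mean() + (fake - frames).abs().mean()
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```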
Exact sampling of determinantal point processes with sublinear time preprocessing
Title | Exact sampling of determinantal point processes with sublinear time preprocessing |
Authors | Michał Dereziński, Daniele Calandriello, Michal Valko |
Abstract | We study the complexity of sampling from a distribution over all index subsets of the set $\{1, \dots, n\}$ with the probability of a subset $S$ proportional to the determinant of the submatrix $\mathbf{L}_S$ of some $n\times n$ p.s.d. matrix $\mathbf{L}$, where $\mathbf{L}_S$ corresponds to the entries of $\mathbf{L}$ indexed by $S$. Known as a determinantal point process, this distribution is used in machine learning to induce diversity in subset selection. In practice, we often wish to sample multiple subsets $S$ with small expected size $k = \mathbb{E}[|S|] \ll n$ from a very large matrix $\mathbf{L}$, so it is important to minimize the preprocessing cost of the procedure (performed once) as well as the sampling cost (performed repeatedly). For this purpose, we propose a new algorithm which, given access to $\mathbf{L}$, samples exactly from a determinantal point process while satisfying the following two properties: (1) its preprocessing cost is $n \cdot \text{poly}(k)$, i.e., sublinear in the size of $\mathbf{L}$, and (2) its sampling cost is $\text{poly}(k)$, i.e., independent of the size of $\mathbf{L}$. Prior to our results, state-of-the-art exact samplers required $O(n^3)$ preprocessing time and sampling time linear in $n$ or dependent on the spectral properties of $\mathbf{L}$. We also give a reduction which allows using our algorithm for exact sampling from cardinality constrained determinantal point processes with $n\cdot\text{poly}(k)$ time preprocessing. |
Tasks | Point Processes |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1905.13476v2 |
https://arxiv.org/pdf/1905.13476v2.pdf | |
PWC | https://paperswithcode.com/paper/exact-sampling-of-determinantal-point-1 |
Repo | https://github.com/guilgautier/DPPy |
Framework | none |
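For reference, here is the classic exact spectral sampler for an L-ensemble DPP in plain numpy; note that this is the O(n^3)-preprocessing baseline that the paper improves on, not the sublinear-time algorithm (the linked DPPy repository implements the faster samplers):

```python
import numpy as np

def sample_lensemble_dpp(L, rng=None):
    """Classic exact sampler for an L-ensemble DPP (Hough et al. / Kulesza &
    Taskar). It needs a full O(n^3) eigendecomposition -- exactly the
    preprocessing cost that the paper's sublinear-time algorithm avoids."""
    rng = np.random.default_rng() if rng is None else rng
    vals, vecs = np.linalg.eigh(L)
    # phase 1: keep eigenvector i with probability lambda_i / (1 + lambda_i)
    V = vecs[:, rng.random(len(vals)) < vals / (1.0 + vals)]
    sample = []
    while V.shape[1] > 0:
        # each item i is chosen with probability ||row_i(V)||^2 / #columns
        p = (V ** 2).sum(axis=1) / V.shape[1]
        i = rng.choice(len(p), p=p)
        sample.append(i)
        # eliminate coordinate i from the span, drop one column, re-orthonormalize
        j = np.argmax(np.abs(V[i]))
        V = V - np.outer(V[:, j], V[i] / V[i, j])
        V = np.delete(V, j, axis=1)
        if V.shape[1] > 0:
            V, _ = np.linalg.qr(V)
    return sorted(sample)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
L = X @ X.T + 1e-6 * np.eye(100)       # p.s.d. likelihood kernel
print(sample_lensemble_dpp(L, rng))    # expected size = sum_i lambda_i / (1 + lambda_i)
```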
The Dangers of Post-hoc Interpretability: Unjustified Counterfactual Explanations
Title | The Dangers of Post-hoc Interpretability: Unjustified Counterfactual Explanations |
Authors | Thibault Laugel, Marie-Jeanne Lesot, Christophe Marsala, Xavier Renard, Marcin Detyniecki |
Abstract | Post-hoc interpretability approaches have been proven to be powerful tools to generate explanations for the predictions made by a trained black-box model. However, they create the risk of having explanations that are a result of some artifacts learned by the model instead of actual knowledge from the data. This paper focuses on the case of counterfactual explanations and asks whether the generated instances can be justified, i.e. continuously connected to some ground-truth data. We evaluate the risk of generating unjustified counterfactual examples by investigating the local neighborhoods of instances whose predictions are to be explained and show that this risk is quite high for several datasets. Furthermore, we show that most state of the art approaches do not differentiate justified from unjustified counterfactual examples, leading to less useful explanations. |
Tasks | |
Published | 2019-07-22 |
URL | https://arxiv.org/abs/1907.09294v1 |
https://arxiv.org/pdf/1907.09294v1.pdf | |
PWC | https://paperswithcode.com/paper/the-dangers-of-post-hoc-interpretability |
Repo | https://github.com/thibaultlaugel/truce |
Framework | none |
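A simplified proxy for the justification test described above: check whether the counterfactual can be linked to ground-truth training instances of the same predicted class through an epsilon-chain. The epsilon value, the toy classifier, and the connectivity criterion are assumptions standing in for the paper's exact procedure:

```python
import numpy as np
from sklearn.neighbors import radius_neighbors_graph
from scipy.sparse.csgraph import connected_components

def is_justified(counterfactual, X_train, y_train, predict_fn, eps=0.5):
    """Simplified justification test: the counterfactual is 'justified' if an
    epsilon-chain of same-class training points connects it to ground-truth data."""
    cls = predict_fn(counterfactual[None])[0]
    same = X_train[y_train == cls]
    pts = np.vstack([counterfactual[None], same])
    graph = radius_neighbors_graph(pts, radius=eps, mode="connectivity")
    _, labels = connected_components(graph, directed=False)
    # justified iff the counterfactual (node 0) shares a component with a real point
    return bool(np.any(labels[1:] == labels[0]))

rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0, 0], 0.3, (50, 2)), rng.normal([4, 0], 0.3, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
predict_fn = lambda Z: (Z[:, 0] > 2).astype(int)             # toy black-box classifier
print(is_justified(np.array([2.1, 0.0]), X, y, predict_fn))  # isolated region -> False
print(is_justified(np.array([3.9, 0.2]), X, y, predict_fn))  # near the class-1 cluster -> True
```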
PyOD: A Python Toolbox for Scalable Outlier Detection
Title | PyOD: A Python Toolbox for Scalable Outlier Detection |
Authors | Yue Zhao, Zain Nasrullah, Zheng Li |
Abstract | PyOD is an open-source Python toolbox for performing scalable outlier detection on multivariate data. Uniquely, it provides access to a wide range of outlier detection algorithms, including established outlier ensembles and more recent neural network-based approaches, under a single, well-documented API designed for use by both practitioners and researchers. With robustness and scalability in mind, best practices such as unit testing, continuous integration, code coverage, maintainability checks, interactive examples and parallelization are emphasized as core components in the toolbox’s development. PyOD is compatible with both Python 2 and 3 and can be installed through Python Package Index (PyPI) or https://github.com/yzhao062/pyod. |
Tasks | Anomaly Detection, Outlier Detection, outlier ensembles |
Published | 2019-01-06 |
URL | https://arxiv.org/abs/1901.01588v2 |
https://arxiv.org/pdf/1901.01588v2.pdf | |
PWC | https://paperswithcode.com/paper/pyod-a-python-toolbox-for-scalable-outlier |
Repo | https://github.com/winstonll/SynC |
Framework | none |
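A quick usage example of PyOD's documented detector API (fit, labels_, decision_scores_, predict, decision_function), shown here with the kNN detector on synthetic data:

```python
import numpy as np
from pyod.models.knn import KNN   # pip install pyod

# Synthetic 2-D data: a dense inlier cloud plus a few scattered outliers.
rng = np.random.RandomState(42)
X_train = np.vstack([rng.randn(190, 2), rng.uniform(-6, 6, size=(10, 2))])
X_test = np.vstack([rng.randn(95, 2), rng.uniform(-6, 6, size=(5, 2))])

clf = KNN(contamination=0.05)     # every PyOD detector exposes this same API
clf.fit(X_train)                  # unsupervised fit

print(clf.labels_[:10])           # binary outlier labels assigned to training points
print(clf.decision_scores_[:5])   # raw outlier scores for training points
print(clf.predict(X_test)[:10])   # labels for unseen data
print(clf.decision_function(X_test)[:5])  # outlier scores for unseen data
```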
Multi-objective training of Generative Adversarial Networks with multiple discriminators
Title | Multi-objective training of Generative Adversarial Networks with multiple discriminators |
Authors | Isabela Albuquerque, João Monteiro, Thang Doan, Breandan Considine, Tiago Falk, Ioannis Mitliagkas |
Abstract | Recent literature has demonstrated promising results for training Generative Adversarial Networks by employing a set of discriminators, in contrast to the traditional game involving one generator against a single adversary. Such methods perform single-objective optimization on some simple consolidation of the losses, e.g. an arithmetic average. In this work, we revisit the multiple-discriminator setting by framing the simultaneous minimization of losses provided by different models as a multi-objective optimization problem. Specifically, we evaluate the performance of multiple gradient descent and the hypervolume maximization algorithm on a number of different datasets. Moreover, we argue that the previously proposed methods and hypervolume maximization can all be seen as variations of multiple gradient descent in which the update direction can be computed efficiently. Our results indicate that hypervolume maximization presents a better compromise between sample quality and computational cost than previous methods. |
Tasks | |
Published | 2019-01-24 |
URL | http://arxiv.org/abs/1901.08680v1 |
http://arxiv.org/pdf/1901.08680v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-objective-training-of-generative |
Repo | https://github.com/joaomonteirof/hGAN |
Framework | pytorch |
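A sketch of the hypervolume-maximization consolidation of several discriminator losses: the generator minimizes -sum_k log(eta - l_k) with respect to a nadir point eta, which implicitly upweights the discriminators it is currently losing against. The nadir-point update rule (max loss plus a slack) is an assumption for this sketch:

```python
import torch

def hypervolume_generator_loss(disc_losses, slack=0.1):
    """Consolidate per-discriminator generator losses l_k by maximizing the
    hypervolume w.r.t. a nadir point eta: minimize -sum_k log(eta - l_k).
    Discriminators with higher loss receive larger gradient weight 1 / (eta - l_k)."""
    losses = torch.stack(disc_losses)
    eta = losses.max().detach() + slack
    return -torch.log(eta - losses).sum()

# e.g. the generator's adversarial losses against K = 3 discriminators
l = [torch.tensor(0.9, requires_grad=True),
     torch.tensor(1.4, requires_grad=True),
     torch.tensor(0.3, requires_grad=True)]
loss = hypervolume_generator_loss(l)
loss.backward()
print([li.grad.item() for li in l])   # largest weight lands on the worst (highest) loss
```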
Aligning Vector-spaces with Noisy Supervised Lexicons
Title | Aligning Vector-spaces with Noisy Supervised Lexicons |
Authors | Noa Yehezkel Lubin, Jacob Goldberger, Yoav Goldberg |
Abstract | The problem of learning to translate between two vector spaces given a set of aligned points arises in several application areas of NLP. Current solutions assume that the lexicon which defines the alignment pairs is noise-free. We consider the case where the set of aligned points is allowed to contain an amount of noise, in the form of incorrect lexicon pairs and show that this arises in practice by analyzing the edited dictionaries after the cleaning process. We demonstrate that such noise substantially degrades the accuracy of the learned translation when using current methods. We propose a model that accounts for noisy pairs. This is achieved by introducing a generative model with a compatible iterative EM algorithm. The algorithm jointly learns the noise level in the lexicon, finds the set of noisy pairs, and learns the mapping between the spaces. We demonstrate the effectiveness of our proposed algorithm on two alignment problems: bilingual word embedding translation, and mapping between diachronic embedding spaces for recovering the semantic shifts of words across time periods. |
Tasks | |
Published | 2019-03-25 |
URL | http://arxiv.org/abs/1903.10238v1 |
http://arxiv.org/pdf/1903.10238v1.pdf | |
PWC | https://paperswithcode.com/paper/aligning-vector-spaces-with-noisy-supervised |
Repo | https://github.com/NoaKel/Noise-Aware-Alignment |
Framework | none |
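A numpy EM sketch in the spirit of the model described above: clean pairs follow y = Wx plus Gaussian noise, noisy pairs follow a broad background distribution, and the responsibilities reweight the least-squares fit of the mapping. The specific distributional choices and updates are assumptions, not the paper's exact formulation:

```python
import numpy as np
from scipy.special import expit

def noise_aware_alignment(X, Y, n_iter=20):
    """EM sketch: clean pairs follow y = Wx + Gaussian noise, noisy pairs follow a
    broad zero-mean background; alpha is the clean-pair proportion."""
    d = Y.shape[1]
    W = np.linalg.lstsq(X, Y, rcond=None)[0]      # initial map (all pairs trusted)
    alpha, sigma2 = 0.9, 1.0
    bg_var = Y.var()                               # crude per-dimension background variance
    for _ in range(n_iter):
        # E-step: responsibility that each pair is clean
        res = ((Y - X @ W) ** 2).sum(axis=1)
        clean_ll = -0.5 * res / sigma2 - 0.5 * d * np.log(2 * np.pi * sigma2)
        noisy_ll = -0.5 * (Y ** 2).sum(axis=1) / bg_var - 0.5 * d * np.log(2 * np.pi * bg_var)
        r = expit(np.log(alpha) + clean_ll - np.log(1 - alpha) - noisy_ll)
        # M-step: weighted least squares for W, plus alpha and sigma2 updates
        Xw = X * r[:, None]
        W = np.linalg.solve(Xw.T @ X + 1e-6 * np.eye(X.shape[1]), Xw.T @ Y)
        alpha = r.mean()
        sigma2 = (r * res).sum() / (d * r.sum())
    return W, r

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 50))
W_true = rng.normal(size=(50, 50))
Y = X @ W_true + 0.05 * rng.normal(size=(300, 50))
Y[:30] = rng.normal(size=(30, 50)) * 3            # 10% corrupted (noisy) lexicon pairs
W_est, resp = noise_aware_alignment(X, Y)
print("mean responsibility of noisy pairs:", resp[:30].mean())
print("mean responsibility of clean pairs:", resp[30:].mean())
```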