January 25, 2020

Paper Group ANR 1721

Spectrally-truncated kernel ridge regression and its free lunch. High-frequency crowd insights for public safety and congestion control. Improving Sentiment Analysis with Multi-task Learning of Negation. Leveraging Decentralized Artificial Intelligence to Enhance Resilience of Energy Networks. FDDWNet: A Lightweight Convolutional Neural Network for …

Spectrally-truncated kernel ridge regression and its free lunch

Title Spectrally-truncated kernel ridge regression and its free lunch
Authors Arash A. Amini
Abstract Kernel ridge regression (KRR) is a well-known and popular nonparametric regression approach with many desirable properties, including minimax rate-optimality in estimating functions that belong to common reproducing kernel Hilbert spaces (RKHS). The approach, however, is computationally intensive for large data sets, due to the need to operate on a dense $n \times n$ kernel matrix, where $n$ is the sample size. Recently, various approximation schemes for solving KRR have been considered, and some analyzed. Some approaches such as Nyström approximation and sketching have been shown to preserve the rate optimality of KRR. In this paper, we consider the simplest approximation, namely, spectrally truncating the kernel matrix to its largest $r < n$ eigenvalues. We derive an exact expression for the maximum risk of this truncated KRR, over the unit ball of the RKHS. This result can be used to study the exact trade-off between the level of spectral truncation and the regularization parameter. We show that, as long as the RKHS is infinite-dimensional, there is a threshold on $r$ above which the spectrally-truncated KRR surprisingly outperforms the full KRR in terms of the minimax risk, where the minimum is taken over the regularization parameter. This strengthens the existing results on approximation schemes, by showing that not only does one not lose in terms of the rates, but truncation can in fact improve the performance, for all finite samples (above the threshold). Moreover, we show that the implicit regularization achieved by spectral truncation is not a substitute for Hilbert norm regularization. Both are needed to achieve the best performance.
Tasks
Published 2019-06-14
URL https://arxiv.org/abs/1906.06276v2
PDF https://arxiv.org/pdf/1906.06276v2.pdf
PWC https://paperswithcode.com/paper/spectrally-truncated-kernel-ridge-regression
Repo
Framework
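
As a rough sketch of the objects this abstract refers to, the fitted values of full and spectrally truncated KRR can be written side by side. The symbols $\mu_i$, $u_i$, $\lambda$ and $y$ are assumptions introduced here for illustration, and the ridge penalty is taken to enter as $n\lambda$, which is a common convention that may differ from the paper's own scaling.

$$
K = \sum_{i=1}^{n} \mu_i u_i u_i^\top, \qquad \mu_1 \ge \dots \ge \mu_n \ge 0, \qquad K_r = \sum_{i=1}^{r} \mu_i u_i u_i^\top
$$

$$
\hat y = K (K + n\lambda I)^{-1} y = \sum_{i=1}^{n} \frac{\mu_i}{\mu_i + n\lambda}\, u_i u_i^\top y, \qquad
\hat y_r = K_r (K_r + n\lambda I)^{-1} y = \sum_{i=1}^{r} \frac{\mu_i}{\mu_i + n\lambda}\, u_i u_i^\top y
$$

Under these assumptions, truncation sets the shrinkage factor to zero for every component with $i > r$, while $\lambda$ continues to shrink the $r$ retained components. The two mechanisms act on different parts of the spectrum, which is at least consistent with the abstract's remark that one is not a substitute for the other.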

High-frequency crowd insights for public safety and congestion control

Title High-frequency crowd insights for public safety and congestion control
Authors Karthik Nandakumar, Sebastien Blandin, Laura Wynter
Abstract We present results from several projects aimed at enabling the real-time understanding of crowds and their behaviour in the built environment. We make use of CCTV video cameras that are ubiquitous throughout the developed and developing world and as such are able to play the role of a reliable sensing mechanism. We outline the novel methods developed for our crowd insights engine, and illustrate examples of its use in different contexts in the urban landscape. Applications of the technology range from maintaining security in public spaces to quantifying the adequacy of public transport level of service.
Tasks
Published 2019-04-23
URL http://arxiv.org/abs/1904.10180v1
PDF http://arxiv.org/pdf/1904.10180v1.pdf
PWC https://paperswithcode.com/paper/high-frequency-crowd-insights-for-public
Repo
Framework

Improving Sentiment Analysis with Multi-task Learning of Negation

Title Improving Sentiment Analysis with Multi-task Learning of Negation
Authors Jeremy Barnes, Erik Velldal, Lilja Øvrelid
Abstract Sentiment analysis is directly affected by compositional phenomena in language that act on the prior polarity of the words and phrases found in the text. Negation is the most prevalent of these phenomena and in order to correctly predict sentiment, a classifier must be able to identify negation and disentangle the effect that its scope has on the final polarity of a text. This paper proposes a multi-task approach to explicitly incorporate information about negation in sentiment analysis, which we show outperforms learning negation implicitly in a data-driven manner. We describe our approach, a cascading neural architecture with selective sharing of LSTM layers, and show that explicitly training the model with negation as an auxiliary task helps improve the main task of sentiment analysis. The effect is demonstrated across several different standard English-language data sets for both tasks and we analyze several aspects of our system related to its performance, varying types and amounts of input data and different multi-task setups.
Tasks Multi-Task Learning, Sentiment Analysis
Published 2019-06-18
URL https://arxiv.org/abs/1906.07610v2
PDF https://arxiv.org/pdf/1906.07610v2.pdf
PWC https://paperswithcode.com/paper/improving-sentiment-analysis-with-multi-task
Repo
Framework

Leveraging Decentralized Artificial Intelligence to Enhance Resilience of Energy Networks

Title Leveraging Decentralized Artificial Intelligence to Enhance Resilience of Energy Networks
Authors Ahmed Imteaj, M. Hadi Amini, Javad Mohammadi
Abstract This paper reintroduces the notion of resilience in the context of recent issues originating from climate-change-triggered events, including severe hurricanes and wildfires. A recent example is PG&E’s forced power outage to contain wildfire risk, which led to widespread power disruption. This paper focuses on answering two questions: who is responsible for resilience, and how can the monetary value of resilience be quantified? To this end, we first provide preliminary definitions of resilience for power systems. We then investigate the role of natural hazards, especially wildfire, on power system resilience. Finally, we propose a decentralized strategy for a resilient management system using distributed storage and demand response resources. Our proposed high-fidelity model provides utilities, operators, and policymakers with a clearer picture for strategic decision making and preventive decisions.
Tasks Decision Making
Published 2019-11-18
URL https://arxiv.org/abs/1911.07690v1
PDF https://arxiv.org/pdf/1911.07690v1.pdf
PWC https://paperswithcode.com/paper/leveraging-decentralized-artificial
Repo
Framework

FDDWNet: A Lightweight Convolutional Neural Network for Real-time Sementic Segmentation

Title FDDWNet: A Lightweight Convolutional Neural Network for Real-time Sementic Segmentation
Authors Jia Liu, Quan Zhou, Yong Qiang, Bin Kang, Xiaofu Wu, Baoyu Zheng
Abstract This paper introduces a lightweight convolutional neural network, called FDDWNet, for real-time accurate semantic segmentation. In contrast to recent lightweight networks that prefer shallow structures, FDDWNet aims for a deeper network architecture while maintaining faster inference speed and higher segmentation accuracy. Our network uses factorized dilated depth-wise separable convolutions (FDDWC) to learn feature representations from different scale receptive fields with fewer model parameters. Additionally, FDDWNet has multiple branches of skipped connections to gather context cues from intermediate convolution layers. The experiments show that FDDWNet has a model size of only 0.8M while achieving 60 FPS running speed on a single RTX 2080Ti GPU with a 1024x512 input image. The comprehensive experiments demonstrate that our model achieves state-of-the-art results in terms of the speed and accuracy trade-off on the CityScapes and CamVid datasets.
Tasks Semantic Segmentation
Published 2019-11-02
URL https://arxiv.org/abs/1911.00632v2
PDF https://arxiv.org/pdf/1911.00632v2.pdf
PWC https://paperswithcode.com/paper/fddwnet-a-lightweight-convolutional-neural
Repo
Framework
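
To make the "fewer model parameters" claim in this abstract concrete, the following is a minimal sketch that counts weights for three convolution layouts and computes the span of a dilated filter. The function names, the channel width of 64, and the assumed factorization (a k x 1 and a 1 x k depth-wise filter followed by a 1 x 1 point-wise convolution, biases ignored) are illustrative assumptions; the exact FDDWC layout in the paper may differ.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a dense k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """One k x k depth-wise filter per input channel, then a 1 x 1 point-wise mix."""
    return k * k * c_in + c_in * c_out


def factorized_depthwise_separable_params(c_in, c_out, k):
    """Assumed layout: k x 1 and 1 x k depth-wise filters per channel, then 1 x 1 point-wise."""
    return 2 * k * c_in + c_in * c_out


def effective_kernel_size(k, dilation):
    """Span in pixels of a k-tap filter whose taps sit `dilation` pixels apart."""
    return k + (k - 1) * (dilation - 1)


# Hypothetical layer: 64 input channels, 64 output channels, 3 x 3 kernel.
c_in, c_out, k = 64, 64, 3
print("standard:", standard_conv_params(c_in, c_out, k))
print("depth-wise separable:", depthwise_separable_params(c_in, c_out, k))
print("factorized depth-wise separable:", factorized_depthwise_separable_params(c_in, c_out, k))
for d in (1, 2, 4):
    print("dilation", d, "-> effective kernel", effective_kernel_size(k, d))
```

Under these assumptions the factorized layout needs roughly an eighth of the weights of the dense layer, and dilation widens the receptive field without adding any weights, which is the general mechanism the abstract points to.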

Removing Rain in Videos: A Large-scale Database and A Two-stream ConvLSTM Approach

Title Removing Rain in Videos: A Large-scale Database and A Two-stream ConvLSTM Approach
Authors Tie Liu, Mai Xu, Zulin Wang
Abstract Rain removal has recently attracted increasing research attention, as it is able to enhance the visibility of rain videos. However, the existing learning-based rain removal approaches for videos suffer from insufficient training data, especially when applying deep learning to remove rain. In this paper, we establish a large-scale video database for rain removal (LasVR), which consists of 316 rain videos. Then, we observe from our database that there exist temporal correlation of clean content and similar patterns of rain across video frames. According to these two observations, we propose a two-stream convolutional long short-term memory (ConvLSTM) approach for rain removal in videos. The first stream is composed of the subnet for rain detection, while the second stream is the subnet of rain removal that leverages the features from the rain detection subnet. Finally, the experimental results on both synthetic and real rain videos show that the proposed approach performs better than other state-of-the-art approaches.
Tasks Rain Removal
Published 2019-06-06
URL https://arxiv.org/abs/1906.02526v1
PDF https://arxiv.org/pdf/1906.02526v1.pdf
PWC https://paperswithcode.com/paper/removing-rain-in-videos-a-large-scale
Repo
Framework

MIDI-Sandwich2: RNN-based Hierarchical Multi-modal Fusion Generation VAE networks for multi-track symbolic music generation

Title MIDI-Sandwich2: RNN-based Hierarchical Multi-modal Fusion Generation VAE networks for multi-track symbolic music generation
Authors Xia Liang, Junmin Wu, Jing Cao
Abstract Currently, almost all the multi-track music generation models use the Convolutional Neural Network (CNN) to build the generative model, while the Recurrent Neural Network (RNN) based models cannot be applied to this task. In view of the above problem, this paper proposes an RNN-based Hierarchical Multi-modal Fusion Generation Variational Autoencoder (VAE) network, MIDI-Sandwich2, for multi-track symbolic music generation. Inspired by VQ-VAE2, MIDI-Sandwich2 expands the dimension of the original hierarchical model by using multiple independent Binary Variational Autoencoder (BVAE) models without sharing weights to process the information of each track. Then, with multi-modal fusion technology, the upper layer named Multi-modal Fusion Generation VAE (MFG-VAE) combines the latent space vectors generated by the respective tracks, and uses the decoder to perform the ascending dimension reconstruction to simulate the inverse operation of multi-modal fusion, multi-modal generation, so as to realize the RNN-based multi-track symbolic music generation. For the multi-track format pianoroll, we also improve the output binarization method of MuseGAN, which solves the problem that the refinement step of the original scheme is difficult to differentiate and the gradient is hard to descend, making the generated song more expressive. The model is validated on the Lakh Pianoroll Dataset (LPD) multi-track dataset. Compared to MuseGAN, MIDI-Sandwich2 not only generates harmonious multi-track music, but its generation quality is also close to the state-of-the-art level. At the same time, by using the VAE to restore songs, the semi-generated songs reproduced by MIDI-Sandwich2 are more beautiful than the pure autogeneration music generated by MuseGAN. Both the code and the audition audio samples are open source on https://github.com/LiangHsia/MIDI-S2.
Tasks Music Generation
Published 2019-09-08
URL https://arxiv.org/abs/1909.03522v1
PDF https://arxiv.org/pdf/1909.03522v1.pdf
PWC https://paperswithcode.com/paper/midi-sandwich2-rnn-based-hierarchical-multi
Repo
Framework

The CL-SciSumm Shared Task 2018: Results and Key Insights

Title The CL-SciSumm Shared Task 2018: Results and Key Insights
Authors Kokil Jaidka, Michihiro Yasunaga, Muthu Kumar Chandrasekaran, Dragomir Radev, Min-Yen Kan
Abstract This overview describes the official results of the CL-SciSumm Shared Task 2018 – the first medium-scale shared task on scientific document summarization in the computational linguistics (CL) domain. This year, the dataset comprised 60 annotated sets of citing and reference papers from the open access research papers in the CL domain. The Shared Task was organized as a part of the 41st Annual Conference of the Special Interest Group in Information Retrieval (SIGIR), held in Ann Arbor, USA in July 2018. We compare the participating systems in terms of two evaluation metrics. The annotated dataset and evaluation scripts can be accessed and used by the community from: https://github.com/WING-NUS/scisumm-corpus.
Tasks Document Summarization, Information Retrieval
Published 2019-09-02
URL https://arxiv.org/abs/1909.00764v1
PDF https://arxiv.org/pdf/1909.00764v1.pdf
PWC https://paperswithcode.com/paper/the-cl-scisumm-shared-task-2018-results-and
Repo
Framework

Transferable Knowledge for Low-cost Decision Making in Cloud Environments

Title Transferable Knowledge for Low-cost Decision Making in Cloud Environments
Authors Faiza Samreen, Gordon S Blair, Yehia Elkhatib
Abstract Users of cloud computing are increasingly overwhelmed with the wide range of providers and services offered by each provider. As such, many users select cloud services based on description alone. An emerging alternative is to use a decision support system (DSS), which typically relies on gaining insights from observational data in order to assist a customer in making decisions regarding optimal deployment or redeployment of cloud applications. The primary activity of such systems is the generation of a prediction model (e.g. using machine learning), which requires a significantly large amount of training data. However, considering the varying architectures of applications, cloud providers, and cloud offerings, this activity is not sustainable as it incurs additional time and cost to collect training data and subsequently train the models. We overcome this through developing a Transfer Learning (TL) approach where the knowledge (in the form of the prediction model and associated data set) gained from running an application on a particular cloud infrastructure is transferred in order to substantially reduce the overhead of building new models for the performance of new applications and/or cloud infrastructures. In this paper, we present our approach and evaluate it through extensive experimentation involving three real world applications over two major public cloud providers, namely Amazon and Google. Our evaluation shows that our novel two-mode TL scheme increases overall efficiency, with a 60% reduction in the time and cost of generating a new prediction model. We test this under a number of cross-application and cross-cloud scenarios.
Tasks Decision Making, Transfer Learning
Published 2019-05-07
URL https://arxiv.org/abs/1905.02448v1
PDF https://arxiv.org/pdf/1905.02448v1.pdf
PWC https://paperswithcode.com/paper/transferable-knowledge-for-low-cost-decision
Repo
Framework

DeepAISE – An End-to-End Development and Deployment of a Recurrent Neural Survival Model for Early Prediction of Sepsis

Title DeepAISE – An End-to-End Development and Deployment of a Recurrent Neural Survival Model for Early Prediction of Sepsis
Authors Supreeth P. Shashikumar, Christopher Josef, Ashish Sharma, Shamim Nemati
Abstract Sepsis, a dysregulated immune system response to infection, is among the leading causes of morbidity, mortality, and cost overruns in the Intensive Care Unit (ICU). Early prediction of sepsis can improve situational awareness amongst clinicians and facilitate timely, protective interventions. While the application of predictive analytics in ICU patients has shown early promising results, much of the work has been encumbered by high false-alarm rates. Efforts to improve specificity have been limited by several factors, most notably the difficulty of labeling sepsis onset time and the low prevalence of septic-events in the ICU. Here, we present DeepAISE (Deep Artificial Intelligence Sepsis Expert), a recurrent neural survival model for the early prediction of sepsis. We show that by coupling a clinical criterion for defining sepsis onset time with a treatment policy (e.g., initiation of antibiotics within one hour of meeting the criterion), one may rank the relative utility of various criteria through offline policy evaluation. Given the optimal criterion, DeepAISE automatically learns predictive features related to higher-order interactions and temporal patterns among clinical risk factors that maximize the data likelihood of observed time to septic events. DeepAISE has been incorporated into a clinical workflow, which provides real-time hourly sepsis risk scores. A comparative study of four baseline models indicates that DeepAISE produces the most accurate predictions (AUC=0.90 and 0.87) and the lowest false alarm rates (FAR=0.20 and 0.26) in two separate cohorts (internal and external, respectively), while simultaneously producing interpretable representations of the clinical time series and risk factors.
Tasks Time Series
Published 2019-08-10
URL https://arxiv.org/abs/1908.04759v1
PDF https://arxiv.org/pdf/1908.04759v1.pdf
PWC https://paperswithcode.com/paper/deepaise-an-end-to-end-development-and
Repo
Framework
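
The AUC figures quoted in this abstract can be read as the probability that a randomly chosen positive case receives a higher risk score than a randomly chosen negative case. The sketch below computes that pairwise definition directly; the function name `pairwise_auc` and the scores and labels are made up for illustration and are not data or code from the paper.

```python
def pairwise_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical hourly risk scores for five patients (1 = septic, 0 = not septic).
scores = [0.9, 0.4, 0.6, 0.3, 0.2]
labels = [1, 1, 0, 0, 0]
print(pairwise_auc(scores, labels))
```

This quadratic-time form is meant only to show what the metric measures; rank-based implementations give the same value more efficiently on large cohorts.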

Certain Answers to a SPARQL Query over a Knowledge Base (extended version)

Title Certain Answers to a SPARQL Query over a Knowledge Base (extended version)
Authors Julien Corman, Guohui Xiao
Abstract Ontology-Mediated Query Answering (OMQA) is a well-established framework to answer queries over an RDFS or OWL Knowledge Base (KB). OMQA was originally designed for unions of conjunctive queries (UCQs), and based on certain answers. More recently, OMQA has been extended to SPARQL queries, but to our knowledge, none of the efforts made in this direction (either in the literature, or the so-called SPARQL entailment regimes) is able to capture both certain answers for UCQs and the standard interpretation of SPARQL over a plain graph. We formalize these as requirements to be met by any semantics aiming at conciliating certain answers and SPARQL answers, and define three additional requirements, which generalize to KBs some basic properties of SPARQL answers. Then we show that a semantics can be defined that satisfies all requirements for SPARQL queries with SELECT, UNION, and OPTIONAL, and for DLs with the canonical model property. We also investigate combined complexity for query answering under such a semantics over DL-Lite$_{\mathcal{R}}$ KBs. In particular, we show for different fragments of SPARQL that known upper-bounds for query answering over a plain graph are matched.
Tasks
Published 2019-11-06
URL https://arxiv.org/abs/1911.02668v3
PDF https://arxiv.org/pdf/1911.02668v3.pdf
PWC https://paperswithcode.com/paper/certain-answers-to-a-sparql-query-over-a
Repo
Framework

Searching for an (un)stable equilibrium: experiments in training generative models without data

Title Searching for an (un)stable equilibrium: experiments in training generative models without data
Authors Terence Broad, Mick Grierson
Abstract This paper details a developing artistic practice around an ongoing series of works called (un)stable equilibrium. These works are the product of using modern machine toolkits to train generative models without data, an approach akin to traditional generative art where dynamical systems are explored intuitively for their latent generative possibilities. We discuss some of the guiding principles that have been learnt in the process of experimentation, present details of the implementation of the first series of works and discuss possibilities for future experimentation.
Tasks
Published 2019-10-06
URL https://arxiv.org/abs/1910.02409v1
PDF https://arxiv.org/pdf/1910.02409v1.pdf
PWC https://paperswithcode.com/paper/searching-for-an-unstable-equilibrium
Repo
Framework

Unifying machine learning and quantum chemistry – a deep neural network for molecular wavefunctions

Title Unifying machine learning and quantum chemistry – a deep neural network for molecular wavefunctions
Authors K. T. Schütt, M. Gastegger, A. Tkatchenko, K. -R. Müller, R. J. Maurer
Abstract Machine learning advances chemistry and materials science by enabling large-scale exploration of chemical space based on quantum chemical calculations. While these models supply fast and accurate predictions of atomistic chemical properties, they do not explicitly capture the electronic degrees of freedom of a molecule, which limits their applicability for reactive chemistry and chemical analysis. Here we present a deep learning framework for the prediction of the quantum mechanical wavefunction in a local basis of atomic orbitals from which all other ground-state properties can be derived. This approach retains full access to the electronic structure via the wavefunction at force field-like efficiency and captures quantum mechanics in an analytically differentiable representation. On several examples, we demonstrate that this opens promising avenues to perform inverse design of molecular structures for target electronic property optimisation and a clear path towards increased synergy of machine learning and quantum chemistry.
Tasks
Published 2019-06-24
URL https://arxiv.org/abs/1906.10033v1
PDF https://arxiv.org/pdf/1906.10033v1.pdf
PWC https://paperswithcode.com/paper/unifying-machine-learning-and-quantum
Repo
Framework

Compliance Change Tracking in Business Process Services

Title Compliance Change Tracking in Business Process Services
Authors Srikanth G Tamilselvam, Ankush Gupta, Arvind Agarwal
Abstract Regulatory compliance is an organization’s adherence to laws, regulations, guidelines and specifications relevant to its business. Compliance officers responsible for maintaining adherence constantly struggle to keep up with the large number of changes in regulatory requirements. Keeping up with the changes entails two main tasks: fetching the regulatory announcements that actually contain changes of interest, and incorporating those changes in the business process. In this paper we focus on the first task, and present a Compliance Change Tracking System that gathers regulatory announcements from government sites, news sites, and email subscriptions; classifies their importance, i.e., Actionability, through a hierarchical classifier; and classifies their business process applicability through a multi-class classifier. For these classifiers, we experiment with several approaches such as vanilla classification methods (e.g. Naive Bayes, logistic regression etc.), hierarchical classification methods, a rule-based approach, and a hybrid approach with various preprocessing and feature selection methods; and show that despite the richness of other models, a simple hierarchical classification with bag-of-words features works the best for the Actionability classifier and multi-class logistic regression works the best for the Applicability classifier. The system has been deployed in global delivery centers, and has received positive feedback from payroll compliance officers.
Tasks Feature Selection
Published 2019-08-20
URL https://arxiv.org/abs/1908.07190v1
PDF https://arxiv.org/pdf/1908.07190v1.pdf
PWC https://paperswithcode.com/paper/compliance-change-tracking-in-business
Repo
Framework

Deep network as memory space: complexity, generalization, disentangled representation and interpretability

Title Deep network as memory space: complexity, generalization, disentangled representation and interpretability
Authors X. Dong, L. Zhou
Abstract By bridging deep networks and physics, the programme of geometrization of deep networks was proposed as a framework for the interpretability of deep learning systems. Following this programme we can apply two key ideas of physics, the geometrization of physics and the least action principle, to deep networks and deliver a new picture of deep networks: deep networks as memory space of information, where the capacity, robustness and efficiency of the memory are closely related with the complexity, generalization and disentanglement of deep networks. The key components of this understanding include: (1) a Fisher-metric-based formulation of the network complexity; (2) the least action (complexity = action) principle on deep networks; and (3) the geometry built on deep network configurations. We will show how this picture will bring us a new understanding of the interpretability of deep learning systems.
Tasks
Published 2019-07-12
URL https://arxiv.org/abs/1907.06572v1
PDF https://arxiv.org/pdf/1907.06572v1.pdf
PWC https://paperswithcode.com/paper/deep-network-as-memory-space-complexity
Repo
Framework