January 30, 2020

Paper Group ANR 242

Tensor-Ring Nuclear Norm Minimization and Application for Visual Data Completion. Online News Media Website Ranking Using User Generated Content. Lifelong Bayesian Optimization. Leveraging Medical Visual Question Answering with Supporting Facts. Structure Learning for Neural Module Networks. Autoregressive Models: What Are They Good For?. Potential …

Tensor-Ring Nuclear Norm Minimization and Application for Visual Data Completion


Title	Tensor-Ring Nuclear Norm Minimization and Application for Visual Data Completion
Authors	Jinshi Yu, Chao Li, Qibin Zhao, Guoxu Zhou
Abstract	Tensor ring (TR) decomposition has been successfully used to obtain state-of-the-art performance in the visual data completion problem. However, the existing TR-based completion methods are severely non-convex and computationally demanding. In addition, determining the optimal TR rank is difficult in practice. To overcome these drawbacks, we first introduce a class of new tensor nuclear norms by using tensor circular unfolding. Then we theoretically establish a connection between the rank of the circularly-unfolded matrices and the TR ranks. We also develop an efficient tensor completion algorithm by minimizing the proposed tensor nuclear norm. Extensive experimental results demonstrate that our proposed tensor completion method outperforms the conventional tensor completion methods in the image/video in-painting problem with striped missing values.
Tasks
Published	2019-03-21
URL	http://arxiv.org/abs/1903.08888v1
PDF	http://arxiv.org/pdf/1903.08888v1.pdf
PWC	https://paperswithcode.com/paper/tensor-ring-nuclear-norm-minimization-and
Repo
Framework
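
The abstract above builds its nuclear norm on tensor circular unfolding. The following is a minimal sketch of that idea, assuming one common convention: cyclically shift the modes so that mode k comes first, then group the first d modes as matrix rows. The function names `circular_unfold` and `circular_nuclear_norm`, the 0-based mode index, and the unweighted sum over modes are illustrative assumptions; the paper's exact indexing and weighting may differ.

```python
import numpy as np


def circular_unfold(x, k, d):
    """Unfold tensor x into a matrix after a cyclic shift of its modes.

    Assumed convention (hypothetical, 0-based): mode k becomes the first
    mode, and the first d shifted modes index the rows of the matrix.
    """
    n = x.ndim
    order = [(k + i) % n for i in range(n)]      # cyclic permutation of modes
    shifted = np.transpose(x, order)
    rows = int(np.prod(shifted.shape[:d]))       # size of the row index
    return shifted.reshape(rows, -1)


def circular_nuclear_norm(x, d):
    """Sum of matrix nuclear norms over all cyclic starting modes (unweighted)."""
    return sum(
        np.linalg.norm(circular_unfold(x, k, d), ord="nuc")
        for k in range(x.ndim)
    )


# Usage: a 4th-order tensor unfolded from mode 1 with two row modes.
x = np.arange(120, dtype=float).reshape(2, 3, 4, 5)
m = circular_unfold(x, k=1, d=2)   # rows: modes (1, 2); columns: modes (3, 0)
```

A completion algorithm would minimize a norm of this kind subject to agreement with the observed entries; that optimization step is not sketched here.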

Online News Media Website Ranking Using User Generated Content


Title	Online News Media Website Ranking Using User Generated Content
Authors	Samaneh Karimi, Azadeh Shakery, Rakesh Verma
Abstract	News media websites are important online resources that have drawn great attention from text mining researchers. The main aim of this study is to propose a framework for ranking online news websites from different viewpoints. The ranking of news websites is useful information, which can benefit many news-related tasks such as news retrieval and news recommendation. In the proposed framework, the ranking of news websites is obtained by calculating three measures that are introduced in the paper and based on user-generated content. Each proposed measure is concerned with the performance of news websites from a particular viewpoint: the completeness of news reports, the diversity of events covered by the website, and its speed. The use of user-generated content, as partly unbiased, real-time, and low-cost content on the web, distinguishes the proposed news website ranking framework from the literature. The results obtained for three prominent news websites, BBC, CNN, and NYTimes, show that BBC has the best performance in terms of news completeness and speed, and NYTimes has the best diversity in comparison with the other two websites.
Tasks
Published	2019-10-28
URL	https://arxiv.org/abs/1910.12441v1
PDF	https://arxiv.org/pdf/1910.12441v1.pdf
PWC	https://paperswithcode.com/paper/online-news-media-website-ranking-using-user
Repo
Framework

Lifelong Bayesian Optimization


Title	Lifelong Bayesian Optimization
Authors	Yao Zhang, James Jordon, Ahmed M. Alaa, Mihaela van der Schaar
Abstract	Automatic Machine Learning (Auto-ML) systems tackle the problem of automating the design of prediction models or pipelines for data science. In this paper, we present Lifelong Bayesian Optimization (LBO), an online, multitask Bayesian optimization (BO) algorithm designed to solve the problem of model selection for datasets arriving and evolving over time. To be suitable for “lifelong” Bayesian Optimization, an algorithm needs to scale with the ever increasing number of acquisitions and should be able to leverage past optimizations in learning the current best model. We cast the problem of model selection as a black-box function optimization problem. In LBO, we exploit the correlation between functions by using components of previously learned functions to speed up the learning process for newly arriving datasets. Experiments on real and synthetic data show that LBO outperforms standard BO algorithms applied repeatedly on the data.
Tasks	Model Selection
Published	2019-05-29
URL	https://arxiv.org/abs/1905.12280v2
PDF	https://arxiv.org/pdf/1905.12280v2.pdf
PWC	https://paperswithcode.com/paper/lifelong-bayesian-optimization
Repo
Framework

Leveraging Medical Visual Question Answering with Supporting Facts


Title	Leveraging Medical Visual Question Answering with Supporting Facts
Authors	Tomasz Kornuta, Deepta Rajan, Chaitanya Shivade, Alexis Asseman, Ahmet S. Ozcan
Abstract	In this working notes paper, we describe IBM Research AI (Almaden) team’s participation in the ImageCLEF 2019 VQA-Med competition. The challenge consists of four question-answering tasks based on radiology images. The diversity of imaging modalities, organs and disease types combined with a small imbalanced training set made this a highly complex problem. To overcome these difficulties, we implemented a modular pipeline architecture that utilized transfer learning and multi-task learning. Our findings led to the development of a novel model called Supporting Facts Network (SFN). The main idea behind SFN is to cross-utilize information from upstream tasks to improve the accuracy on harder downstream ones. This approach significantly improved the scores achieved in the validation set (18 point improvement in F-1 score). Finally, we submitted four runs to the competition and were ranked seventh.
Tasks	Multi-Task Learning, Question Answering, Transfer Learning, Visual Question Answering
Published	2019-05-28
URL	https://arxiv.org/abs/1905.12008v1
PDF	https://arxiv.org/pdf/1905.12008v1.pdf
PWC	https://paperswithcode.com/paper/leveraging-medical-visual-question-answering
Repo
Framework

Structure Learning for Neural Module Networks


Title	Structure Learning for Neural Module Networks
Authors	Vardaan Pahuja, Jie Fu, Sarath Chandar, Christopher J. Pal
Abstract	Neural Module Networks, originally proposed for the task of visual question answering, are a class of neural network architectures that involve human-specified neural modules, each designed for a specific form of reasoning. In current formulations of such networks only the parameters of the neural modules and/or the order of their execution is learned. In this work, we further expand this approach and also learn the underlying internal structure of modules in terms of the ordering and combination of simple and elementary arithmetic operators. Our results show that one is indeed able to simultaneously learn both internal module structure and module sequencing without extra supervisory signals for module execution sequencing. With this approach, we report performance comparable to models using hand-designed modules.
Tasks	Question Answering, Visual Question Answering
Published	2019-05-27
URL	https://arxiv.org/abs/1905.11532v1
PDF	https://arxiv.org/pdf/1905.11532v1.pdf
PWC	https://paperswithcode.com/paper/structure-learning-for-neural-module-networks
Repo
Framework

Autoregressive Models: What Are They Good For?


Title	Autoregressive Models: What Are They Good For?
Authors	Murtaza Dalal, Alexander C. Li, Rohan Taori
Abstract	Autoregressive (AR) models have become a popular tool for unsupervised learning, achieving state-of-the-art log likelihood estimates. We investigate the use of AR models as density estimators in two settings – as a learning signal for image translation, and as an outlier detector – and find that these density estimates are much less reliable than previously thought. We examine the underlying optimization issues from both an empirical and theoretical perspective, and provide a toy example that illustrates the problem. Overwhelmingly, we find that density estimates do not correlate with perceptual quality and are unhelpful for downstream tasks.
Tasks
Published	2019-10-17
URL	https://arxiv.org/abs/1910.07737v1
PDF	https://arxiv.org/pdf/1910.07737v1.pdf
PWC	https://paperswithcode.com/paper/autoregressive-models-what-are-they-good-for
Repo
Framework

Potential Passenger Flow Prediction: A Novel Study for Urban Transportation Development


Title	Potential Passenger Flow Prediction: A Novel Study for Urban Transportation Development
Authors	Yongshun Gong, Zhibin Li, Jian Zhang, Wei Liu, Jinfeng Yi
Abstract	Recently, practical applications for passenger flow prediction have brought many benefits to urban transportation development. With ongoing urbanization, a real-world demand from transportation managers is to construct a new metro station in a city area where none was previously planned. Before constructing a new station, authorities are interested in the future volume of commuters and in estimating how the station would affect other areas. In this paper, this specific problem is termed potential passenger flow (PPF) prediction, which is a novel and important study connected with urban computing and intelligent transportation systems. For example, an accurate PPF predictor can provide invaluable knowledge to designers, such as advice on station scale and the influence on other areas. To address this problem, we propose a multi-view localized correlation learning method. The core idea of our strategy is to learn the passenger flow correlations between the target areas and their localized areas with adaptive weights. To improve the prediction accuracy, other domain knowledge is incorporated via a multi-view learning process. We conduct extensive experiments to evaluate the effectiveness of our method with real-world official transportation datasets. The results demonstrate that our method achieves excellent performance compared with other available baselines. In addition, our method can provide an effective solution to the cold-start problem in recommender systems, as shown by its superior experimental results.
Tasks	Multi-View Learning, Recommendation Systems
Published	2019-12-07
URL	https://arxiv.org/abs/1912.03440v1
PDF	https://arxiv.org/pdf/1912.03440v1.pdf
PWC	https://paperswithcode.com/paper/potential-passenger-flow-prediction-a-novel
Repo
Framework

Personal VAD: Speaker-Conditioned Voice Activity Detection


Title	Personal VAD: Speaker-Conditioned Voice Activity Detection
Authors	Shaojin Ding, Quan Wang, Shuo-yiin Chang, Li Wan, Ignacio Lopez Moreno
Abstract	In this paper, we propose “personal VAD”, a system to detect the voice activity of a target speaker at the frame level. This system is useful for gating the inputs to a streaming on-device speech recognition system, such that it only triggers for the target user, which helps reduce the computational cost and battery consumption. We achieve this by training a VAD-like neural network that is conditioned on the target speaker embedding or the speaker verification score. For each frame, personal VAD outputs the probabilities for three classes: non-speech, target speaker speech, and non-target speaker speech. Under our optimal setup, we are able to train a model with 130K parameters that outperforms a baseline system where individually trained standard VAD and speaker recognition networks are combined to perform the same task.
Tasks	Action Detection, Activity Detection, Speaker Recognition, Speaker Verification, Speech Recognition
Published	2019-08-12
URL	https://arxiv.org/abs/1908.04284v3
PDF	https://arxiv.org/pdf/1908.04284v3.pdf
PWC	https://paperswithcode.com/paper/personal-vad-speaker-conditioned-voice
Repo
Framework
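
The abstract above describes per-frame probabilities over three classes that gate a downstream recognizer. The following is a minimal sketch of only that gating step, assuming the network's per-frame logits are already available. The class ordering, the `gate_frames` helper, and the 0.5 threshold are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

# Assumed class order for illustration; the paper's ordering is not stated here.
CLASSES = ("non-speech", "target speaker speech", "non-target speaker speech")
TARGET = 1  # index of "target speaker speech" in CLASSES


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def gate_frames(frame_logits, threshold=0.5):
    """Return a boolean mask of frames to forward to the recognizer.

    frame_logits: array of shape (num_frames, 3), one row of logits per frame.
    A frame passes when its target-speaker probability reaches the threshold.
    """
    probs = softmax(np.asarray(frame_logits, dtype=float))
    return probs[:, TARGET] >= threshold


# Usage: three frames, each strongly favouring a different class.
mask = gate_frames([[5.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 5.0]])
```

Only the frame whose target-speaker probability clears the threshold is passed on, which is how such a gate would save computation on frames from silence or other speakers.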

Affine Invariant Covariance Estimation for Heavy-Tailed Distributions


Title	Affine Invariant Covariance Estimation for Heavy-Tailed Distributions
Authors	Dmitrii Ostrovskii, Alessandro Rudi
Abstract	In this work we provide an estimator for the covariance matrix of a heavy-tailed multivariate distribution. We prove that the proposed estimator $\widehat{\mathbf{S}}$ admits an affine-invariant bound of the form $(1-\varepsilon) \mathbf{S} \preccurlyeq \widehat{\mathbf{S}} \preccurlyeq (1+\varepsilon) \mathbf{S}$ with high probability, where $\mathbf{S}$ is the unknown covariance matrix, and $\preccurlyeq$ is the positive semidefinite order on symmetric matrices. The result only requires the existence of fourth-order moments, and allows for $\varepsilon = O(\sqrt{\kappa^4 d\log(d/\delta)/n})$ where $\kappa^4$ is a measure of kurtosis of the distribution, $d$ is the dimensionality of the space, $n$ is the sample size, and $1-\delta$ is the desired confidence level. More generally, we can allow for regularization with level $\lambda$, in which case $d$ is replaced with the degrees of freedom number. Denoting by $\text{cond}(\mathbf{S})$ the condition number of $\mathbf{S}$, the computational cost of the novel estimator is $O(d^2 n + d^3\log(\text{cond}(\mathbf{S})))$, which is comparable to the cost of the sample covariance estimator in the statistically interesting regime $n \ge d$. We consider applications of our estimator to eigenvalue estimation with relative error, and to ridge regression with heavy-tailed random design.
Tasks
Published	2019-02-08
URL	https://arxiv.org/abs/1902.03086v2
PDF	https://arxiv.org/pdf/1902.03086v2.pdf
PWC	https://paperswithcode.com/paper/affine-invariant-covariance-estimation-for
Repo
Framework
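
The two-sided bound in the abstract above can be checked numerically for any given pair of matrices. The following sketch computes the smallest $\varepsilon$ for which $(1-\varepsilon)\mathbf{S} \preccurlyeq \widehat{\mathbf{S}} \preccurlyeq (1+\varepsilon)\mathbf{S}$ holds, assuming $\mathbf{S}$ is symmetric positive definite. It uses the standard equivalence with the eigenvalues of $\mathbf{S}^{-1/2}\widehat{\mathbf{S}}\mathbf{S}^{-1/2}$ lying in $[1-\varepsilon, 1+\varepsilon]$. The function name `relative_spectral_error` is hypothetical, and this is only a check of the bound, not the paper's estimator.

```python
import numpy as np


def relative_spectral_error(s_hat, s):
    """Smallest eps with (1 - eps) S <= S_hat <= (1 + eps) S in the PSD order.

    Assumes s is symmetric positive definite and s_hat is symmetric.
    """
    eigvals, eigvecs = np.linalg.eigh(s)
    inv_sqrt = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T   # S^{-1/2}
    whitened = inv_sqrt @ s_hat @ inv_sqrt
    whitened = (whitened + whitened.T) / 2                      # remove round-off asymmetry
    spectrum = np.linalg.eigvalsh(whitened)
    return float(np.max(np.abs(spectrum - 1.0)))


# Usage: the error is unchanged by an invertible linear map A of the data,
# which sends S to A S A^T and S_hat to A S_hat A^T.
s = np.array([[2.0, 1.0], [1.0, 3.0]])
s_hat = 1.1 * s
a = np.array([[1.0, 2.0], [0.0, 3.0]])
eps = relative_spectral_error(s_hat, s)
eps_mapped = relative_spectral_error(a @ s_hat @ a.T, a @ s @ a.T)
```

The equality of `eps` and `eps_mapped` (up to floating-point error) is what makes a bound of this form affine-invariant, in contrast to a bound on an operator-norm difference, which depends on the scale of the coordinates.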

Tell Me About Yourself: Using an AI-Powered Chatbot to Conduct Conversational Surveys with Open-ended Questions


Title	Tell Me About Yourself: Using an AI-Powered Chatbot to Conduct Conversational Surveys with Open-ended Questions
Authors	Ziang Xiao, Michelle X. Zhou, Q. Vera Liao, Gloria Mark, Changyan Chi, Wenxi Chen, Huahai Yang
Abstract	The rise of increasingly powerful chatbots offers a new way to collect information through conversational surveys, where a chatbot asks open-ended questions, interprets a user’s free-text responses, and probes answers whenever needed. To investigate the effectiveness and limitations of such a chatbot in conducting surveys, we conducted a field study involving about 600 participants. In this study with mostly open-ended questions, half of the participants took a typical online survey on Qualtrics and the other half interacted with an AI-powered chatbot to complete a conversational survey. Our detailed analysis of over 5200 free-text responses revealed that the chatbot drove a significantly higher level of participant engagement and elicited significantly better quality responses measured by Gricean Maxims in terms of their informativeness, relevance, specificity, and clarity. Based on our results, we discuss design implications for creating AI-powered chatbots to conduct effective surveys and beyond.
Tasks	Chatbot
Published	2019-05-25
URL	https://arxiv.org/abs/1905.10700v2
PDF	https://arxiv.org/pdf/1905.10700v2.pdf
PWC	https://paperswithcode.com/paper/tell-me-about-yourself-using-an-ai-powered
Repo
Framework

Machine Learning Systems for Highly-Distributed and Rapidly-Growing Data


Title	Machine Learning Systems for Highly-Distributed and Rapidly-Growing Data
Authors	Kevin Hsieh
Abstract	The usability and practicality of any machine learning (ML) application are largely influenced by two critical but hard-to-attain factors: low latency and low cost. Unfortunately, achieving low latency and low cost is very challenging when ML depends on real-world data that are highly distributed and rapidly growing (e.g., data collected by mobile phones and video cameras all over the world). Such real-world data pose many challenges in communication and computation. For example, when training data are distributed across data centers that span multiple continents, communication among data centers can easily overwhelm the limited wide-area network bandwidth, leading to prohibitively high latency and high cost. In this dissertation, we demonstrate that the latency and cost of ML on highly-distributed and rapidly-growing data can be improved by one to two orders of magnitude by designing ML systems that exploit the characteristics of ML algorithms, ML model structures, and ML training/serving data. We support this thesis statement with three contributions. First, we design a system that provides both low-latency and low-cost ML serving (inferencing) over large-scale and continuously-growing datasets, such as videos. Second, we build a system that makes ML training over geo-distributed datasets as fast as training within a single data center. Third, we present a first detailed study and a system-level solution on a fundamental and largely overlooked problem: ML training over non-IID (i.e., not independent and identically distributed) data partitions (e.g., facial images collected by cameras vary according to the demographics of each camera’s location).
Tasks
Published	2019-10-18
URL	https://arxiv.org/abs/1910.08663v1
PDF	https://arxiv.org/pdf/1910.08663v1.pdf
PWC	https://paperswithcode.com/paper/machine-learning-systems-for-highly
Repo
Framework

ConTrOn: Continuously Trained Ontology based on Technical Data Sheets and Wikidata


Title	ConTrOn: Continuously Trained Ontology based on Technical Data Sheets and Wikidata
Authors	Kobkaew Opasjumruskit, Diana Peters, Sirko Schindler
Abstract	In engineering projects involving various parts from global suppliers, one common task is to determine which parts are best suited for the project requirements. Information about specific parts’ characteristics is published in so-called data sheets. However, these data sheets are oftentimes only published in textual form, e.g., as a PDF. Hence, they have to be transformed into a machine-interpretable format. This transformation process still requires a lot of manual intervention and is prone to errors. Automated approaches make use of ontologies to capture the given domain and thus improve automated information extraction from the data sheets. However, ontologies rely solely on the experiences and perspectives of their creators at the time of creation and cannot accumulate knowledge over time on their own. This paper presents ConTrOn – Continuously Trained Ontology – a system that automatically augments ontologies. ConTrOn tackles terminology problems by combining the knowledge extracted from data sheets with an ontology created by domain experts and external knowledge bases such as WordNet and Wikidata. To demonstrate how the enriched ontology can improve the information extraction process, we selected data sheets from spacecraft development as a use case. The evaluation results show that the amount of information extracted from data sheets based on ontologies is significantly increased after the ontology enrichment.
Tasks
Published	2019-06-16
URL	https://arxiv.org/abs/1906.06752v1
PDF	https://arxiv.org/pdf/1906.06752v1.pdf
PWC	https://paperswithcode.com/paper/contron-continuously-trained-ontology-based
Repo
Framework

Multi-timescale Trajectory Prediction for Abnormal Human Activity Detection


Title	Multi-timescale Trajectory Prediction for Abnormal Human Activity Detection
Authors	Royston Rodrigues, Neha Bhargava, Rajbabu Velmurugan, Subhasis Chaudhuri
Abstract	A classical approach to abnormal activity detection is to learn a representation for normal activities from the training data and then use this learned representation to detect abnormal activities while testing. Typically, the methods based on this approach operate at a fixed timescale: either a single time-instant (e.g., frame-based) or a constant time duration (e.g., video-clip based). But human abnormal activities can take place at different timescales. For example, jumping is a short-term anomaly and loitering is a long-term anomaly in a surveillance scenario. A single, pre-defined timescale is not enough to capture the wide range of anomalies occurring with different time durations. In this paper, we propose a multi-timescale model to capture the temporal dynamics at different timescales. In particular, the proposed model makes future and past predictions at different timescales for a given input pose trajectory. The model is multi-layered, where intermediate layers are responsible for generating predictions corresponding to different timescales. These predictions are combined to detect abnormal activities. In addition, we introduce an abnormal activity data-set for research use that contains 4,83,566 annotated frames. The data-set will be made available at https://rodrigues-royston.github.io/Multi-timescale_Trajectory_Prediction/. Our experiments show that the proposed model can capture anomalies of different time durations and outperforms existing methods.
Tasks	Action Detection, Activity Detection, Trajectory Prediction
Published	2019-08-12
URL	https://arxiv.org/abs/1908.04321v1
PDF	https://arxiv.org/pdf/1908.04321v1.pdf
PWC	https://paperswithcode.com/paper/multi-timescale-trajectory-prediction-for
Repo
Framework

Understanding artificial intelligence ethics and safety


Title	Understanding artificial intelligence ethics and safety
Authors	David Leslie
Abstract	A remarkable time of human promise has been ushered in by the convergence of the ever-expanding availability of big data, the soaring speed and stretch of cloud computing platforms, and the advancement of increasingly sophisticated machine learning algorithms. Innovations in AI are already leaving a mark on government by improving the provision of essential social goods and services from healthcare, education, and transportation to food supply, energy, and environmental management. These bounties are likely just the start. The prospect that progress in AI will help government to confront some of its most urgent challenges is exciting, but legitimate worries abound. As with any new and rapidly evolving technology, a steep learning curve means that mistakes and miscalculations will be made and that both unanticipated and harmful impacts will occur. This guide, written for department and delivery leads in the UK public sector and adopted by the British Government in its publication, ‘Using AI in the Public Sector,’ identifies the potential harms caused by AI systems and proposes concrete, operationalisable measures to counteract them. It stresses that public sector organisations can anticipate and prevent these potential harms by stewarding a culture of responsible innovation and by putting in place governance processes that support the design and implementation of ethical, fair, and safe AI systems. It also highlights the need for algorithmically supported outcomes to be interpretable by their users and made understandable to decision subjects in clear, non-technical, and accessible ways. Finally, it builds out a vision of human-centred and context-sensitive implementation that gives a central role to communication, evidence-based reasoning, situational awareness, and moral justifiability.
Tasks
Published	2019-06-11
URL	https://arxiv.org/abs/1906.05684v1
PDF	https://arxiv.org/pdf/1906.05684v1.pdf
PWC	https://paperswithcode.com/paper/understanding-artificial-intelligence-ethics
Repo
Framework

Encoding Invariances in Deep Generative Models


Title	Encoding Invariances in Deep Generative Models
Authors	Viraj Shah, Ameya Joshi, Sambuddha Ghosal, Balaji Pokuri, Soumik Sarkar, Baskar Ganapathysubramanian, Chinmay Hegde
Abstract	Reliable training of generative adversarial networks (GANs) typically requires massive datasets in order to model complicated distributions. However, in several applications, training samples obey invariances that are known a priori; for example, in complex physics simulations, the training data obey universal laws encoded as well-defined mathematical equations. In this paper, we propose a new generative modeling approach, InvNet, that can efficiently model data spaces with known invariances. We devise an adversarial training algorithm to encode them into the data distribution. We validate our framework in three experimental settings: generating images with fixed motifs; solving nonlinear partial differential equations (PDEs); and reconstructing two-phase microstructures with desired statistical properties. We complement our experiments with several theoretical results.
Tasks
Published	2019-06-04
URL	https://arxiv.org/abs/1906.01626v1
PDF	https://arxiv.org/pdf/1906.01626v1.pdf
PWC	https://paperswithcode.com/paper/encoding-invariances-in-deep-generative
Repo
Framework