July 28, 2019

Paper Group ANR 330

Discover and Learn New Objects from Documentaries. Data-Dependent Stability of Stochastic Gradient Descent. Minimax Rates and Efficient Algorithms for Noisy Sorting. On the Consistency of $k$-means++ algorithm. Role of Deep LSTM Neural Networks And WiFi Networks in Support of Occupancy Prediction in Smart Buildings. Enabling Multi-Source Neural Mac …

Discover and Learn New Objects from Documentaries


Title	Discover and Learn New Objects from Documentaries
Authors	Kai Chen, Hang Song, Chen Change Loy, Dahua Lin
Abstract	Despite the remarkable progress in recent years, detecting objects in a new context remains a challenging task. Detectors learned from a public dataset can only work with a fixed list of categories, while training from scratch usually requires a large amount of training data with detailed annotations. This work aims to explore a novel approach – learning object detectors from documentary films in a weakly supervised manner. This is inspired by the observation that documentaries often provide dedicated exposition of certain object categories, where visual presentations are aligned with subtitles. We believe that object detectors can be learned from such a rich source of information. Towards this goal, we develop a joint probabilistic framework, where individual pieces of information, including video frames and subtitles, are brought together via both visual and linguistic links. On top of this formulation, we further derive a weakly supervised learning algorithm, where object model learning and training set mining are unified in an optimization procedure. Experimental results on a real world dataset demonstrate that this is an effective approach to learning new object detectors.
Tasks
Published	2017-07-30
URL	http://arxiv.org/abs/1707.09593v1
PDF	http://arxiv.org/pdf/1707.09593v1.pdf
PWC	https://paperswithcode.com/paper/discover-and-learn-new-objects-from
Repo
Framework

Data-Dependent Stability of Stochastic Gradient Descent


Title	Data-Dependent Stability of Stochastic Gradient Descent
Authors	Ilja Kuzborskij, Christoph H. Lampert
Abstract	We establish a data-dependent notion of algorithmic stability for Stochastic Gradient Descent (SGD), and employ it to develop novel generalization bounds. This is in contrast to previous distribution-free algorithmic stability results for SGD which depend on the worst-case constants. By virtue of the data-dependent argument, our bounds provide new insights into learning with SGD on convex and non-convex problems. In the convex case, we show that the bound on the generalization error depends on the risk at the initialization point. In the non-convex case, we prove that the expected curvature of the objective function around the initialization point has crucial influence on the generalization error. In both cases, our results suggest a simple data-driven strategy to stabilize SGD by pre-screening its initialization. As a corollary, our results allow us to show optimistic generalization bounds that exhibit fast convergence rates for SGD subject to a vanishing empirical risk and low noise of stochastic gradient.
Tasks
Published	2017-03-05
URL	http://arxiv.org/abs/1703.01678v4
PDF	http://arxiv.org/pdf/1703.01678v4.pdf
PWC	https://paperswithcode.com/paper/data-dependent-stability-of-stochastic
Repo
Framework

Minimax Rates and Efficient Algorithms for Noisy Sorting


Title	Minimax Rates and Efficient Algorithms for Noisy Sorting
Authors	Cheng Mao, Jonathan Weed, Philippe Rigollet
Abstract	There has been a recent surge of interest in studying permutation-based models for ranking from pairwise comparison data. Despite being structurally richer and more robust than parametric ranking models, permutation-based models are less well understood statistically and generally lack efficient learning algorithms. In this work, we study a prototype of permutation-based ranking models, namely, the noisy sorting model. We establish the optimal rates of learning the model under two sampling procedures. Furthermore, we provide a fast algorithm to achieve near-optimal rates if the observations are sampled independently. Along the way, we discover properties of the symmetric group which are of theoretical interest.
Tasks
Published	2017-10-28
URL	http://arxiv.org/abs/1710.10388v1
PDF	http://arxiv.org/pdf/1710.10388v1.pdf
PWC	https://paperswithcode.com/paper/minimax-rates-and-efficient-algorithms-for
Repo
Framework

On the Consistency of $k$-means++ algorithm


Title	On the Consistency of $k$-means++ algorithm
Authors	Mieczysław A. Kłopotek
Abstract	We prove in this paper that the expected value of the objective function of the $k$-means++ algorithm for samples converges to the population expected value. Since $k$-means++ provides a constant-factor approximation of the $k$-means objective on samples, such an approximation can also be achieved for the population as the sample size increases. This result is of potential practical relevance when one considers subsampling to cluster large data sets (large databases).
Tasks
Published	2017-02-20
URL	http://arxiv.org/abs/1702.06120v1
PDF	http://arxiv.org/pdf/1702.06120v1.pdf
PWC	https://paperswithcode.com/paper/on-the-consistency-of-k-means-algorithm
Repo
Framework
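
The setting in this abstract (run $k$-means++ on a subsample, then ask how the resulting objective compares with the objective on the full data) can be illustrated with a small sketch. This is not code from the paper: it uses the standard $D^2$-sampling seeding rule of $k$-means++, and the synthetic three-cluster data, the sample size of 60, and the helper names `sq_dist`, `kmeanspp_seed` and `kmeans_cost` are all assumptions made for illustration.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeanspp_seed(points, k, rng):
    """Choose k centers with the standard k-means++ D^2-sampling rule."""
    centers = [rng.choice(points)]  # first center: uniform at random
    while len(centers) < k:
        # Squared distance from every point to its nearest chosen center.
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(d2) == 0:
            # Degenerate case: every point already coincides with a center.
            centers.append(rng.choice(points))
        else:
            # Sample the next center with probability proportional to D^2.
            centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers

def kmeans_cost(points, centers):
    """k-means objective: sum of squared distances to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

# Illustrative data (an assumption, not from the paper): three Gaussian blobs.
rng = random.Random(0)
population = [
    (rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
    for cx, cy in [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
    for _ in range(200)
]
sample = rng.sample(population, 60)

# Seed on the subsample, then score the same centers on both sets.
seeds = kmeanspp_seed(sample, 3, rng)
sample_cost = kmeans_cost(sample, seeds) / len(sample)
population_cost = kmeans_cost(population, seeds) / len(population)
print(sample_cost, population_cost)
```

Dividing each cost by the number of points is a convenience so that the two values are on the same scale; the paper's precise normalization is not stated in the abstract.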

Role of Deep LSTM Neural Networks And WiFi Networks in Support of Occupancy Prediction in Smart Buildings


Title	Role of Deep LSTM Neural Networks And WiFi Networks in Support of Occupancy Prediction in Smart Buildings
Authors	Basheer Qolomany, Ala Al-Fuqaha, Driss Benhaddou, Ajay Gupta
Abstract	Knowing how many people occupy a building, and where they are located, is a key component of smart building services. Commercial, industrial and residential buildings often incorporate systems used to determine occupancy. However, relatively simple sensor technology and control algorithms limit the effectiveness of smart building services. In this paper we propose to replace sensor technology with time series models that can predict the number of occupants at a given location and time. We use Wi-Fi data sets, which are readily available in abundance for smart building services, to train Autoregressive Integrated Moving Average (ARIMA) models and Long Short-Term Memory (LSTM) time series models. As a use case scenario of smart building services, these models allow forecasting of the number of people at a given time and location at 15-, 30- and 60-minute time intervals, at the building level as well as the Access Point (AP) level. For LSTM, we build our models in two ways: a separate model for every time scale, and a combined model for the three time scales. Our experiments show that the combined LSTM model reduced the computational resources, measured by the number of neurons, by 74.48% at the AP level and by 67.13% at the building level. Further, the root mean square error (RMSE) was reduced by 88.2% to 93.4% for LSTM in comparison to ARIMA for the building-level models and by 80.9% to 87% for the AP-level models.
Tasks	Time Series
Published	2017-11-28
URL	http://arxiv.org/abs/1711.10355v1
PDF	http://arxiv.org/pdf/1711.10355v1.pdf
PWC	https://paperswithcode.com/paper/role-of-deep-lstm-neural-networks-and-wifi
Repo
Framework
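
The abstract reports results as percentage reductions in RMSE relative to ARIMA. A minimal sketch of how such a figure is typically computed follows; the occupancy counts and both forecasts are made-up illustrative values, not data from the paper, and the helper names `rmse` and `relative_reduction` are hypothetical.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )

def relative_reduction(baseline_error, model_error):
    """Percentage by which model_error improves on baseline_error."""
    return 100.0 * (baseline_error - model_error) / baseline_error

# Made-up occupancy counts and forecasts, purely for illustration.
observed = [12, 15, 20, 18, 9]
baseline_forecast = [20, 8, 28, 10, 17]   # stands in for a weaker model
improved_forecast = [13, 14, 21, 17, 10]  # stands in for a stronger model

baseline_rmse = rmse(observed, baseline_forecast)
improved_rmse = rmse(observed, improved_forecast)
print(relative_reduction(baseline_rmse, improved_rmse))
```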

Enabling Multi-Source Neural Machine Translation By Concatenating Source Sentences In Multiple Languages


Title	Enabling Multi-Source Neural Machine Translation By Concatenating Source Sentences In Multiple Languages
Authors	Raj Dabre, Fabien Cromieres, Sadao Kurohashi
Abstract	In this paper, we explore a simple solution to “Multi-Source Neural Machine Translation” (MSNMT) which relies only on preprocessing an N-way multilingual corpus, without modifying the Neural Machine Translation (NMT) architecture or training procedure. We simply concatenate the source sentences to form a single long multi-source input sentence, keep the target-side sentence as it is, and train an NMT system on this preprocessed corpus. We evaluate our method in resource-poor as well as resource-rich settings and show its effectiveness (up to 4 BLEU using 2 source languages and up to 6 BLEU using 5 source languages). We also compare against existing methods for MSNMT and show that our solution gives competitive results despite its simplicity. We also provide some insights into how the NMT system leverages multilingual information in such a scenario by visualizing attention.
Tasks	Machine Translation
Published	2017-02-20
URL	http://arxiv.org/abs/1702.06135v4
PDF	http://arxiv.org/pdf/1702.06135v4.pdf
PWC	https://paperswithcode.com/paper/enabling-multi-source-neural-machine
Repo
Framework
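
The preprocessing step described in the abstract, joining the parallel source sentences into one long input while leaving the target untouched, might look roughly like the sketch below. The function name `concat_sources`, the example sentences, and the choice of a plain space as the joiner are assumptions; the abstract does not say whether the authors insert separator or language tokens.

```python
def concat_sources(sources_by_lang, lang_order):
    """Join line-aligned source sentences into one multi-source input line.

    sources_by_lang maps a language code to a list of sentences, where
    index i in every list refers to the same target sentence.
    """
    n = len(sources_by_lang[lang_order[0]])
    for lang in lang_order:
        if len(sources_by_lang[lang]) != n:
            raise ValueError(f"corpus for {lang!r} is not line-aligned")
    # Assumption: sentences are joined with a single space, in a fixed
    # language order, and no separator token is added.
    return [
        " ".join(sources_by_lang[lang][i] for lang in lang_order)
        for i in range(n)
    ]

# Tiny illustrative 3-way corpus (French and German sources, English target).
sources = {
    "fr": ["bonjour le monde", "merci beaucoup"],
    "de": ["hallo welt", "vielen dank"],
}
targets = ["hello world", "thank you very much"]

multi_source_inputs = concat_sources(sources, ["fr", "de"])
training_pairs = list(zip(multi_source_inputs, targets))
print(training_pairs[0])
```

Keeping the language order fixed across the whole corpus is a design choice made here so that each source always appears in the same position of the concatenated input.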

Scaling Properties of Human Brain Functional Networks


Title	Scaling Properties of Human Brain Functional Networks
Authors	Riccardo Zucca, Xerxes D. Arsiwalla, Hoang Le, Mikail Rubinov, Paul Verschure
Abstract	We investigate scaling properties of human brain functional networks in the resting-state. Analyzing network degree distributions, we statistically test whether their tails scale as power-law or not. Initial studies, based on least-squares fitting, were shown to be inadequate for precise estimation of power-law distributions. Subsequently, methods based on maximum-likelihood estimators have been proposed and applied to address this question. Nevertheless, no clear consensus has emerged, mainly because results have shown substantial variability depending on the data-set used or its resolution. In this study, we work with high-resolution data (10K nodes) from the Human Connectome Project and take into account network weights. We test for the power-law, exponential, log-normal and generalized Pareto distributions. Our results show that the statistics generally do not support a power-law, but instead these degree distributions tend towards the thin-tail limit of the generalized Pareto model. This may have implications for the number of hubs in human brain functional networks.
Tasks
Published	2017-02-02
URL	http://arxiv.org/abs/1702.00768v1
PDF	http://arxiv.org/pdf/1702.00768v1.pdf
PWC	https://paperswithcode.com/paper/scaling-properties-of-human-brain-functional
Repo
Framework
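
The abstract refers to maximum-likelihood estimators for power-law tails. As a hedged illustration, the sketch below uses the standard continuous power-law MLE for the exponent, $\hat{\alpha} = 1 + n / \sum_i \ln(x_i / x_{\min})$, and checks it on synthetic data. It is not the paper's pipeline: the synthetic sample, the fixed `xmin`, and the helper names are assumptions, and the goodness-of-fit tests and alternative distributions mentioned in the abstract are not covered.

```python
import math
import random

def powerlaw_alpha_mle(values, xmin):
    """Continuous power-law exponent MLE on the tail values >= xmin."""
    tail = [x for x in values if x >= xmin]
    log_sum = sum(math.log(x / xmin) for x in tail)
    if log_sum == 0:
        raise ValueError("need at least one tail value strictly above xmin")
    return 1.0 + len(tail) / log_sum

def sample_powerlaw(alpha, xmin, n, rng):
    """Draw n values from p(x) ~ x^(-alpha), x >= xmin, by inverse transform."""
    return [
        xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
        for _ in range(n)
    ]

# Synthetic check (an assumption for illustration): recover a known exponent.
rng = random.Random(42)
true_alpha = 2.5
data = sample_powerlaw(true_alpha, xmin=1.0, n=20000, rng=rng)
estimate = powerlaw_alpha_mle(data, xmin=1.0)
print(estimate)
```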

Irregular Convolutional Neural Networks


Title	Irregular Convolutional Neural Networks
Authors	Jiabin Ma, Wei Wang, Liang Wang
Abstract	Convolutional kernels are basic and vital components of deep Convolutional Neural Networks (CNN). In this paper, we equip convolutional kernels with shape attributes to generate the deep Irregular Convolutional Neural Networks (ICNN). Compared to traditional CNN applying regular convolutional kernels like ${3\times3}$, our approach trains irregular kernel shapes to better fit the geometric variations of input features. In other words, shapes are learnable parameters in addition to weights. The kernel shapes and weights are learned simultaneously during end-to-end training with the standard back-propagation algorithm. Experiments for semantic segmentation are implemented to validate the effectiveness of our proposed ICNN.
Tasks	Semantic Segmentation
Published	2017-06-24
URL	http://arxiv.org/abs/1706.07966v1
PDF	http://arxiv.org/pdf/1706.07966v1.pdf
PWC	https://paperswithcode.com/paper/irregular-convolutional-neural-networks
Repo
Framework

Integration of Japanese Papers Into the DBLP Data Set


Title	Integration of Japanese Papers Into the DBLP Data Set
Authors	Paul Christian Sommerhoff
Abstract	If someone is looking for a certain publication in the field of computer science, the searching person is likely to use the DBLP to find the desired publication. The DBLP data set is continuously extended with new publications, or rather their metadata, for example the names of involved authors, the title and the publication date. While the size of the data set is already remarkable, specific areas can still be improved. The DBLP offers a huge collection of English papers because most papers concerning computer science are published in English. Nevertheless, there are official publications in other languages which are supposed to be added to the data set. One kind of these are Japanese papers. This diploma thesis will show a way to automatically process publication lists of Japanese papers and to make them ready for an import into the DBLP data set. Especially important are the problems along the way of processing, such as transcription handling and Personal Name Matching with Japanese names.
Tasks
Published	2017-09-26
URL	http://arxiv.org/abs/1709.09119v1
PDF	http://arxiv.org/pdf/1709.09119v1.pdf
PWC	https://paperswithcode.com/paper/integration-of-japanese-papers-into-the-dblp
Repo
Framework

Survey on Models and Techniques for Root-Cause Analysis


Title	Survey on Models and Techniques for Root-Cause Analysis
Authors	Marc Solé, Victor Muntés-Mulero, Annie Ibrahim Rana, Giovani Estrada
Abstract	Automation and computer intelligence to support complex human decisions are becoming essential for managing large and distributed systems in the Cloud and IoT era. Understanding the root cause of an observed symptom in a complex system has been a major problem for decades. As industry dives into the IoT world and the amount of data generated per year grows at an amazing speed, an important question is how to find appropriate mechanisms for determining root causes that can handle huge amounts of data or provide valuable feedback in real time. While many survey papers aim at summarizing the landscape of techniques for modelling system behavior and inferring the root cause of a problem based on the resulting models, none of them focuses on analyzing how the different techniques in the literature fit growing requirements in terms of performance and scalability. In this survey, we provide a review of root-cause analysis, focusing on these particular aspects. We also provide guidance for choosing the best root-cause analysis strategy depending on the requirements of a particular system and application.
Tasks
Published	2017-01-30
URL	http://arxiv.org/abs/1701.08546v2
PDF	http://arxiv.org/pdf/1701.08546v2.pdf
PWC	https://paperswithcode.com/paper/survey-on-models-and-techniques-for-root
Repo
Framework

Generic LSH Families for the Angular Distance Based on Johnson-Lindenstrauss Projections and Feature Hashing LSH


Title	Generic LSH Families for the Angular Distance Based on Johnson-Lindenstrauss Projections and Feature Hashing LSH
Authors	Luis Argerich, Natalia Golmar
Abstract	In this paper we propose the creation of generic LSH families for the angular distance based on Johnson-Lindenstrauss projections. We show that feature hashing is a valid J-L projection and propose two new LSH families based on feature hashing. These new LSH families are tested on both synthetic and real datasets with very good results and a considerable performance improvement over other LSH families. While the theoretical analysis is done for the angular distance, these families can also be used in practice for the Euclidean distance with excellent results [2]. Our tests using real datasets show that the proposed LSH functions work well for the Euclidean distance.
Tasks
Published	2017-04-15
URL	http://arxiv.org/abs/1704.04684v1
PDF	http://arxiv.org/pdf/1704.04684v1.pdf
PWC	https://paperswithcode.com/paper/generic-lsh-families-for-the-angular-distance
Repo
Framework
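
To make the two ingredients named in the abstract concrete, the sketch below applies feature hashing (the hashing trick with a bucket hash and a sign hash) to a vector and then takes the sign of each projected coordinate, as in the classic sign-of-projection scheme for angular distance. This is only one plausible way to combine them and is not claimed to match either of the paper's two LSH families; the function names, the use of `hashlib.blake2b` as the hash, and the example vector are assumptions.

```python
import hashlib

def _hash_int(key):
    """Deterministic 64-bit integer hash of a string key."""
    digest = hashlib.blake2b(key.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")

def feature_hash(vec, m, seed=0):
    """Project vec into m dimensions with the hashing trick.

    Each input coordinate i is added to one bucket, multiplied by a
    pseudo-random sign, so the map is linear in vec.
    """
    out = [0.0] * m
    for i, x in enumerate(vec):
        bucket = _hash_int(f"bucket:{seed}:{i}") % m
        sign = 1.0 if _hash_int(f"sign:{seed}:{i}") % 2 == 0 else -1.0
        out[bucket] += sign * x
    return out

def sign_signature(vec, m, seed=0):
    """Bit signature: one bit per projected coordinate (1 if >= 0)."""
    return tuple(1 if v >= 0 else 0 for v in feature_hash(vec, m, seed))

# Illustrative vector; positive rescaling leaves the direction, and
# therefore the signature, unchanged.
v = [0.5, -1.25, 3.0, 0.0, 2.0, -0.75]
signature = sign_signature(v, m=4)
scaled_signature = sign_signature([2.0 * x for x in v], m=4)
print(signature, scaled_signature)
```

Because the projection is linear, the signature depends only on the direction of the input vector, which is the property an angular-distance hash needs.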

Model-Based Clustering of Nonparametric Weighted Networks with Application to Water Pollution Analysis


Title	Model-Based Clustering of Nonparametric Weighted Networks with Application to Water Pollution Analysis
Authors	Amal Agarwal, Lingzhou Xue
Abstract	Water pollution is a major global environmental problem, and it poses a great environmental risk to public health and biological diversity. This work is motivated by assessing the potential environmental threat of coal mining through increased sulfate concentrations in river networks, which do not follow any simple parametric distribution. However, existing network models mainly focus on binary or discrete networks and weighted networks with known parametric weight distributions. We propose a principled nonparametric weighted network model based on exponential-family random graph models and local likelihood estimation and study its model-based clustering with application to large-scale water pollution network analysis. We do not require any parametric distribution assumption on network weights. The proposed method greatly extends the methodology and applicability of statistical network models. Furthermore, it is scalable to large and complex networks in large-scale environmental studies. The power of our proposed methods is demonstrated in simulation studies and a real application to sulfate pollution network analysis in the Ohio watershed located in Pennsylvania, United States.
Tasks
Published	2017-12-21
URL	https://arxiv.org/abs/1712.07800v2
PDF	https://arxiv.org/pdf/1712.07800v2.pdf
PWC	https://paperswithcode.com/paper/model-based-clustering-of-nonparametric
Repo
Framework

Learning of Colors from Color Names: Distribution and Point Estimation


Title	Learning of Colors from Color Names: Distribution and Point Estimation
Authors	Lyndon White, Roberto Togneri, Wei Liu, Mohammed Bennamoun
Abstract	Color names are often made up of multiple words. As a task in natural language understanding we investigate in depth the capacity of neural networks based on sums of word embeddings (SOWE), recurrence (LSTM and GRU based RNNs) and convolution (CNN), to estimate colors from sequences of terms. We consider both point and distribution estimates of color. We argue that the latter has a particular value as there is no clear agreement between people as to what a particular color describes – different people have a different idea of what it means to be “very dark orange”, for example. Surprisingly, despite its simplicity, the sum of word embeddings generally performs the best on almost all evaluations.
Tasks	Word Embeddings
Published	2017-09-27
URL	https://arxiv.org/abs/1709.09360v3
PDF	https://arxiv.org/pdf/1709.09360v3.pdf
PWC	https://paperswithcode.com/paper/learning-distributions-of-meant-color
Repo
Framework

Counterfactual Control for Free from Generative Models


Title	Counterfactual Control for Free from Generative Models
Authors	Nicholas Guttenberg, Yen Yu, Ryota Kanai
Abstract	We introduce a method by which a generative model learning the joint distribution between actions and future states can be used to automatically infer a control scheme for any desired reward function, which may be altered on the fly without retraining the model. In this method, the problem of action selection is reduced to one of gradient descent on the latent space of the generative model, with the model itself providing the means of evaluating outcomes and finding the gradient, much like how the reward network in Deep Q-Networks (DQN) provides gradient information for the action generator. Unlike DQN or Actor-Critic, which are conditional models for a specific reward, using a generative model of the full joint distribution permits the reward to be changed on the fly. In addition, the generated futures can be inspected to gain insight into what the network ‘thinks’ will happen, and into what went wrong when the outcomes deviate from prediction.
Tasks
Published	2017-02-22
URL	http://arxiv.org/abs/1702.06676v2
PDF	http://arxiv.org/pdf/1702.06676v2.pdf
PWC	https://paperswithcode.com/paper/counterfactual-control-for-free-from
Repo
Framework

On the ERM Principle with Networked Data


Title	On the ERM Principle with Networked Data
Authors	Yuanhong Wang, Yuyi Wang, Xingwu Liu, Juhua Pu
Abstract	Networked data, in which every training example involves two objects and may share some common objects with others, is used in many machine learning tasks such as learning to rank and link prediction. A challenge of learning from networked examples is that target values are not known for some pairs of objects. In this case, neither the classical i.i.d. assumption nor techniques based on complete U-statistics can be used. Most existing theoretical results for this problem only deal with the classical empirical risk minimization (ERM) principle that always weights every example equally, but this strategy leads to unsatisfactory bounds. We consider general weighted ERM and show new universal risk bounds for this problem. These new bounds naturally define an optimization problem which leads to appropriate weights for networked examples. Though this optimization problem is not convex in general, we devise a new fully polynomial-time approximation scheme (FPTAS) to solve it.
Tasks	Learning-To-Rank, Link Prediction
Published	2017-11-12
URL	http://arxiv.org/abs/1711.04297v2
PDF	http://arxiv.org/pdf/1711.04297v2.pdf
PWC	https://paperswithcode.com/paper/on-the-erm-principle-with-networked-data
Repo
Framework