October 16, 2019


Paper Group ANR 1137
Recognizing Overlapped Speech in Meetings: A Multichannel Separation Approach Using Neural Networks

Title Recognizing Overlapped Speech in Meetings: A Multichannel Separation Approach Using Neural Networks
Authors Takuya Yoshioka, Hakan Erdogan, Zhuo Chen, Xiong Xiao, Fil Alleva
Abstract The goal of this work is to develop a meeting transcription system that can recognize speech even when utterances of different speakers are overlapped. While speech overlaps have been regarded as a major obstacle in accurately transcribing meetings, a traditional beamformer with a single output has been exclusively used because previously proposed speech separation techniques have critical constraints for application to real meetings. This paper proposes a new signal processing module, called an unmixing transducer, and describes its implementation using a windowed BLSTM. The unmixing transducer has a fixed number, say J, of output channels, where J may be different from the number of meeting attendees, and transforms an input multi-channel acoustic signal into J time-synchronous audio streams. Each utterance in the meeting is separated and emitted from one of the output channels. Then, each output signal can be simply fed to a speech recognition back-end for segmentation and transcription. Our meeting transcription system using the unmixing transducer outperforms a system based on a state-of-the-art neural mask-based beamformer by 10.8%. Significant improvements are observed in overlapped segments. To the best of our knowledge, this is the first report that applies overlapped speech recognition to unconstrained real meeting audio.
Tasks Speech Recognition, Speech Separation
Published 2018-10-08
URL http://arxiv.org/abs/1810.03655v1
PDF http://arxiv.org/pdf/1810.03655v1.pdf
PWC https://paperswithcode.com/paper/recognizing-overlapped-speech-in-meetings-a
Repo
Framework
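The unmixing transducer's output convention — a fixed number J of channels, each utterance emitted whole on exactly one channel, with overlapping utterances never sharing a channel — can be illustrated at the segment-metadata level with a greedy interval assignment. This is a hypothetical sketch, not the authors' code; `assign_channels` and its signature are invented for illustration.

```python
def assign_channels(utterances, num_channels):
    """Greedily place (start, end) utterances onto channels so that
    temporally overlapping utterances land on different channels."""
    free_at = [0.0] * num_channels           # time at which each channel falls silent
    assignment = {}
    for start, end in sorted(utterances):
        for ch in range(num_channels):
            if free_at[ch] <= start:         # channel is silent: reuse it
                free_at[ch] = end
                assignment[(start, end)] = ch
                break
        else:
            raise ValueError("more simultaneous speakers than output channels")
    return assignment
```

For example, with J = 2 channels, utterances (0, 5) and (2, 6) overlap and are routed to different channels, while (7, 9) can reuse a freed channel.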

Admissible Abstractions for Near-optimal Task and Motion Planning

Title Admissible Abstractions for Near-optimal Task and Motion Planning
Authors William Vega-Brown, Nicholas Roy
Abstract We define an admissibility condition for abstractions expressed using angelic semantics and show that these conditions allow us to accelerate planning while preserving the ability to find the optimal motion plan. We then derive admissible abstractions for two motion planning domains with continuous state. We extract upper and lower bounds on the cost of concrete motion plans using local metric and topological properties of the problem domain. These bounds guide the search for a plan while maintaining performance guarantees. We show that abstraction can dramatically reduce the complexity of search relative to a direct motion planner. Using our abstractions, we find near-optimal motion plans in planning problems involving $10^{13}$ states without using a separate task planner.
Tasks Motion Planning
Published 2018-06-03
URL http://arxiv.org/abs/1806.00805v1
PDF http://arxiv.org/pdf/1806.00805v1.pdf
PWC https://paperswithcode.com/paper/admissible-abstractions-for-near-optimal-task
Repo
Framework
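The paper's use of admissible lower bounds to guide search while preserving optimality is the same principle behind A* with an admissible heuristic: a bound that never overestimates the true cost guarantees the first solution expanded is optimal. A minimal grid illustration (not the authors' planner; `astar_grid` is invented for this sketch):

```python
import heapq

def astar_grid(start, goal, blocked, size):
    """A* on a 4-connected grid. Manhattan distance is an admissible lower
    bound on the true cost, so the first expansion of `goal` is optimal."""
    def h(p):                                # admissible heuristic
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    frontier = [(h(start), 0, start)]
    best = {start: 0}
    while frontier:
        _, g, (x, y) = heapq.heappop(frontier)
        if (x, y) == goal:
            return g                         # optimal cost
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < size and 0 <= ny < size and (nx, ny) not in blocked:
                if g + 1 < best.get((nx, ny), float("inf")):
                    best[(nx, ny)] = g + 1
                    heapq.heappush(frontier, (g + 1 + h((nx, ny)), g + 1, (nx, ny)))
    return None
```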

Scalable Bayesian Learning for State Space Models using Variational Inference with SMC Samplers

Title Scalable Bayesian Learning for State Space Models using Variational Inference with SMC Samplers
Authors Marcel Hirt, Petros Dellaportas
Abstract We present a scalable approach to performing approximate fully Bayesian inference in generic state space models. The proposed method is an alternative to particle MCMC that provides fully Bayesian inference of both the dynamic latent states and the static parameters of the model. We build on recent advances in computational statistics that combine variational methods with sequential Monte Carlo sampling, and we demonstrate the advantages of performing full Bayesian inference over the static parameters rather than just variational EM approximations. We illustrate how our approach enables scalable inference in multivariate stochastic volatility models and in self-exciting point process models that allow for flexible dynamics in the latent intensity function.
Tasks Bayesian Inference
Published 2018-05-23
URL http://arxiv.org/abs/1805.09406v3
PDF http://arxiv.org/pdf/1805.09406v3.pdf
PWC https://paperswithcode.com/paper/scalable-bayesian-learning-for-state-space
Repo
Framework
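The sequential Monte Carlo machinery the method builds on can be illustrated with a bootstrap particle filter, which returns an estimate of the log marginal likelihood (the likelihood estimate itself is unbiased) — the quantity such variational-SMC approaches optimize. A toy sketch, not the authors' code; the AR(1) model and `bootstrap_log_likelihood` are assumptions for illustration:

```python
import math, random

def bootstrap_log_likelihood(ys, n_particles=500, phi=0.9, sigma=1.0, seed=0):
    """Bootstrap particle filter estimate of log p(y_{1:T}) for a toy model:
    x_t = phi * x_{t-1} + N(0, sigma^2), observed as y_t = x_t + N(0, 1)."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, sigma) for _ in range(n_particles)]
    log_z = 0.0
    for y in ys:
        # weight each particle by the observation density N(y; x, 1)
        ws = [math.exp(-0.5 * (y - x) ** 2) / math.sqrt(2 * math.pi) for x in xs]
        log_z += math.log(sum(ws) / n_particles)   # marginal-likelihood factor
        # multinomial resampling, then propagate through the transition
        xs = rng.choices(xs, weights=ws, k=n_particles)
        xs = [phi * x + rng.gauss(0.0, sigma) for x in xs]
    return log_z
```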

Incremental Predictive Process Monitoring: How to Deal with the Variability of Real Environments

Title Incremental Predictive Process Monitoring: How to Deal with the Variability of Real Environments
Authors Chiara Di Francescomarino, Chiara Ghidini, Fabrizio Maria Maggi, Williams Rizzi, Cosimo Damiano Persia
Abstract A characteristic of existing predictive process monitoring techniques is to first construct a predictive model based on past process executions and then use it to predict the future of new ongoing cases, without the possibility of updating it with new cases when they complete their execution. This can make predictive process monitoring too rigid to deal with the variability of processes working in real environments, which continuously evolve and/or exhibit new variant behaviors over time. As a solution to this problem, we propose the use of algorithms that allow the incremental construction of the predictive model. These incremental learning algorithms update the model whenever new cases become available, so that the predictive model evolves over time to fit the current circumstances. The algorithms have been implemented using different case encoding strategies and evaluated on a number of real and synthetic datasets. The results provide first evidence of the potential of incremental learning strategies for predictive process monitoring in real environments, and of the impact of different case encoding strategies in this setting.
Tasks
Published 2018-04-11
URL http://arxiv.org/abs/1804.03967v1
PDF http://arxiv.org/pdf/1804.03967v1.pdf
PWC https://paperswithcode.com/paper/incremental-predictive-process-monitoring-how
Repo
Framework
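The incremental idea — update the model whenever a completed case arrives, instead of retraining from scratch — can be sketched with a frequency-based predictor over activity prefixes. This is a hypothetical illustration, not one of the paper's encodings:

```python
from collections import Counter, defaultdict

class IncrementalPredictor:
    """Frequency-based outcome predictor over activity-prefix encodings:
    every completed case immediately updates the model in place."""
    def __init__(self):
        self.counts = defaultdict(Counter)   # prefix -> outcome frequencies

    def update(self, trace, outcome):
        """Fold a completed case into the model, one prefix at a time."""
        for i in range(1, len(trace) + 1):
            self.counts[tuple(trace[:i])][outcome] += 1

    def predict(self, prefix):
        """Majority-vote outcome for an ongoing case, None if prefix unseen."""
        c = self.counts.get(tuple(prefix))
        return c.most_common(1)[0][0] if c else None
```

As new variant behavior accumulates, the prediction for the same prefix can change — which is exactly the adaptivity the abstract argues for.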

Attentional Road Safety Networks

Title Attentional Road Safety Networks
Authors Sonu Gupta, Deepak Srivatsav, A. V. Subramanyam, Ponnurangam Kumaraguru
Abstract Road safety mapping using satellite images is a cost-effective but challenging problem for smart city planning. The scarcity of labeled data, misalignment and ambiguity make it hard for supervised deep networks to learn efficient embeddings that classify between safe and dangerous road segments. In this paper, we address these challenges using a region-guided attention network. In our model, we extract global features from a base network and augment them with local features obtained using the region-guided attention network. In addition, we perform domain adaptation for unlabeled target data. To bridge the gap between safe and dangerous samples from the source and target domains respectively, we propose a loss function based on within- and between-class covariance matrices. We conduct experiments on a public dataset of London and show that the algorithm achieves a classification accuracy of 86.21%. We obtain a 4% accuracy increase for NYC using the domain adaptation network. Besides, we perform a user study and demonstrate that our proposed algorithm achieves 23.12% better accuracy compared to subjective analysis.
Tasks Domain Adaptation
Published 2018-12-12
URL http://arxiv.org/abs/1812.04860v2
PDF http://arxiv.org/pdf/1812.04860v2.pdf
PWC https://paperswithcode.com/paper/attentional-road-safety-networks
Repo
Framework
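The proposed loss contrasts within-class and between-class covariance. A toy scalar version — traces of the scatter matrices, with invented function names — shows the intended behavior: tight, well-separated classes yield a lower value than overlapping ones.

```python
def scatter_trace(points, center):
    """Trace of the scatter (unnormalised covariance) matrix: the sum of
    squared distances of the points to the given center."""
    return sum(sum((p[d] - center[d]) ** 2 for d in range(len(center)))
               for p in points)

def covariance_gap_loss(safe, dangerous):
    """Toy within-minus-between class covariance objective: small when each
    class is tight and the two class means are far apart."""
    mean = lambda pts: [sum(p[d] for p in pts) / len(pts)
                        for d in range(len(pts[0]))]
    m_s, m_d = mean(safe), mean(dangerous)
    within = scatter_trace(safe, m_s) + scatter_trace(dangerous, m_d)
    m_all = mean(safe + dangerous)
    between = (len(safe) * scatter_trace([m_s], m_all)
               + len(dangerous) * scatter_trace([m_d], m_all))
    return within - between
```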

Gatekeeping Algorithms with Human Ethical Bias: The ethics of algorithms in archives, libraries and society

Title Gatekeeping Algorithms with Human Ethical Bias: The ethics of algorithms in archives, libraries and society
Authors Martijn van Otterlo
Abstract In the age of algorithms, I focus on the question of how to ensure that the algorithms which will take over many of our familiar archival and library tasks will behave according to human ethical norms that have evolved over many years. I start by characterizing physical archives in the context of related institutions such as libraries and museums. In this setting I analyze how ethical principles, in particular about access to information, have been formalized and communicated in the form of ethical codes, or codes of conduct. After that I describe two main developments: digitalization, in which physical aspects of the world are turned into digital data, and algorithmization, in which intelligent computer programs turn this data into predictions and decisions. Both affect interactions that were once physical but are now digital. In this new setting I survey and analyze the ethical aspects of algorithms and how they shape a vision of the future of archivists and librarians, in the form of algorithmic documentalists, or codementalists. Finally I outline a general research strategy, called IntERMEeDIUM, to obtain algorithms that obey our human ethical values as encoded in codes of ethics.
Tasks
Published 2018-01-05
URL http://arxiv.org/abs/1801.01705v1
PDF http://arxiv.org/pdf/1801.01705v1.pdf
PWC https://paperswithcode.com/paper/gatekeeping-algorithms-with-human-ethical
Repo
Framework

Learning Deep Context-Network Architectures for Image Annotation

Title Learning Deep Context-Network Architectures for Image Annotation
Authors Mingyuan Jiu, Hichem Sahbi
Abstract Context plays an important role in visual pattern recognition as it provides complementary clues for different learning tasks including image classification and annotation. In the particular scenario of kernel learning, the general recipe of context-based kernel design consists in learning positive semi-definite similarity functions that return high values not only when data share similar content but also similar context. However, in spite of having a positive impact on performance, the use of context in these kernel design methods has not been fully explored; indeed, context has been handcrafted instead of being learned. In this paper, we introduce a novel context-aware kernel design framework based on deep learning. Our method discriminatively learns spatial geometric context as the weights of a deep network (DN). The architecture of this network is fully determined by the solution of an objective function that mixes content, context and regularization, while the parameters of this network determine the most relevant (discriminant) parts of the learned context. We apply this context and kernel learning framework to image classification on the challenging ImageCLEF Photo Annotation benchmark; extensive experiments on this benchmark corroborate that our deep context learning provides highly effective kernels for image classification.
Tasks Image Classification
Published 2018-03-23
URL http://arxiv.org/abs/1803.08794v1
PDF http://arxiv.org/pdf/1803.08794v1.pdf
PWC https://paperswithcode.com/paper/learning-deep-context-network-architectures
Repo
Framework
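The general recipe — a kernel that scores high when data share similar content and similar context — can be sketched as a content term plus a weighted neighborhood-comparison term. A hypothetical toy, not the paper's learned deep-network kernel:

```python
def context_kernel(x, y, neighbors_x, neighbors_y, content_sim, beta=0.5):
    """Toy context-aware kernel: content similarity of x and y, plus a
    beta-weighted term comparing their neighborhoods (the 'context')."""
    ctx = sum(content_sim(nx, ny) for nx in neighbors_x for ny in neighbors_y)
    ctx /= max(1, len(neighbors_x) * len(neighbors_y))   # average context match
    return content_sim(x, y) + beta * ctx
```

With a dot-product `content_sim`, two identical items whose neighborhoods also agree score higher than the same pair with clashing contexts.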

An Efficient Bandit Algorithm for Realtime Multivariate Optimization

Title An Efficient Bandit Algorithm for Realtime Multivariate Optimization
Authors Daniel N Hill, Houssam Nassif, Yi Liu, Anand Iyer, S V N Vishwanathan
Abstract Optimization is commonly employed to determine the content of web pages, such as to maximize conversions on landing pages or click-through rates on search engine result pages. Often the layout of these pages can be decoupled into several separate decisions. For example, the composition of a landing page may involve deciding which image to show, which wording to use, what color background to display, etc. Such optimization is a combinatorial problem over an exponentially large decision space. Randomized experiments do not scale well to this setting, and therefore, in practice, one is typically limited to optimizing a single aspect of a web page at a time. This represents a missed opportunity in both the speed of experimentation and the exploitation of possible interactions between layout decisions. Here we focus on multivariate optimization of interactive web pages. We formulate an approach where the possible interactions between different components of the page are modeled explicitly. We apply bandit methodology to explore the layout space efficiently and use hill-climbing to select optimal content in realtime. Our algorithm also extends to contextualization and personalization of layout selection. Simulation results show the suitability of our approach to large decision spaces with strong interactions between content. We further apply our algorithm to optimize a message that promotes adoption of an Amazon service. After only a single week of online optimization, we saw a 21% conversion increase compared to the median layout. Our technique is currently being deployed to optimize content across several locations at Amazon.com.
Tasks
Published 2018-10-22
URL http://arxiv.org/abs/1810.09558v1
PDF http://arxiv.org/pdf/1810.09558v1.pdf
PWC https://paperswithcode.com/paper/an-efficient-bandit-algorithm-for-realtime
Repo
Framework
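The combination of a reward model with hill-climbing over layout slots can be sketched as coordinate ascent: fix all components but one, pick the best value for that slot, and sweep. `hill_climb_layout` and its interface below are assumptions for illustration, with `score` standing in for the bandit model's estimated reward of a full layout:

```python
import random

def hill_climb_layout(score, options, sweeps=3, seed=0):
    """Greedy coordinate ascent over a multivariate layout: repeatedly fix
    all slots but one and choose the best value for that slot."""
    rng = random.Random(seed)
    layout = [rng.choice(vals) for vals in options]      # random starting layout
    for _ in range(sweeps):
        for slot, vals in enumerate(options):
            layout[slot] = max(
                vals,
                key=lambda v: score(layout[:slot] + [v] + layout[slot + 1:]))
    return layout
```

Because `score` evaluates the whole layout, interactions between slots (the effect the abstract models explicitly) influence each coordinate step.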

Developing a Purely Visual Based Obstacle Detection using Inverse Perspective Mapping

Title Developing a Purely Visual Based Obstacle Detection using Inverse Perspective Mapping
Authors Julian Nubert, Niklas Funk, Fabio Meier, Fabrice Oehler
Abstract Our solution is implemented in and for the framework of Duckietown. The goal of Duckietown is to provide a relatively simple platform to explore, tackle and solve many problems linked to autonomous driving. Duckietown is simple in the basics but an infinitely expandable environment: from controlling single Duckiebots to complete fleet management, every scenario is possible and can be put into practice. So far, none of the existing modules was capable of reliably detecting obstacles and reacting to them in real time. We faced the general problem of detecting obstacles given images from a monocular RGB camera mounted at the front of our Duckiebot, and of reacting to them properly without crashing or erroneously stopping the Duckiebot. Both the detection and the reaction have to be implemented and have to run on a Raspberry Pi in real time. Due to the strong hardware limitations, we decided not to use any learning algorithms for the obstacle detection part. As it later transpired, a working “hard-coded” solution also needs thorough analysis and understanding of the given problem. In layman’s terms, we simply seek to make Duckietown a safer place.
Tasks Autonomous Driving
Published 2018-09-04
URL http://arxiv.org/abs/1809.01268v1
PDF http://arxiv.org/pdf/1809.01268v1.pdf
PWC https://paperswithcode.com/paper/developing-a-purely-visual-based-obstacle
Repo
Framework
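The geometry behind inverse perspective mapping — intersecting a pixel's viewing ray with a flat ground plane — can be written down directly for a pinhole camera. A hypothetical sketch, not the Duckietown code; parameter names are invented:

```python
import math

def pixel_to_ground(u, v, f, cx, cy, cam_height, pitch):
    """Map an image pixel to ground-plane coordinates under a flat-world
    assumption (the core of inverse perspective mapping). Returns (X, Z)
    in meters ahead of the camera, or None for pixels above the horizon."""
    # viewing-ray direction in camera coordinates (x right, y down, z forward)
    ray = ((u - cx) / f, (v - cy) / f, 1.0)
    # rotate the ray by the camera pitch (tilt down toward the road)
    y = math.cos(pitch) * ray[1] + math.sin(pitch) * ray[2]
    z = -math.sin(pitch) * ray[1] + math.cos(pitch) * ray[2]
    if y <= 0:                      # ray never reaches the ground plane
        return None
    t = cam_height / y              # scale so the ray drops cam_height meters
    return (ray[0] * t, z * t)
```

With zero pitch this reduces to the classic relation Z = f · h / (v − cy): pixels nearer the bottom of the image map to closer ground points.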

A Multimodal Classifier Generative Adversarial Network for Carry and Place Tasks from Ambiguous Language Instructions

Title A Multimodal Classifier Generative Adversarial Network for Carry and Place Tasks from Ambiguous Language Instructions
Authors Aly Magassouba, Komei Sugiura, Hisashi Kawai
Abstract This paper focuses on a multimodal language understanding method for carry-and-place tasks with domestic service robots. We address the case of ambiguous instructions, that is, when the target area is not specified. For instance, “put away the milk and cereal” is a natural instruction in which there is ambiguity regarding the target area, considering environments in daily life. Conventionally, such an instruction can be disambiguated through a dialogue system, but at the cost of time and cumbersome interaction. Instead, we propose a multimodal approach in which the instructions are disambiguated using the robot’s state and environment context. We develop the Multi-Modal Classifier Generative Adversarial Network (MMC-GAN) to predict the likelihood of different target areas considering the robot’s physical limitations and the target clutter. Our approach, MMC-GAN, significantly improves accuracy compared with baseline methods that use instructions only or simple deep neural networks.
Tasks
Published 2018-06-11
URL http://arxiv.org/abs/1806.03847v1
PDF http://arxiv.org/pdf/1806.03847v1.pdf
PWC https://paperswithcode.com/paper/a-multimodal-classifier-generative
Repo
Framework

Using Swarm Optimization To Enhance Autoencoders Images

Title Using Swarm Optimization To Enhance Autoencoders Images
Authors Maisa Doaud, Michael Mayo
Abstract Autoencoders learn data representations through reconstruction. Robust training is the key factor affecting the quality of the learned representations and, consequently, the accuracy of the applications that use them. Previous works suggested methods for deciding the optimal autoencoder configuration that allows for robust training. Nevertheless, improving the accuracy of an already trained autoencoder has received limited, if any, attention. We propose a new approach that improves the quality of a trained autoencoder’s results and answers the following question: given a trained autoencoder, a test image, and a real-parameter optimizer, can we generate a better quality reconstructed image than the one generated by the autoencoder? Our proposed approach combines the decoder part of a trained Restricted Boltzmann Machine-based autoencoder with the Competitive Swarm Optimization algorithm. Experiments show that it is possible to reconstruct images using the trained decoder from randomly initialized representations. Results also show that our approach reconstructs better quality images than the autoencoder in most of the test cases, indicating that the approach can improve the performance of a pre-trained autoencoder when it does not give satisfactory results.
Tasks
Published 2018-07-09
URL http://arxiv.org/abs/1807.03346v1
PDF http://arxiv.org/pdf/1807.03346v1.pdf
PWC https://paperswithcode.com/paper/using-swarm-optimization-to-enhance
Repo
Framework
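The core loop — a competitive swarm optimizer searching a representation space for lower reconstruction loss — can be sketched in a few lines. This is a simplified, hypothetical variant: it omits the mean-position term of the full CSO algorithm, and in the paper's setting `loss` would be the decoder's reconstruction error for a candidate latent code.

```python
import random

def competitive_swarm_opt(loss, dim, swarm=20, iters=50, seed=0):
    """Minimal competitive swarm optimizer: particles are paired at random;
    the loser of each pairwise comparison learns from the winner, while
    winners pass through unchanged. Returns the best position found."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(swarm)]
    vel = [[0.0] * dim for _ in range(swarm)]
    for _ in range(iters):
        order = list(range(swarm))
        rng.shuffle(order)
        for a, b in zip(order[::2], order[1::2]):
            w, l = (a, b) if loss(pos[a]) < loss(pos[b]) else (b, a)
            for d in range(dim):             # loser moves toward the winner
                vel[l][d] = (rng.random() * vel[l][d]
                             + rng.random() * (pos[w][d] - pos[l][d]))
                pos[l][d] += vel[l][d]
    return min(pos, key=loss)
```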

Generic Coreset for Scalable Learning of Monotonic Kernels: Logistic Regression, Sigmoid and more

Title Generic Coreset for Scalable Learning of Monotonic Kernels: Logistic Regression, Sigmoid and more
Authors Elad Tolochinsky, Dan Feldman
Abstract Coreset (or core-set) in this paper is a small weighted \emph{subset} $Q$ of the input set $P$ with respect to a given \emph{monotonic} function $\phi:\mathbb{R}\to\mathbb{R}$ that \emph{provably} approximates its fitting loss $\sum_{p\in P}\phi(p\cdot x)$ for \emph{any} given $x\in\mathbb{R}^d$. Using $Q$ we can obtain an approximation of the $x^*$ that minimizes this loss by running \emph{existing} optimization algorithms on $Q$. We provide: (I) a lower bound proving that there are sets with no coreset smaller than $n=|P|$; (II) a proof that a small coreset of size near-logarithmic in $n$ exists for \emph{any} input $P$, under natural assumptions that hold, e.g., for logistic regression and the sigmoid activation function; (III) a generic algorithm that computes $Q$ in $O(nd+n\log n)$ expected time; (IV) extensive experimental results with open code and benchmarks showing that the coresets are even smaller in practice. Existing papers (e.g. [Huggins, Campbell, Broderick 2016]) suggested only specific coresets for specific input sets.
Tasks
Published 2018-02-21
URL http://arxiv.org/abs/1802.07382v2
PDF http://arxiv.org/pdf/1802.07382v2.pdf
PWC https://paperswithcode.com/paper/generic-coreset-for-scalable-learning-of
Repo
Framework
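Coresets of this kind are typically built by importance sampling with probabilities proportional to per-point sensitivity bounds, then reweighting so that weighted sums remain unbiased. A generic sketch — the sensitivities themselves, which the paper bounds, are taken as given, and the function name is invented:

```python
import random

def sensitivity_coreset(points, sensitivities, size, seed=0):
    """Importance-sample a weighted coreset: each point is drawn with
    probability proportional to its sensitivity bound and reweighted by the
    inverse of that probability, keeping weighted loss sums unbiased."""
    rng = random.Random(seed)
    total = sum(sensitivities)
    probs = [s / total for s in sensitivities]
    idx = rng.choices(range(len(points)), weights=probs, k=size)
    return [(points[i], 1.0 / (size * probs[i])) for i in idx]
```

Under uniform sensitivities this degenerates to uniform sampling with weight $n/|Q|$ per picked point, which is the sanity check below.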

Offline A/B testing for Recommender Systems

Title Offline A/B testing for Recommender Systems
Authors Alexandre Gilotte, Clément Calauzènes, Thomas Nedelec, Alexandre Abraham, Simon Dollé
Abstract Before A/B testing a new version of a recommender system online, it is usual to perform offline evaluations on historical data. We focus on evaluation methods that compute an estimator of the potential uplift in revenue that this new technology could generate. Such estimators help to iterate faster and to avoid losing money by detecting poor policies. They are known as counterfactual or off-policy estimators. We show that traditional counterfactual estimators, such as capped importance sampling and normalised importance sampling, do not experimentally achieve satisfactory bias-variance trade-offs in the context of personalised product recommendation for online advertising. We propose two variants of counterfactual estimators with different modelling of the bias that prove to be accurate in real-world conditions. We provide a benchmark of these estimators by showing their correlation with business metrics observed by running online A/B tests on a commercial recommender system.
Tasks Product Recommendation, Recommendation Systems
Published 2018-01-22
URL http://arxiv.org/abs/1801.07030v1
PDF http://arxiv.org/pdf/1801.07030v1.pdf
PWC https://paperswithcode.com/paper/offline-ab-testing-for-recommender-systems
Repo
Framework
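Capped importance sampling, one of the baselines discussed, clips the importance weights at a threshold to reduce variance at the cost of a downward bias. A minimal sketch — the function name and interface are invented:

```python
def capped_ips(rewards, target_probs, logging_probs, cap=10.0):
    """Capped importance sampling estimate of the target policy's average
    reward from logged data: weights tp/lp are clipped at `cap`, trading
    variance for a (downward) bias."""
    n = len(rewards)
    return sum(r * min(tp / lp, cap)
               for r, tp, lp in zip(rewards, target_probs, logging_probs)) / n
```

With `cap=5`, a logged reward whose raw importance weight is 9 contributes only a factor of 5, which is what keeps the estimator's variance bounded.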

Constrained Sparse Subspace Clustering with Side-Information

Title Constrained Sparse Subspace Clustering with Side-Information
Authors Chun-Guang Li, Junjian Zhang, Jun Guo
Abstract Subspace clustering refers to the problem of segmenting high dimensional data drawn from a union of subspaces into the respective subspaces. In some applications, partial side-information to indicate “must-link” or “cannot-link” in clustering is available. This leads to the task of subspace clustering with side-information. However, in prior work the supervision value of the side-information for subspace clustering has not been fully exploited. To this end, in this paper, we present an enhanced approach for constrained subspace clustering with side-information, termed Constrained Sparse Subspace Clustering plus (CSSC+), in which the side-information is used not only in the stage of learning an affinity matrix but also in the stage of spectral clustering. Moreover, we propose to estimate clustering accuracy based on the partial side-information and theoretically justify the connection to the ground-truth clustering accuracy in terms of the Rand index. We conduct experiments on three cancer gene expression datasets to validate the effectiveness of our proposals.
Tasks
Published 2018-05-21
URL http://arxiv.org/abs/1805.08183v2
PDF http://arxiv.org/pdf/1805.08183v2.pdf
PWC https://paperswithcode.com/paper/constrained-sparse-subspace-clustering-with
Repo
Framework
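Estimating clustering accuracy from partial side-information amounts to checking how many must-link pairs end up in the same cluster and cannot-link pairs in different clusters — a Rand-index-style score restricted to the constrained pairs. A toy sketch, not the paper's estimator:

```python
def rand_index_on_constraints(labels, must_link, cannot_link):
    """Fraction of satisfied constraints: must-link pairs placed together
    plus cannot-link pairs placed apart, over all constrained pairs."""
    ok = sum(labels[i] == labels[j] for i, j in must_link)
    ok += sum(labels[i] != labels[j] for i, j in cannot_link)
    return ok / (len(must_link) + len(cannot_link))
```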

Loss Rank Mining: A General Hard Example Mining Method for Real-time Detectors

Title Loss Rank Mining: A General Hard Example Mining Method for Real-time Detectors
Authors Hao Yu, Zhaoning Zhang, Zheng Qin, Hao Wu, Dongsheng Li, Jun Zhao, Xicheng Lu
Abstract Modern object detectors usually suffer from low accuracy, as foregrounds drown in large numbers of backgrounds and become hard examples during training. Compared with proposal-based detectors, real-time detectors are in far more serious trouble since they renounce the region-proposal stage that filters out a majority of backgrounds to achieve real-time rates. Although foregrounds, as hard examples, urgently need to be mined from the mass of backgrounds, a considerable number of state-of-the-art real-time detectors, like the YOLO series, have yet to profit from existing hard example mining methods, as these methods require detectors to satisfy a series of prerequisites. In this paper, we propose a general hard example mining method named Loss Rank Mining (LRM) to fill this gap. LRM is a general method for real-time detectors, as it utilizes the final feature map, which exists in all real-time detectors, to mine hard examples. With LRM, elements representing easy examples in the final feature map are filtered out, and detectors are forced to concentrate on hard examples during training. Extensive experiments validate the effectiveness of our method: the improvements of the YOLOv2 detector on the auto-driving-related dataset KITTI and the more general dataset PASCAL VOC are over 5% and 2% mAP, respectively. In addition, LRM is the first hard example mining strategy that fits YOLOv2 and makes it better suited to real scenarios where both real-time rates and accurate detection are strongly demanded.
Tasks
Published 2018-04-10
URL http://arxiv.org/abs/1804.04606v1
PDF http://arxiv.org/pdf/1804.04606v1.pdf
PWC https://paperswithcode.com/paper/loss-rank-mining-a-general-hard-example
Repo
Framework
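The LRM filtering step — rank the per-element losses of the final feature map and keep only the hardest fraction — can be sketched as a top-k mask over flattened losses. A hypothetical simplification of the method:

```python
def loss_rank_mask(losses, keep_ratio=0.3):
    """Loss-rank-mining-style filter: keep the k highest-loss elements of a
    flattened feature-map loss (the hard examples) and zero out the rest."""
    k = max(1, int(len(losses) * keep_ratio))
    threshold = sorted(losses, reverse=True)[k - 1]   # k-th largest loss
    mask, kept = [], 0
    for v in losses:
        if v >= threshold and kept < k:
            mask.append(1)
            kept += 1
        else:
            mask.append(0)
    return mask
```

Multiplying this mask into the loss tensor before backpropagation would force training gradients to come only from the mined hard examples.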