Paper Group ANR 384
End-to-End Offline Goal-Oriented Dialog Policy Learning via Policy Gradient. Basic Filters for Convolutional Neural Networks Applied to Music: Training or Design?. Pillar Networks++: Distributed non-parametric deep and wide networks. Coalescent-based species tree estimation: a stochastic Farris transform. Independent Motion Detection with Event-dri …
End-to-End Offline Goal-Oriented Dialog Policy Learning via Policy Gradient
Title | End-to-End Offline Goal-Oriented Dialog Policy Learning via Policy Gradient |
Authors | Li Zhou, Kevin Small, Oleg Rokhlenko, Charles Elkan |
Abstract | Learning a goal-oriented dialog policy is generally performed offline with supervised learning algorithms or online with reinforcement learning (RL). Additionally, as companies accumulate massive quantities of dialog transcripts between customers and trained human agents, encoder-decoder methods have gained popularity as agent utterances can be directly treated as supervision without the need for utterance-level annotations. However, one potential drawback of such approaches is that they myopically generate the next agent utterance without regard for dialog-level considerations. To resolve this concern, this paper describes an offline RL method for learning from unannotated corpora that can optimize a goal-oriented policy at both the utterance and dialog level. We introduce a novel reward function and use both on-policy and off-policy policy gradient to learn a policy offline without requiring online user interaction or an explicit state space definition. |
Tasks | Goal-Oriented Dialog |
Published | 2017-12-07 |
URL | http://arxiv.org/abs/1712.02838v1 |
PDF | http://arxiv.org/pdf/1712.02838v1.pdf |
PWC | https://paperswithcode.com/paper/end-to-end-offline-goal-oriented-dialog |
Repo | |
Framework | |
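The utterance-level policy-gradient update described in the abstract can be illustrated with a toy REINFORCE-style loop over logged transcripts. The linear-softmax policy, random feature vectors, and scalar dialog-level returns below are illustrative assumptions, not the paper's actual model, state representation, or reward function.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Toy linear-softmax policy over a small set of agent actions
# (e.g. utterance templates).
n_features, n_actions = 4, 3
theta = np.zeros((n_features, n_actions))

# Logged corpus: (state features, agent action, dialog-level return).
logged = [(rng.standard_normal(n_features), rng.integers(n_actions), rng.uniform(-1, 1))
          for _ in range(200)]

lr = 0.1
for s, a, G in logged:
    probs = softmax(theta.T @ s)
    # REINFORCE gradient of log pi(a|s) for a linear-softmax policy:
    # d/dtheta log pi(a|s) = s (1[a] - pi(.|s))^T
    grad_logp = np.outer(s, -probs)
    grad_logp[:, a] += s
    theta += lr * G * grad_logp

probs = softmax(theta.T @ logged[0][0])
```

In practice, learning from a corpus generated by human agents would additionally require an off-policy correction (e.g. importance weighting against the logging policy), which this sketch omits.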
Basic Filters for Convolutional Neural Networks Applied to Music: Training or Design?
Title | Basic Filters for Convolutional Neural Networks Applied to Music: Training or Design? |
Authors | Monika Dörfler, Thomas Grill, Roswitha Bammer, Arthur Flexer |
Abstract | When convolutional neural networks are used to tackle learning problems based on music or, more generally, time series data, raw one-dimensional data are commonly pre-processed to obtain spectrogram or mel-spectrogram coefficients, which are then used as input to the actual neural network. In this contribution, we investigate, both theoretically and experimentally, the influence of this pre-processing step on the network's performance and pose the question of whether replacing it by adaptive or learned filters applied directly to the raw data can improve learning success. The theoretical results show that approximately reproducing mel-spectrogram coefficients by applying adaptive filters and subsequent time-averaging is in principle possible. We also conducted extensive experimental work on the task of singing voice detection in music. The results of these experiments show that for classification based on Convolutional Neural Networks, the features obtained from adaptive filter banks followed by time-averaging perform better than the canonical Fourier-transform-based mel-spectrogram coefficients. Alternative adaptive approaches with center frequencies or time-averaging lengths learned from training data perform equally well. |
Tasks | Time Series |
Published | 2017-09-07 |
URL | http://arxiv.org/abs/1709.02291v3 |
PDF | http://arxiv.org/pdf/1709.02291v3.pdf |
PWC | https://paperswithcode.com/paper/basic-filters-for-convolutional-neural |
Repo | |
Framework | |
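The "adaptive filters plus time-averaging" pipeline can be sketched as follows; the Gabor filterbank, window length, and hop size are hypothetical stand-ins for the learned or designed filters the paper studies.

```python
import numpy as np

def gabor_filterbank(center_freqs, sr, length=256):
    """Complex Gabor (windowed sinusoid) filters at the given center
    frequencies; a simple stand-in for adaptive/learned filters."""
    t = np.arange(length) / sr
    window = np.hanning(length)
    return np.array([window * np.exp(2j * np.pi * f * t) for f in center_freqs])

def filterbank_features(signal, filters, hop=128):
    """Filter the raw signal, then time-average squared magnitudes:
    an approximation of (mel-)spectrogram coefficients."""
    feats = []
    for filt in filters:
        response = np.convolve(signal, filt, mode='valid')
        energy = np.abs(response) ** 2
        n = len(energy) // hop * hop
        # Time-averaging over non-overlapping hops.
        feats.append(energy[:n].reshape(-1, hop).mean(axis=1))
    return np.array(feats)

sr = 8000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440 * t)               # 440 Hz tone
filters = gabor_filterbank([220.0, 440.0, 880.0], sr)
F = filterbank_features(signal, filters)            # (n_filters, n_frames)
```

For a pure 440 Hz tone, the band centered at 440 Hz dominates the resulting features, mirroring what a mel band at that frequency would report.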
Pillar Networks++: Distributed non-parametric deep and wide networks
Title | Pillar Networks++: Distributed non-parametric deep and wide networks |
Authors | Biswa Sengupta, Yu Qian |
Abstract | In recent work, it was shown that combining multi-kernel based support vector machines (SVMs) can lead to near state-of-the-art performance on an action recognition dataset (HMDB-51 dataset). This was 0.4% lower than frameworks that used hand-crafted features in addition to the deep convolutional feature extractors. In the present work, we show that combining distributed Gaussian Processes with multi-stream deep convolutional neural networks (CNN) alleviates the need to augment a neural network with hand-crafted features. In contrast to prior work, we treat each deep convolutional neural network as an expert wherein the individual predictions (and their respective uncertainties) are combined into a Product of Experts (PoE) framework. |
Tasks | Gaussian Processes, Temporal Action Localization |
Published | 2017-08-18 |
URL | http://arxiv.org/abs/1708.06250v1 |
PDF | http://arxiv.org/pdf/1708.06250v1.pdf |
PWC | https://paperswithcode.com/paper/pillar-networks-distributed-non-parametric |
Repo | |
Framework | |
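The Product of Experts combination of Gaussian predictions has a closed form: precisions add, and the combined mean is precision-weighted. A minimal sketch (the two-expert example is illustrative):

```python
import numpy as np

def product_of_experts(means, variances):
    """Combine independent Gaussian expert predictions into a single
    Gaussian via a Product of Experts: precisions add, means are
    precision-weighted."""
    means = np.asarray(means, dtype=float)
    precisions = 1.0 / np.asarray(variances, dtype=float)
    var_poe = 1.0 / precisions.sum()
    mean_poe = var_poe * (precisions * means).sum()
    return mean_poe, var_poe

# Two equally confident experts: the PoE mean is their average and the
# PoE variance is halved (the combination is more confident than either).
m, v = product_of_experts([0.0, 1.0], [1.0, 1.0])
```

A less confident expert (larger variance) contributes proportionally less to the combined mean, which is how per-expert uncertainty enters the fusion.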
Coalescent-based species tree estimation: a stochastic Farris transform
Title | Coalescent-based species tree estimation: a stochastic Farris transform |
Authors | Gautam Dasarathy, Elchanan Mossel, Robert Nowak, Sebastien Roch |
Abstract | The reconstruction of a species phylogeny from genomic data faces two significant hurdles: 1) the trees describing the evolution of each individual gene–i.e., the gene trees–may differ from the species phylogeny and 2) the molecular sequences corresponding to each gene often provide limited information about the gene trees themselves. In this paper we consider an approach to species tree reconstruction that addresses both these hurdles. Specifically, we propose an algorithm for phylogeny reconstruction under the multispecies coalescent model with a standard model of site substitution. The multispecies coalescent is commonly used to model gene tree discordance due to incomplete lineage sorting, a well-studied population-genetic effect. In previous work, an information-theoretic trade-off was derived in this context between the number of loci, $m$, needed for an accurate reconstruction and the length of the locus sequences, $k$. It was shown that to reconstruct an internal branch of length $f$, one needs $m$ to be of the order of $1/[f^{2} \sqrt{k}]$. That previous result was obtained under the molecular clock assumption, i.e., under the assumption that mutation rates (as well as population sizes) are constant across the species phylogeny. Here we generalize this result beyond the restrictive molecular clock assumption, and obtain a new reconstruction algorithm that has the same data requirement (up to log factors). Our main contribution is a novel reduction to the molecular clock case under the multispecies coalescent. As a corollary, we also obtain a new identifiability result of independent interest: for any species tree with $n \geq 3$ species, the rooted species tree can be identified from the distribution of its unrooted weighted gene trees even in the absence of a molecular clock. |
Tasks | |
Published | 2017-07-13 |
URL | http://arxiv.org/abs/1707.04300v1 |
PDF | http://arxiv.org/pdf/1707.04300v1.pdf |
PWC | https://paperswithcode.com/paper/coalescent-based-species-tree-estimation-a |
Repo | |
Framework | |
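The stated data requirement $m \sim 1/[f^{2}\sqrt{k}]$ can be turned into a small order-of-magnitude calculator; the constant C is unspecified in the abstract and set to 1 here purely for illustration.

```python
def loci_needed(f, k, C=1.0):
    """Order-of-magnitude number of loci m ~ C / (f^2 * sqrt(k)) needed
    to reconstruct an internal branch of length f from loci of sequence
    length k. C is an unspecified constant, set to 1 for illustration."""
    return C / (f ** 2 * k ** 0.5)

# Halving the branch length quadruples the number of loci required,
# since m scales as 1/f^2.
m_long = loci_needed(0.1, 100)
m_short = loci_needed(0.05, 100)
```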
Independent Motion Detection with Event-driven Cameras
Title | Independent Motion Detection with Event-driven Cameras |
Authors | Valentina Vasco, Arren Glover, Elias Mueggler, Davide Scaramuzza, Lorenzo Natale, Chiara Bartolozzi |
Abstract | Unlike standard cameras that send intensity images at a constant frame rate, event-driven cameras asynchronously report pixel-level brightness changes, offering low latency and high temporal resolution (both on the order of microseconds). As such, they have great potential for fast and low-power vision algorithms for robots. Visual tracking, for example, is easily achieved even for very fast stimuli, as only moving objects cause brightness changes. However, cameras mounted on a moving robot are typically non-stationary and the same tracking problem becomes confounded by background clutter events due to the robot ego-motion. In this paper, we propose a method for segmenting the motion of an independently moving object for event-driven cameras. Our method detects and tracks corners in the event stream and learns the statistics of their motion as a function of the robot's joint velocities when no independently moving objects are present. During robot operation, independently moving objects are identified by discrepancies between the predicted corner velocities from ego-motion and the measured corner velocities. We validate the algorithm on data collected from the neuromorphic iCub robot. We achieve a precision of ~90% and show that the method is robust to changes in speed of both the head and the target. |
Tasks | Motion Detection, Visual Tracking |
Published | 2017-06-27 |
URL | http://arxiv.org/abs/1706.08713v2 |
PDF | http://arxiv.org/pdf/1706.08713v2.pdf |
PWC | https://paperswithcode.com/paper/independent-motion-detection-with-event |
Repo | |
Framework | |
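The ego-motion discrepancy test can be sketched with a linear velocity model: learn a map from joint velocities to corner image velocities during object-free operation, then flag corners whose measured velocity deviates from the prediction. The linear map, noise level, and threshold below are illustrative assumptions, not the paper's learned motion statistics.

```python
import numpy as np

rng = np.random.default_rng(1)

# Training phase: no independently moving objects, so corner velocities
# are explained by robot ego-motion (here, a hypothetical linear map
# from 3 joint velocities to 2D image velocities).
J = rng.standard_normal((500, 3))                       # joint velocities
W_true = np.array([[1.0, 0.0], [0.5, -1.0], [0.0, 2.0]])
V = J @ W_true + 0.01 * rng.standard_normal((500, 2))   # corner velocities
W, *_ = np.linalg.lstsq(J, V, rcond=None)               # least-squares model

def independent_motion(joint_vel, corner_vel, thresh=0.5):
    """Flag a corner whose measured velocity deviates from the
    ego-motion prediction by more than thresh."""
    predicted = joint_vel @ W
    return np.linalg.norm(corner_vel - predicted) > thresh

j = np.array([1.0, 0.0, 0.0])
static_flag = independent_motion(j, j @ W_true)        # consistent with ego-motion
moving_flag = independent_motion(j, j @ W_true + 3.0)  # independently moving object
```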
Making the best of data derived from a daily practice in clinical legal medicine for research and practice - the example of Spe3dLab
Title | Making the best of data derived from a daily practice in clinical legal medicine for research and practice - the example of Spe3dLab |
Authors | Vincent Laugier, Eric Stindel, Alcibiade Lichterowicz, Séverine Ansart, Thomas Lefèvre |
Abstract | Forensic science suffers from a lack of studies with high-quality design, such as randomized controlled trials (RCT). Evidence in forensic science may be of insufficient quality, which is a major concern. Results from RCT are criticized for providing artificial results that are not useful in real life and unfit for individualized prescription. Various sources of collected data (e.g. data collected in routine practice) could be exploited for distinct goals. Obstacles remain before such data can be practically accessed and used, including technical issues. We present an easy-to-use software dedicated to innovative data analyses for practitioners and researchers. We provide 2 examples in forensics. Spe3dLab has been developed by 3 French teams: a bioinformatics laboratory (LaTIM), a private partner (Tekliko) and a department of forensic medicine (Jean Verdier Hospital). It was designed to be open source, relying on documented and maintained libraries, query-oriented and capable of handling the entire data process from capture to export of best predictive models for their integration in information systems. Spe3dLab was used for 2 specific forensics applications: i) the search for multiple causal factors and ii) the best predictive model of the functional impairment (total incapacity to work, TIW) of assault survivors. 2,892 patients were included over a 6-month period. Time to evaluation was the only direct cause identified for TIW, and victim category was an indirect cause. The specificity and sensitivity of the predictive model were 99.9% and 90%, respectively. Spe3dLab is a quick and efficient tool for accessing observational, routinely collected data and performing innovative analyses. Analyses can be exported for validation and routine use by practitioners, e.g., for computer-aided evaluation of complex problems. It can provide a fully integrated solution for individualized medicine. |
Tasks | |
Published | 2017-07-26 |
URL | http://arxiv.org/abs/1707.08454v1 |
PDF | http://arxiv.org/pdf/1707.08454v1.pdf |
PWC | https://paperswithcode.com/paper/making-the-best-of-data-derived-from-a-daily |
Repo | |
Framework | |
Intercomparison of Machine Learning Methods for Statistical Downscaling: The Case of Daily and Extreme Precipitation
Title | Intercomparison of Machine Learning Methods for Statistical Downscaling: The Case of Daily and Extreme Precipitation |
Authors | Thomas Vandal, Evan Kodra, Auroop R Ganguly |
Abstract | Statistical downscaling of global climate models (GCMs) allows researchers to study local climate change effects decades into the future. A wide range of statistical models have been applied to downscaling GCMs, but recent advances in machine learning have not been explored. In this paper, we compare four fundamental statistical methods, Bias Correction Spatial Disaggregation (BCSD), Ordinary Least Squares, Elastic-Net, and Support Vector Machine, with three more advanced machine learning methods, Multi-task Sparse Structure Learning (MSSL), BCSD coupled with MSSL, and Convolutional Neural Networks, to downscale daily precipitation in the Northeast United States. Metrics evaluating each method's ability to capture daily anomalies, large-scale climate shifts, and extremes are analyzed. We find that linear methods, led by BCSD, consistently outperform non-linear approaches. The direct application of state-of-the-art machine learning methods to statistical downscaling does not provide improvements over simpler, longstanding approaches. |
Tasks | |
Published | 2017-02-13 |
URL | http://arxiv.org/abs/1702.04018v1 |
PDF | http://arxiv.org/pdf/1702.04018v1.pdf |
PWC | https://paperswithcode.com/paper/intercomparison-of-machine-learning-methods |
Repo | |
Framework | |
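The bias-correction step at the heart of BCSD is commonly implemented as empirical quantile mapping: map each model value through the model's CDF, then invert the observed CDF at that quantile. A minimal sketch on synthetic gamma-distributed "precipitation" (the data and the 101-knot quantile grid are illustrative):

```python
import numpy as np

def bias_correct(model, obs, model_future):
    """Empirical quantile mapping: look up each future model value's
    quantile under the model distribution, then return the observed
    value at the same quantile."""
    qs = np.linspace(0.0, 1.0, 101)
    model_q = np.quantile(model, qs)   # model CDF knots
    obs_q = np.quantile(obs, qs)       # observed CDF knots
    ranks = np.interp(model_future, model_q, qs)
    return np.interp(ranks, qs, obs_q)

rng = np.random.default_rng(2)
obs = rng.gamma(2.0, 2.0, 5000)        # "observed" daily precipitation
model = obs * 1.5 + 1.0                # biased GCM output
corrected = bias_correct(model, obs, model)
```

After correction, the distribution of the model output closely matches the observed distribution, which is exactly the property the bias-correction component of BCSD targets.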
An Integrated Platform for Live 3D Human Reconstruction and Motion Capturing
Title | An Integrated Platform for Live 3D Human Reconstruction and Motion Capturing |
Authors | Dimitrios S. Alexiadis, Anargyros Chatzitofis, Nikolaos Zioulis, Olga Zoidi, Georgios Louizis, Dimitrios Zarpalas, Petros Daras |
Abstract | The latest developments in 3D capturing, processing, and rendering provide means to unlock novel 3D application pathways. The main elements of an integrated platform, which targets tele-immersion and future 3D applications, are described in this paper, addressing the tasks of real-time capturing, robust 3D human shape/appearance reconstruction, and skeleton-based motion tracking. More specifically, the details of a multiple RGB-depth (RGB-D) capturing system are first given, along with a novel sensor calibration method. A robust, fast reconstruction method from multiple RGB-D streams is then proposed, based on an enhanced variation of the volumetric Fourier transform-based method, parallelized on the Graphics Processing Unit, and accompanied by an appropriate texture-mapping algorithm. On top of that, given the lack of relevant objective evaluation methods, a novel framework is proposed for the quantitative evaluation of real-time 3D reconstruction systems. Finally, a generic, multiple depth stream-based method for accurate real-time human skeleton tracking is proposed. Detailed experimental results with multi-Kinect2 datasets verify the validity of our arguments and the effectiveness of the proposed system and methodologies. |
Tasks | 3D Reconstruction, Calibration |
Published | 2017-12-08 |
URL | http://arxiv.org/abs/1712.03084v1 |
PDF | http://arxiv.org/pdf/1712.03084v1.pdf |
PWC | https://paperswithcode.com/paper/an-integrated-platform-for-live-3d-human |
Repo | |
Framework | |
Fusion of Head and Full-Body Detectors for Multi-Object Tracking
Title | Fusion of Head and Full-Body Detectors for Multi-Object Tracking |
Authors | Roberto Henschel, Laura Leal-Taixé, Daniel Cremers, Bodo Rosenhahn |
Abstract | In order to track all persons in a scene, the tracking-by-detection paradigm has proven to be a very effective approach. Yet, relying solely on a single detector is also a major limitation, as useful image information might be ignored. Consequently, this work demonstrates how to fuse two detectors into a tracking system. To obtain the trajectories, we propose to formulate tracking as a weighted graph labeling problem, resulting in a binary quadratic program. As such problems are NP-hard, the solution can only be approximated. Based on the Frank-Wolfe algorithm, we present a new solver that is crucial to handle such difficult problems. Evaluation on pedestrian tracking is provided for multiple scenarios, showing superior results over single detector tracking and standard QP-solvers. Finally, our tracker ranks 2nd on the MOT16 benchmark and 1st on the new MOT17 benchmark, outperforming over 90 trackers. |
Tasks | Multi-Object Tracking, Object Tracking |
Published | 2017-05-23 |
URL | http://arxiv.org/abs/1705.08314v4 |
PDF | http://arxiv.org/pdf/1705.08314v4.pdf |
PWC | https://paperswithcode.com/paper/fusion-of-head-and-full-body-detectors-for |
Repo | |
Framework | |
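The Frank-Wolfe scheme the abstract mentions can be sketched on a continuous box relaxation of a binary quadratic program; the toy objective and the standard 2/(t+2) step size are illustrative, not the paper's specialized solver.

```python
import numpy as np

def frank_wolfe_qp(Q, c, n_iter=200):
    """Frank-Wolfe on min 0.5 x^T Q x + c^T x over the box [0,1]^n,
    a continuous relaxation of a binary quadratic labeling program.
    The linear minimization oracle (LMO) over the box is coordinate-wise:
    pick 0 where the gradient is positive, 1 where it is negative."""
    n = len(c)
    x = np.full(n, 0.5)
    for t in range(n_iter):
        grad = Q @ x + c
        s = (grad < 0).astype(float)   # LMO: a vertex of the box
        gamma = 2.0 / (t + 2.0)        # standard diminishing step size
        x = (1 - gamma) * x + gamma * s
    return x

# Separable toy problem: coordinate 0 prefers 1, coordinate 1 prefers 0.
Q = np.array([[2.0, 0.0], [0.0, 2.0]])
c = np.array([-2.0, 1.0])
x = frank_wolfe_qp(Q, c)
```

Each iterate is a convex combination of box vertices, so the relaxed solution can be rounded to a binary labeling; the paper's solver adds problem-specific machinery on top of this basic scheme.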
Sub-committee Approval Voting and Generalised Justified Representation Axioms
Title | Sub-committee Approval Voting and Generalised Justified Representation Axioms |
Authors | Haris Aziz, Barton E. Lee |
Abstract | Social choice is replete with various settings including single-winner voting, multi-winner voting, probabilistic voting, multiple referenda, and public decision making. We study a general model of social choice called Sub-Committee Voting (SCV) that simultaneously generalizes these settings. We then focus on sub-committee voting with approvals and propose extensions of the justified representation axioms that have been considered for proportional representation in approval-based committee voting. We study the properties and relations of these axioms. For each of the axioms, we analyse whether a representative committee exists and also examine the complexity of computing and verifying such a committee. |
Tasks | Decision Making |
Published | 2017-11-16 |
URL | http://arxiv.org/abs/1711.06030v1 |
PDF | http://arxiv.org/pdf/1711.06030v1.pdf |
PWC | https://paperswithcode.com/paper/sub-committee-approval-voting-and-generalised |
Repo | |
Framework | |
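For the approval-based committee setting, the basic Justified Representation (JR) axiom can be verified directly from its definition: no group of at least n/k voters that jointly approve some candidate may be left with no approved committee member. The ballots below are a made-up example, not from the paper.

```python
def satisfies_jr(approvals, committee, k):
    """Check Justified Representation for a size-k committee.
    `approvals` is a list of per-voter approval sets."""
    n = len(approvals)
    committee = set(committee)
    candidates = set().union(*approvals)
    for cand in candidates:
        # Cohesive group: all voters approving this common candidate.
        group = [A for A in approvals if cand in A]
        # Group is large enough (>= n/k) but no member is represented.
        if len(group) * k >= n and all(not (A & committee) for A in group):
            return False
    return True

approvals = [{'a'}, {'a'}, {'b'}, {'c'}]
ok = satisfies_jr(approvals, ['a', 'b'], k=2)   # every large group covered
bad = satisfies_jr(approvals, ['b', 'c'], k=2)  # the 'a'-group (2 >= 4/2) is uncovered
```

The sub-committee generalisations studied in the paper extend this single-committee check across several simultaneous committees.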
Stochastic Gradient Descent: Going As Fast As Possible But Not Faster
Title | Stochastic Gradient Descent: Going As Fast As Possible But Not Faster |
Authors | Alice Schoenauer-Sebag, Marc Schoenauer, Michèle Sebag |
Abstract | When applied to training deep neural networks, stochastic gradient descent (SGD) often incurs steady progression phases, interrupted by catastrophic episodes in which loss and gradient norm explode. A possible mitigation of such events is to slow down the learning process. This paper presents a novel approach to controlling the SGD learning rate that uses two statistical tests. The first, aimed at fast learning, compares the momentum of the normalized gradient vectors to that of random unit vectors and accordingly gracefully increases or decreases the learning rate. The second is a change point detection test, aimed at detecting catastrophic learning episodes; upon its triggering, the learning rate is instantly halved. These complementary abilities to speed up and slow down the learning rate allow the proposed approach, called SALeRA, to learn as fast as possible but not faster. Experiments on standard benchmarks show that SALeRA performs well in practice and compares favorably to the state of the art. |
Tasks | Change Point Detection |
Published | 2017-09-05 |
URL | http://arxiv.org/abs/1709.01427v1 |
PDF | http://arxiv.org/pdf/1709.01427v1.pdf |
PWC | https://paperswithcode.com/paper/stochastic-gradient-descent-going-as-fast-as |
Repo | |
Framework | |
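The change-point-triggered halving can be sketched with a Page-Hinkley test monitoring the loss stream; the parameters and the synthetic loss trace below are illustrative, and the sketch omits SALeRA's first (momentum-comparison) test entirely.

```python
class PageHinkley:
    """Page-Hinkley change-point test: alarms when the cumulative
    deviation of the monitored signal (here, the loss) from its running
    mean rises by more than `threshold` above its historical minimum."""
    def __init__(self, delta=0.05, threshold=1.0):
        self.delta, self.threshold = delta, threshold
        self.mean, self.cum, self.cum_min, self.n = 0.0, 0.0, 0.0, 0

    def update(self, x):
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold

lr = 0.1
detector = PageHinkley()
losses = [1.0] * 50 + [5.0] * 5    # steady phase, then a loss explosion
for loss in losses:
    if detector.update(loss):       # catastrophic episode detected
        lr /= 2.0                   # instantly halve the learning rate
        detector = PageHinkley()    # reset the test after triggering
```

On this trace the test fires exactly once, at the onset of the explosion, halving the learning rate from 0.1 to 0.05.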
Column normalization of a random measurement matrix
Title | Column normalization of a random measurement matrix |
Authors | Shahar Mendelson |
Abstract | In this note we answer a question of G. Lecué by showing that column normalization of a random matrix with iid entries need not lead to good sparse recovery properties, even if the generating random variable has reasonable moment growth. Specifically, for every $2 \leq p \leq c_1\log d$ we construct a random vector $X \in R^d$ with iid, mean-zero, variance-$1$ coordinates that satisfies $\sup_{t \in S^{d-1}} \|\langle X, t\rangle\|_{L_q} \leq c_2\sqrt{q}$ for every $2\leq q \leq p$. We show that if $m \leq c_3\sqrt{p}d^{1/p}$ and $\tilde{\Gamma}:R^d \to R^m$ is the column-normalized matrix generated by $m$ independent copies of $X$, then with probability at least $1-2\exp(-c_4m)$, $\tilde{\Gamma}$ does not satisfy the exact reconstruction property of order $2$. |
Tasks | |
Published | 2017-02-21 |
URL | http://arxiv.org/abs/1702.06278v1 |
PDF | http://arxiv.org/pdf/1702.06278v1.pdf |
PWC | https://paperswithcode.com/paper/column-normalization-of-a-random-measurement |
Repo | |
Framework | |
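The column-normalization operation itself is elementary; the sketch below only shows how the column-normalized matrix is formed from an iid matrix (the note's point is that this operation alone does not guarantee good recovery, which the sketch does not attempt to verify, and the Gaussian entries here are just a placeholder for the note's heavier-tailed construction).

```python
import numpy as np

rng = np.random.default_rng(3)

# Random measurement matrix with iid, mean-zero, variance-1 entries
# (m measurements of a d-dimensional signal).
d, m = 64, 32
Gamma = rng.standard_normal((m, d))

# Column normalization: rescale every column to unit Euclidean norm.
Gamma_tilde = Gamma / np.linalg.norm(Gamma, axis=0, keepdims=True)

col_norms = np.linalg.norm(Gamma_tilde, axis=0)
```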
End-to-End Unsupervised Deformable Image Registration with a Convolutional Neural Network
Title | End-to-End Unsupervised Deformable Image Registration with a Convolutional Neural Network |
Authors | Bob D. de Vos, Floris F. Berendsen, Max A. Viergever, Marius Staring, Ivana Išgum |
Abstract | In this work we propose a deep learning network for deformable image registration (DIRNet). The DIRNet consists of a convolutional neural network (ConvNet) regressor, a spatial transformer, and a resampler. The ConvNet analyzes a pair of fixed and moving images and outputs parameters for the spatial transformer, which generates the displacement vector field that enables the resampler to warp the moving image to the fixed image. The DIRNet is trained end-to-end by unsupervised optimization of a similarity metric between input image pairs. A trained DIRNet can be applied to perform registration on unseen image pairs in one pass, thus non-iteratively. Evaluation was performed with registration of images of handwritten digits (MNIST) and cardiac cine MR scans (Sunnybrook Cardiac Data). The results demonstrate that registration with DIRNet is as accurate as a conventional deformable image registration method with substantially shorter execution times. |
Tasks | Image Registration |
Published | 2017-04-20 |
URL | http://arxiv.org/abs/1704.06065v1 |
PDF | http://arxiv.org/pdf/1704.06065v1.pdf |
PWC | https://paperswithcode.com/paper/end-to-end-unsupervised-deformable-image |
Repo | |
Framework | |
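The underlying principle, registration by optimizing a similarity metric between image pairs, can be sketched in a drastically simplified translation-only setting with normalized cross-correlation. DIRNet instead regresses deformable spatial-transformer parameters with a ConvNet trained end-to-end, so everything below is a stand-in for illustration only.

```python
import numpy as np

def register_shift(fixed, moving, max_shift=5):
    """Brute-force translation-only registration: try integer shifts and
    keep the one maximising normalized cross-correlation (NCC)."""
    def ncc(a, b):
        a, b = a - a.mean(), b - b.mean()
        return (a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

    best, best_shift = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            warped = np.roll(np.roll(moving, dy, axis=0), dx, axis=1)
            score = ncc(fixed, warped)
            if score > best:
                best, best_shift = score, (dy, dx)
    return best_shift

# Synthetic pair: the moving image is the fixed image shifted by (3, -2),
# so the recovered alignment shift should be (-3, 2).
fixed = np.zeros((32, 32)); fixed[10:20, 10:20] = 1.0
moving = np.roll(np.roll(fixed, 3, axis=0), -2, axis=1)
shift = register_shift(fixed, moving)
```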
A Geometric Framework for Stochastic Shape Analysis
Title | A Geometric Framework for Stochastic Shape Analysis |
Authors | Alexis Arnaudon, Darryl D. Holm, Stefan Sommer |
Abstract | We introduce a stochastic model of diffeomorphisms, whose action on a variety of data types descends to stochastic evolution of shapes, images and landmarks. The stochasticity is introduced in the vector field which transports the data in the Large Deformation Diffeomorphic Metric Mapping (LDDMM) framework for shape analysis and image registration. The stochasticity thereby models errors or uncertainties of the flow in following the prescribed deformation velocity. The approach is illustrated in the example of finite dimensional landmark manifolds, whose stochastic evolution is studied both via the Fokker-Planck equation and by numerical simulations. We derive two approaches for inferring parameters of the stochastic model from landmark configurations observed at discrete time points. The first of the two approaches matches moments of the Fokker-Planck equation to sample moments of the data, while the second approach employs an Expectation-Maximisation based algorithm using a Monte Carlo bridge sampling scheme to optimise the data likelihood. We derive and numerically test the ability of the two approaches to infer the spatial correlation length of the underlying noise. |
Tasks | Image Registration |
Published | 2017-03-29 |
URL | http://arxiv.org/abs/1703.09971v3 |
PDF | http://arxiv.org/pdf/1703.09971v3.pdf |
PWC | https://paperswithcode.com/paper/a-geometric-framework-for-stochastic-shape |
Repo | |
Framework | |
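The stochastic landmark evolution can be sketched with a crude Euler-Maruyama scheme; the Gaussian kernel, the fixed momenta, the noise amplitude, and the omission of the momentum equation are all simplifying assumptions for illustration, not the paper's model.

```python
import numpy as np

def kernel(x, y, sigma=1.0):
    """Gaussian kernel governing interactions between landmarks."""
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

def euler_maruyama(q0, p0, n_steps=100, dt=0.01, noise=0.1, seed=0):
    """Euler-Maruyama simulation of noisy landmark positions: the
    deterministic drift dq = K(q) p dt is perturbed by noise in the
    transporting vector field. The momentum equation is omitted here
    (p is held fixed), which is a deliberate simplification."""
    rng = np.random.default_rng(seed)
    q, p = q0.copy(), p0.copy()
    n = len(q)
    for _ in range(n_steps):
        K = np.array([[kernel(q[i], q[j]) for j in range(n)] for i in range(n)])
        dq = K @ p * dt + noise * np.sqrt(dt) * rng.standard_normal(q.shape)
        q = q + dq
    return q

# Two 2D landmarks, both with upward momentum: they drift upward with
# kernel-coupled speeds, jittered by the stochastic perturbation.
q0 = np.array([[0.0, 0.0], [1.0, 0.0]])
p0 = np.array([[0.0, 1.0], [0.0, 1.0]])
q = euler_maruyama(q0, p0)
```

The parameter-inference procedures in the paper (moment matching against the Fokker-Planck equation, or EM with bridge sampling) would be fit to trajectories simulated in this fashion.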
A Lagrangian Gauss-Newton-Krylov Solver for Mass- and Intensity-Preserving Diffeomorphic Image Registration
Title | A Lagrangian Gauss-Newton-Krylov Solver for Mass- and Intensity-Preserving Diffeomorphic Image Registration |
Authors | Andreas Mang, Lars Ruthotto |
Abstract | We present an efficient solver for diffeomorphic image registration problems in the framework of Large Deformations Diffeomorphic Metric Mappings (LDDMM). We use an optimal control formulation, in which the velocity field of a hyperbolic PDE needs to be found such that the distance between the final state of the system (the transformed/transported template image) and the observation (the reference image) is minimized. Our solver supports both stationary and non-stationary (i.e., transient or time-dependent) velocity fields. As transformation models, we consider both the transport equation (assuming intensities are preserved during the deformation) and the continuity equation (assuming mass preservation). We consider the reduced form of the optimal control problem and solve the resulting unconstrained optimization problem using a discretize-then-optimize approach. A key contribution is the elimination of the PDE constraint using a Lagrangian hyperbolic PDE solver. Lagrangian methods rely on the concept of characteristic curves, which we approximate here using a fourth-order Runge-Kutta method. We also present an efficient algorithm for computing the derivatives of the final state of the system with respect to the velocity field. This allows us to use fast Gauss-Newton-based methods. We present quickly converging iterative linear solvers using spectral preconditioners that render the overall optimization efficient and scalable. Our method is embedded into the image registration framework FAIR and, thus, supports the most commonly used similarity measures and regularization functionals. We demonstrate the potential of our new approach using several synthetic and real-world test problems with up to 14.7 million degrees of freedom. |
Tasks | Image Registration |
Published | 2017-03-13 |
URL | http://arxiv.org/abs/1703.04446v2 |
PDF | http://arxiv.org/pdf/1703.04446v2.pdf |
PWC | https://paperswithcode.com/paper/a-lagrangian-gauss-newton-krylov-solver-for |
Repo | |
Framework | |
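The Lagrangian solver's core ingredient, integrating characteristic curves with a classical fourth-order Runge-Kutta method, can be sketched as follows; the rotational velocity field is a made-up test case, chosen because its characteristics are circles and the error is easy to check.

```python
import numpy as np

def rk4_step(f, x, t, h):
    """Classical fourth-order Runge-Kutta step for dx/dt = f(t, x)."""
    k1 = f(t, x)
    k2 = f(t + 0.5 * h, x + 0.5 * h * k1)
    k3 = f(t + 0.5 * h, x + 0.5 * h * k2)
    k4 = f(t + h, x + h * k3)
    return x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

# Characteristic curve of the transport equation under a stationary
# rotational velocity field v(x) = (-x2, x1): points move on circles.
def velocity(t, x):
    return np.array([-x[1], x[0]])

x = np.array([1.0, 0.0])
h = 0.01
for _ in range(int(round(2 * np.pi / h))):
    x = rk4_step(velocity, x, 0.0, h)
# After one full period the characteristic returns close to its start;
# the residual is dominated by the endpoint rounding, not the RK4 error.
```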