May 5, 2019

3161 words 15 mins read

Paper Group ANR 477

Virtual Rephotography: Novel View Prediction Error for 3D Reconstruction. Reasoning about Body-Parts Relations for Sign Language Recognition. Machine Comprehension Based on Learning to Rank. Dis-S2V: Discourse Informed Sen2Vec. Make Up Your Mind: The Price of Online Queries in Differential Privacy. The formal-logical characterisation of lies, decep …

Virtual Rephotography: Novel View Prediction Error for 3D Reconstruction

Title Virtual Rephotography: Novel View Prediction Error for 3D Reconstruction
Authors Michael Waechter, Mate Beljan, Simon Fuhrmann, Nils Moehrle, Johannes Kopf, Michael Goesele
Abstract The ultimate goal of many image-based modeling systems is to render photo-realistic novel views of a scene without visible artifacts. Existing evaluation metrics and benchmarks focus mainly on the geometric accuracy of the reconstructed model, which is, however, a poor predictor of visual accuracy. Furthermore, geometric accuracy by itself does not allow evaluating systems that either lack a geometric scene representation or utilize coarse proxy geometry. Examples include light field or image-based rendering systems. We propose a unified evaluation approach based on novel view prediction error that is able to analyze the visual quality of any method that can render novel views from input images. One of the key advantages of this approach is that it does not require ground truth geometry. This dramatically simplifies the creation of test datasets and benchmarks. It also allows us to evaluate the quality of an unknown scene during the acquisition and reconstruction process, which is useful for acquisition planning. We evaluate our approach on a range of methods including standard geometry-plus-texture pipelines as well as image-based rendering techniques, compare it to existing geometry-based benchmarks, and demonstrate its utility for a range of use cases.
Tasks 3D Reconstruction
Published 2016-01-26
URL http://arxiv.org/abs/1601.06950v1
PDF http://arxiv.org/pdf/1601.06950v1.pdf
PWC https://paperswithcode.com/paper/virtual-rephotography-novel-view-prediction
Repo
Framework
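The evaluation idea above can be summarized in a few lines: hold one input photograph out of the reconstruction, re-render the scene from that photograph's camera, and score the difference. A minimal sketch follows; `render_view` is a placeholder for whatever reconstruction or image-based rendering system is being evaluated, and the mean-absolute-difference metric is illustrative rather than the paper's exact error measure.

```python
import numpy as np

def novel_view_prediction_error(held_out_image, rendered_image, valid_mask):
    """Mean absolute per-pixel error between a held-out photograph and the
    view re-rendered from its camera pose, restricted to pixels the
    renderer actually covers."""
    diff = np.abs(held_out_image.astype(np.float32) -
                  rendered_image.astype(np.float32))
    return diff[valid_mask].mean()

# Usage sketch: leave one input photo out of the reconstruction, re-render
# the scene from that photo's calibrated camera (render_view is hypothetical),
# then score the prediction.
# rendered, mask = render_view(model, held_out_camera)
# score = novel_view_prediction_error(held_out_photo, rendered, mask)
```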

Reasoning about Body-Parts Relations for Sign Language Recognition

Title Reasoning about Body-Parts Relations for Sign Language Recognition
Authors Marc Martínez-Camarena, Jose Oramas, Mario Montagud-Climent, Tinne Tuytelaars
Abstract Over the years, hand gesture recognition has mostly been addressed by considering hand trajectories in isolation. However, in most sign languages, hand gestures are defined in a particular context (body region). We propose a pipeline for sign language recognition that models hand movements in the context of other body parts, captured in 3D with the MS Kinect sensor. In addition, we perform sign recognition based on the different hand postures that occur during a sign. Our experiments show that considering different body parts improves performance compared to methods that only consider global hand trajectories. Finally, we demonstrate that combining hand posture features with hand gesture features helps to improve the prediction of a given sign.
Tasks Gesture Recognition, Hand Gesture Recognition, Hand-Gesture Recognition, Sign Language Recognition
Published 2016-07-21
URL http://arxiv.org/abs/1607.06356v1
PDF http://arxiv.org/pdf/1607.06356v1.pdf
PWC https://paperswithcode.com/paper/reasoning-about-body-parts-relations-for-sign
Repo
Framework
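To make the body-relative idea concrete, here is a small sketch of encoding a hand position relative to other tracked joints from a Kinect skeleton, rather than as a raw trajectory. The joint names and feature layout are illustrative assumptions, not the paper's exact descriptor.

```python
import numpy as np

# Reference joints used to contextualize the hand position (illustrative).
REFERENCE_JOINTS = ["head", "shoulder_left", "shoulder_right", "torso", "hip_center"]

def body_relative_features(skeleton_frame):
    """skeleton_frame: dict mapping joint name -> np.array([x, y, z]) from a
    Kinect skeleton. Returns the right-hand offsets to each reference joint,
    concatenated into one per-frame feature vector."""
    hand = skeleton_frame["hand_right"]
    return np.concatenate([hand - skeleton_frame[j] for j in REFERENCE_JOINTS])
```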

Machine Comprehension Based on Learning to Rank

Title Machine Comprehension Based on Learning to Rank
Authors Tian Tian, Yuezhang Li
Abstract Machine comprehension plays an essential role in NLP and has been widely explored with datasets like MCTest. However, this dataset is too simple and too small for learning true reasoning abilities. Hermann et al. (2015) therefore released a large-scale news article dataset and proposed a deep LSTM reader system for machine comprehension. However, its training process is expensive. We therefore try a feature-engineered approach with semantics on the new dataset to see how traditional machine learning techniques and semantics can help with machine comprehension. Meanwhile, our proposed L2R reader system achieves good performance efficiently and with less training data.
Tasks Learning-To-Rank, Reading Comprehension
Published 2016-05-11
URL http://arxiv.org/abs/1605.03284v2
PDF http://arxiv.org/pdf/1605.03284v2.pdf
PWC https://paperswithcode.com/paper/machine-comprehension-based-on-learning-to
Repo
Framework
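As a toy illustration of the learning-to-rank setup (not the paper's actual feature set or model), candidate answers can be represented as hand-engineered feature vectors and a pairwise ranker trained on differences between correct and incorrect candidates:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic placeholder features: 4-dimensional vectors for correct answers
# and distractors. A real system would use lexical/semantic matching features.
rng = np.random.default_rng(0)
correct = rng.normal(1.0, 1.0, size=(50, 4))
incorrect = rng.normal(0.0, 1.0, size=(50, 4))

# Pairwise transform: learn to score (correct - incorrect) differences as positive.
X = np.vstack([correct - incorrect, incorrect - correct])
y = np.concatenate([np.ones(50), np.zeros(50)])
ranker = LogisticRegression().fit(X, y)

def rank_candidates(candidate_features):
    """Score candidate answers with the learned weights; higher is better."""
    return candidate_features @ ranker.coef_.ravel()
```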

Dis-S2V: Discourse Informed Sen2Vec

Title Dis-S2V: Discourse Informed Sen2Vec
Authors Tanay Kumar Saha, Shafiq Joty, Naeemul Hassan, Mohammad Al Hasan
Abstract Vector representation of sentences is important for many text processing tasks that involve clustering, classifying, or ranking sentences. Recently, distributed representations of sentences learned by neural models from unlabeled data have been shown to outperform the traditional bag-of-words representation. However, most of these learning methods consider only the content of a sentence and largely disregard the relations among sentences in a discourse. In this paper, we propose a series of novel models for learning latent representations of sentences (Sen2Vec) that consider the content of a sentence as well as inter-sentence relations. We first represent the inter-sentence relations with a language network and then use the network to induce contextual information into the content-based Sen2Vec models. Two different approaches are introduced to exploit the information in the network. Our first approach retrofits (already trained) Sen2Vec vectors with respect to the network in two different ways: (1) using the adjacency relations of a node, and (2) using a stochastic sampling method which is more flexible in sampling neighbors of a node. The second approach uses a regularizer to encode the information in the network into the existing Sen2Vec model. Experimental results show that our proposed models outperform existing methods in three fundamental information system tasks, demonstrating the effectiveness of our approach. The models leverage the computational power of multi-core CPUs to achieve fine-grained computational efficiency. We make our code publicly available upon acceptance.
Tasks
Published 2016-10-25
URL http://arxiv.org/abs/1610.08078v1
PDF http://arxiv.org/pdf/1610.08078v1.pdf
PWC https://paperswithcode.com/paper/dis-s2v-discourse-informed-sen2vec
Repo
Framework
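A minimal sketch of the first (adjacency-based) retrofitting idea: iteratively pull each sentence vector toward the average of its neighbors in the discourse network while keeping it close to its original content-based embedding. The update rule and hyperparameters below are the standard retrofitting recipe, used here only as an illustration of the approach, not the paper's exact formulation.

```python
import numpy as np

def retrofit(vectors, neighbors, alpha=1.0, beta=1.0, iterations=10):
    """vectors:   dict sentence_id -> original Sen2Vec embedding (np.ndarray)
    neighbors: dict sentence_id -> list of adjacent sentence_ids in the
               language/discourse network
    Returns retrofitted vectors that balance content (alpha) and context (beta)."""
    new_vecs = {s: v.copy() for s, v in vectors.items()}
    for _ in range(iterations):
        for s, nbrs in neighbors.items():
            if not nbrs:
                continue
            neighbor_sum = sum(new_vecs[n] for n in nbrs)
            new_vecs[s] = (alpha * vectors[s] + beta * neighbor_sum) / \
                          (alpha + beta * len(nbrs))
    return new_vecs
```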

Make Up Your Mind: The Price of Online Queries in Differential Privacy

Title Make Up Your Mind: The Price of Online Queries in Differential Privacy
Authors Mark Bun, Thomas Steinke, Jonathan Ullman
Abstract We consider the problem of answering queries about a sensitive dataset subject to differential privacy. The queries may be chosen adversarially from a larger set Q of allowable queries in one of three ways, which we list in order from easiest to hardest to answer: Offline: The queries are chosen all at once and the differentially private mechanism answers the queries in a single batch. Online: The queries are chosen all at once, but the mechanism only receives the queries in a streaming fashion and must answer each query before seeing the next query. Adaptive: The queries are chosen one at a time and the mechanism must answer each query before the next query is chosen. In particular, each query may depend on the answers given to previous queries. Many differentially private mechanisms are just as efficient in the adaptive model as they are in the offline model. Meanwhile, most lower bounds for differential privacy hold in the offline setting. This suggests that the three models may be equivalent. We prove that these models are all, in fact, distinct. Specifically, we show that there is a family of statistical queries such that exponentially more queries from this family can be answered in the offline model than in the online model. We also exhibit a family of search queries such that exponentially more queries from this family can be answered in the online model than in the adaptive model. We also investigate whether such separations might hold for simple queries like threshold queries over the real line.
Tasks
Published 2016-04-15
URL http://arxiv.org/abs/1604.04618v1
PDF http://arxiv.org/pdf/1604.04618v1.pdf
PWC https://paperswithcode.com/paper/make-up-your-mind-the-price-of-online-queries
Repo
Framework
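For readers unfamiliar with the online/adaptive setting, the sketch below shows the textbook Laplace mechanism answering counting queries one at a time, so that each new query may depend on earlier noisy answers. This is the standard mechanism only, not the separating query families constructed in the paper.

```python
import numpy as np

class OnlineLaplaceMechanism:
    """Answers counting queries in a streaming/adaptive fashion.
    Each query is a predicate over records; its sensitivity is 1/n,
    so Laplace noise with scale 1/(eps * n) gives eps-DP per query."""
    def __init__(self, data, epsilon_per_query):
        self.data = data
        self.eps = epsilon_per_query

    def answer(self, predicate):
        true_value = np.mean([predicate(x) for x in self.data])
        noise = np.random.laplace(scale=1.0 / (self.eps * len(self.data)))
        return true_value + noise

# mech = OnlineLaplaceMechanism(dataset, epsilon_per_query=0.1)
# ans = mech.answer(lambda x: x > 30)   # the next query may depend on `ans`
```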

The formal-logical characterisation of lies, deception, and associated notions

Title The formal-logical characterisation of lies, deception, and associated notions
Authors Toni Heidenreich
Abstract Defining various dishonest notions in a formal way is a key step to enable intelligent agents to act in untrustworthy environments. This review evaluates the literature for this topic by looking at formal definitions based on modal logic as well as other formal approaches. Criteria from philosophical groundwork are used to assess the definitions for correctness and completeness. The key contribution of this review is to show that only a few definitions fully comply with this gold standard and to point out the missing steps towards a successful application of these definitions in an actual agent environment.
Tasks
Published 2016-12-28
URL http://arxiv.org/abs/1612.08845v1
PDF http://arxiv.org/pdf/1612.08845v1.pdf
PWC https://paperswithcode.com/paper/the-formal-logical-characterisation-of-lies
Repo
Framework
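As an illustration of the kind of modal-logic definition the review assesses, one commonly used formalisation of lying says that agent i lies to agent j about φ when i asserts φ while believing it to be false, intending j to come to believe it. This is only one variant from the literature, not necessarily the review's gold standard:

```latex
\mathrm{Lie}_{i \to j}(\varphi) \;\equiv\;
  B_i \neg\varphi \;\wedge\; \mathrm{Assert}_{i \to j}(\varphi) \;\wedge\; I_i\, B_j \varphi
```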

Shallow Networks for High-Accuracy Road Object-Detection

Title Shallow Networks for High-Accuracy Road Object-Detection
Authors Khalid Ashraf, Bichen Wu, Forrest N. Iandola, Matthew W. Moskewicz, Kurt Keutzer
Abstract The ability to automatically detect other vehicles on the road is vital to the safety of partially- and fully-autonomous vehicles. Most of the high-accuracy techniques for this task are based on R-CNN or one of its faster variants. In the research community, much emphasis has been placed on using 3D vision or complex R-CNN variants to achieve higher accuracy. However, are there more straightforward modifications that could deliver higher accuracy? Yes. We show that increasing input image resolution (i.e. upsampling) offers up to 12 percentage points higher accuracy compared to an off-the-shelf baseline. We also find situations where earlier/shallower layers of a CNN provide higher accuracy than later/deeper layers. We further show that shallow models and upsampled images yield competitive accuracy. Our findings contrast with the current trend towards deeper and larger models to achieve high accuracy in domain-specific detection tasks.
Tasks Autonomous Vehicles, Object Detection
Published 2016-06-05
URL http://arxiv.org/abs/1606.01561v1
PDF http://arxiv.org/pdf/1606.01561v1.pdf
PWC https://paperswithcode.com/paper/shallow-networks-for-high-accuracy-road
Repo
Framework
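The central modification is simple enough to sketch: enlarge the input image before running an off-the-shelf detector, then map the detections back to the original coordinates. The `detector` callable and its box format are assumptions for illustration.

```python
import cv2

def detect_with_upsampling(image, detector, scale=2.0):
    """Run a detector on an upsampled copy of the image; `detector` is assumed
    to return (x1, y1, x2, y2) boxes in the coordinates of the image it sees."""
    upsampled = cv2.resize(image, None, fx=scale, fy=scale,
                           interpolation=cv2.INTER_LINEAR)
    boxes = detector(upsampled)
    # Map detections back to the original image coordinates.
    return [[x1 / scale, y1 / scale, x2 / scale, y2 / scale]
            for (x1, y1, x2, y2) in boxes]
```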

Random Forests versus Neural Networks - What’s Best for Camera Localization?

Title Random Forests versus Neural Networks - What’s Best for Camera Localization?
Authors Daniela Massiceti, Alexander Krull, Eric Brachmann, Carsten Rother, Philip H. S. Torr
Abstract This work addresses the task of camera localization in a known 3D scene given a single input RGB image. State-of-the-art approaches accomplish this in two steps: first, regressing a 3D scene coordinate for every pixel in the image and, subsequently, using these coordinates to estimate the final 6D camera pose via RANSAC. To solve the first step, Random Forests (RFs) are typically used. On the other hand, Neural Networks (NNs) reign in many dense regression tasks, but are not test-time efficient. We ask the question: which of the two is best for camera localization? To address this, we make two method contributions: (1) a test-time efficient NN architecture which we term a ForestNet that is derived and initialized from a RF, and (2) a new fully-differentiable robust averaging technique for regression ensembles which can be trained end-to-end with a NN. Our experimental findings show that for scene coordinate regression, traditional NN architectures are superior to test-time efficient RFs and ForestNets; however, this does not translate to final 6D camera pose accuracy, where RFs and ForestNets perform slightly better. To summarize, our best method, a ForestNet with a robust average, which has an equivalent fast and lightweight RF, improves over the state-of-the-art for camera localization on the 7-Scenes dataset. While this work focuses on scene coordinate regression for camera localization, our innovations may also be applied to other continuous regression tasks.
Tasks Camera Localization
Published 2016-09-19
URL http://arxiv.org/abs/1609.05797v3
PDF http://arxiv.org/pdf/1609.05797v3.pdf
PWC https://paperswithcode.com/paper/random-forests-versus-neural-networks-whats
Repo
Framework
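The two-step pipeline the paper builds on can be sketched directly with OpenCV: a regressor (RF, ForestNet, or NN, here a placeholder callable) predicts a 3D scene coordinate per pixel, and PnP inside a RANSAC loop recovers the 6D pose. The basic RANSAC-PnP step is shown rather than the paper's differentiable robust averaging.

```python
import numpy as np
import cv2

def localize(rgb_image, coord_regressor, camera_matrix):
    """coord_regressor is assumed to return an (H, W, 3) array of predicted
    scene coordinates for the given image."""
    scene_coords = coord_regressor(rgb_image)            # (H, W, 3)
    h, w = scene_coords.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    image_points = np.stack([xs, ys], axis=-1).reshape(-1, 2).astype(np.float32)
    object_points = scene_coords.reshape(-1, 3).astype(np.float32)
    # Robustly fit the 6D pose from 2D-3D correspondences.
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        object_points, image_points, camera_matrix, None)
    return rvec, tvec   # rotation (Rodrigues vector) and translation
```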

Mapping and Localization from Planar Markers

Title Mapping and Localization from Planar Markers
Authors Rafael Muñoz-Salinas, Manuel J. Marín-Jimenez, Enrique Yeguas-Bolivar, Rafael Medina-Carnicer
Abstract Squared planar markers are a popular tool for fast, accurate and robust camera localization, but their use is frequently limited to a single marker or, at most, to a small set of them whose relative poses are known beforehand. Mapping and localization from a large set of planar markers remains a scarcely treated problem, with keypoint-based approaches being favoured instead. However, while keypoint detectors are not robust to rapid motion, large changes in viewpoint, or significant changes in appearance, fiducial markers can be robustly detected under a wider range of conditions. This paper proposes a novel method to simultaneously solve the problems of mapping and localization from a set of squared planar markers. First, a quiver of pairwise relative marker poses is created, from which an initial pose graph is obtained. The pose graph may contain small pairwise pose errors that, when propagated, lead to large errors. Thus, we distribute the rotational and translational error along the basis cycles of the graph so as to obtain a corrected pose graph. Finally, we perform a global pose optimization by minimizing the reprojection errors of the planar markers in all observed frames. The experiments conducted show that our method performs better than Structure from Motion and visual SLAM techniques.
Tasks Camera Localization
Published 2016-06-01
URL http://arxiv.org/abs/1606.00151v2
PDF http://arxiv.org/pdf/1606.00151v2.pdf
PWC https://paperswithcode.com/paper/mapping-and-localization-from-planar-markers
Repo
Framework
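The front end such a system needs is per-frame detection of squared planar (ArUco) markers and estimation of each marker's pose relative to the camera; these pairwise camera-marker poses are what get chained into the pose graph and refined globally. The sketch below assumes the opencv-contrib ArUco module with its pre-4.7 function-style API; the pose-graph correction and global optimization steps are not shown.

```python
import cv2

def detect_marker_poses(image, camera_matrix, dist_coeffs, marker_length=0.05):
    """Return {marker_id: (rvec, tvec)} for all ArUco markers seen in the frame."""
    dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_6X6_250)
    corners, ids, _rejected = cv2.aruco.detectMarkers(image, dictionary)
    if ids is None:
        return {}
    rvecs, tvecs, _ = cv2.aruco.estimatePoseSingleMarkers(
        corners, marker_length, camera_matrix, dist_coeffs)
    return {int(i): (r, t) for i, r, t in zip(ids.ravel(), rvecs, tvecs)}
```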

Domain Adaptation For Formant Estimation Using Deep Learning

Title Domain Adaptation For Formant Estimation Using Deep Learning
Authors Yehoshua Dissen, Joseph Keshet, Jacob Goldberger, Cynthia Clopper
Abstract In this paper we present a domain adaptation technique for formant estimation using a deep network. We first train a deep learning network on a small read speech dataset. We then freeze the parameters of the trained network and use several different datasets to train an adaptation layer that makes the obtained network universal in the sense that it works well for a variety of speakers and speech domains with very different characteristics. We evaluated our adapted network on three datasets, each of which has different speaker characteristics and speech styles. The performance of our method compares favorably with alternative methods for formant estimation.
Tasks Domain Adaptation
Published 2016-11-06
URL http://arxiv.org/abs/1611.01783v1
PDF http://arxiv.org/pdf/1611.01783v1.pdf
PWC https://paperswithcode.com/paper/domain-adaptation-for-formant-estimation
Repo
Framework
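The adaptation recipe described above is easy to sketch in PyTorch: freeze the base network trained on read speech and train only a small adaptation layer on the new domains. Layer sizes, the input feature dimension, and the four-formant output are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

base = nn.Sequential(nn.Linear(350, 512), nn.ReLU(),
                     nn.Linear(512, 512), nn.ReLU(),
                     nn.Linear(512, 4))            # e.g. F1-F4 estimates
for p in base.parameters():
    p.requires_grad = False                        # freeze the trained network

adaptation = nn.Linear(4, 4)                       # trainable adaptation layer
model = nn.Sequential(base, adaptation)
optimizer = torch.optim.Adam(adaptation.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def adapt_step(features, target_formants):
    """One training step on adaptation-domain data; only `adaptation` updates."""
    optimizer.zero_grad()
    loss = loss_fn(model(features), target_formants)
    loss.backward()
    optimizer.step()
    return loss.item()
```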

Aboveground biomass mapping in French Guiana by combining remote sensing, forest inventories and environmental data

Title Aboveground biomass mapping in French Guiana by combining remote sensing, forest inventories and environmental data
Authors Ibrahim Fayad, Nicolas Baghdadi, Stéphane Guitet, Jean-Stéphane Bailly, Bruno Hérault, Valéry Gond, Mahmoud Hajj, Dinh Ho Tong Minh
Abstract Mapping forest aboveground biomass (AGB) has become an important task, particularly for the reporting of carbon stocks and changes. AGB can be mapped using synthetic aperture radar (SAR) or passive optical data. However, these data are insensitive to high AGB levels (>150 Mg/ha, and >300 Mg/ha for P-band), which are commonly found in tropical forests. Studies have mapped the rough variations in AGB by combining optical and environmental data at regional and global scales. Nevertheless, these maps cannot represent local variations in AGB in tropical forests. In this paper, we hypothesize that the difficulty of representing local AGB variations and of estimating AGB with good precision arises both from methodological limits (signal saturation or dilution bias) and from a lack of adequate calibration data in this range of AGB values. We test this hypothesis by developing a calibrated regression model to predict variations in high AGB values (mean >300 Mg/ha) in French Guiana, using a spatial inter- and extrapolation approach that combines data from the optical Geoscience Laser Altimeter System (GLAS), forest inventories, radar, optics, and environmental variables. Given their higher point count, GLAS data allow a wider coverage of AGB values. We find that the metrics from GLAS footprints are correlated with field AGB estimations (R² = 0.54, RMSE = 48.3 Mg/ha) with no bias for high values. Predictive models, including remote-sensing and environmental variables and spatial correlation functions, allow us to obtain “wall-to-wall” AGB maps over French Guiana with an RMSE of ~51 Mg/ha against the in situ AGB estimates and R² = 0.48 at a 1-km grid size. We conclude that a calibrated regression model based on GLAS with dependent environmental data can produce good AGB predictions, even for high AGB values, if the calibration data fit the AGB range. We also demonstrate that small temporal and spatial mismatches between field data and GLAS footprints are not a problem for regional and global calibrated regression models, because the field data aim to capture the large, deep tendencies in AGB variation driven by environmental gradients rather than the high but stochastic and temporally limited variations due to forest dynamics. Thus, we advocate including a greater variety of data, even if less precise and shifted, to better represent high AGB values in global models and to improve the fitting of these models for high values.
Tasks Calibration
Published 2016-10-14
URL http://arxiv.org/abs/1610.04371v1
PDF http://arxiv.org/pdf/1610.04371v1.pdf
PWC https://paperswithcode.com/paper/aboveground-biomass-mapping-in-french-guiana
Repo
Framework
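The calibration step amounts to regressing field-plot AGB against GLAS metrics and environmental covariates and reporting fit statistics like those quoted in the abstract (R², RMSE in Mg/ha). The sketch below uses a plain linear model as a stand-in; the feature columns and model family are placeholders, not the paper's actual predictors.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

def calibrate_agb_model(X, field_agb):
    """X: (n_plots, n_features) GLAS metrics + environmental covariates.
    field_agb: (n_plots,) AGB estimated from forest inventories, in Mg/ha."""
    model = LinearRegression().fit(X, field_agb)
    pred = model.predict(X)
    rmse = np.sqrt(mean_squared_error(field_agb, pred))
    return model, r2_score(field_agb, pred), rmse

# model, r2, rmse = calibrate_agb_model(glas_and_env_features, plot_agb)
```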

Clinical Tagging with Joint Probabilistic Models

Title Clinical Tagging with Joint Probabilistic Models
Authors Yoni Halpern, Steven Horng, David Sontag
Abstract We describe a method for parameter estimation in bipartite probabilistic graphical models for joint prediction of clinical conditions from the electronic medical record. The method does not rely on the availability of gold-standard labels, but rather uses noisy labels, called anchors, for learning. We provide a likelihood-based objective and a moments-based initialization that are effective at learning the model parameters. The learned model is evaluated in a task of assigning a heldout clinical condition to patients based on retrospective analysis of the records, and outperforms baselines which do not account for the noisiness in the labels or do not model the conditions jointly.
Tasks
Published 2016-08-02
URL http://arxiv.org/abs/1608.00686v3
PDF http://arxiv.org/pdf/1608.00686v3.pdf
PWC https://paperswithcode.com/paper/clinical-tagging-with-joint-probabilistic
Repo
Framework
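A stripped-down illustration of learning from "anchor" observations rather than gold labels: for each condition, a highly specific observation (e.g. a medication or code) serves as a noisy positive label, and a classifier is trained on the remaining features. Only the anchor idea is shown; the paper's joint bipartite model, likelihood objective, and moments-based initialization are not.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_anchor_classifier(features, anchor_column):
    """features: (n_patients, n_features) binary matrix from the medical record.
    anchor_column: index of the anchor observation used as a noisy label."""
    noisy_labels = features[:, anchor_column] > 0
    X = np.delete(features, anchor_column, axis=1)   # hide the anchor itself
    return LogisticRegression(max_iter=1000).fit(X, noisy_labels)
```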

Directional Mean Curvature for Textured Image Demixing

Title Directional Mean Curvature for Textured Image Demixing
Authors Duy Hoang Thai, David Banks
Abstract Approximation theory plays an important role in image processing, especially image deconvolution and decomposition. For piecewise smooth images, there are many methods that have been developed over the past thirty years. The goal of this study is to devise similar and practical methodology for handling textured images. This problem is motivated by forensic imaging, since fingerprints, shoeprints and bullet ballistic evidence are textured images. In particular, it is known that texture information is almost destroyed by a blur operator, such as a blurred ballistic image captured from a low-cost microscope. The contribution of this work is twofold: first, we propose a mathematical model for textured image deconvolution and decomposition into four meaningful components, using a high-order partial differential equation approach based on the directional mean curvature. Second, we uncover a link between functional analysis and multiscale sampling theory, e.g., harmonic analysis and filter banks. Both theoretical results and examples with natural images are provided to illustrate the performance of the proposed model.
Tasks Image Deconvolution
Published 2016-11-25
URL http://arxiv.org/abs/1611.08625v1
PDF http://arxiv.org/pdf/1611.08625v1.pdf
PWC https://paperswithcode.com/paper/directional-mean-curvature-for-textured-image
Repo
Framework
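As background for the model above, the basic quantity it generalizes is the mean curvature of the image surface, κ = div(∇u/|∇u|). The sketch below computes only this standard (non-directional) curvature with finite differences, as an orientation aid rather than the paper's directional operator.

```python
import numpy as np

def mean_curvature(u, eps=1e-8):
    """Discrete mean curvature kappa = div(grad u / |grad u|) of image u."""
    uy, ux = np.gradient(u.astype(np.float64))   # derivatives along rows, cols
    norm = np.sqrt(ux**2 + uy**2) + eps          # regularized gradient magnitude
    nyy, _ = np.gradient(uy / norm)              # d/dy of the y-component
    _, nxx = np.gradient(ux / norm)              # d/dx of the x-component
    return nxx + nyy
```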

Predicting Process Behaviour using Deep Learning

Title Predicting Process Behaviour using Deep Learning
Authors Joerg Evermann, Jana-Rebecca Rehse, Peter Fettke
Abstract Predicting business process behaviour is an important aspect of business process management. Motivated by research in natural language processing, this paper describes an application of deep learning with recurrent neural networks to the problem of predicting the next event in a business process. This is both a novel method in process prediction, which has largely relied on explicit process models, and also a novel application of deep learning methods. The approach is evaluated on two real datasets and our results surpass the state-of-the-art in prediction precision.
Tasks
Published 2016-12-14
URL http://arxiv.org/abs/1612.04600v2
PDF http://arxiv.org/pdf/1612.04600v2.pdf
PWC https://paperswithcode.com/paper/predicting-process-behaviour-using-deep
Repo
Framework
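The core model is a next-event classifier over event logs; a minimal PyTorch sketch in that spirit embeds event types, runs them through an LSTM, and predicts the type of the next event in the trace. Hyperparameters are illustrative, and the paper's additional details (e.g. attribute inputs, suffix prediction) are omitted.

```python
import torch
import torch.nn as nn

class NextEventLSTM(nn.Module):
    def __init__(self, num_event_types, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(num_event_types, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, num_event_types)

    def forward(self, event_ids):                # (batch, trace_prefix_length)
        h, _ = self.lstm(self.embed(event_ids))
        return self.out(h[:, -1])                # logits for the next event type

# model = NextEventLSTM(num_event_types=24)
# logits = model(torch.tensor([[3, 7, 7, 1]]))   # one prefix of a trace
```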

Query Expansion with Locally-Trained Word Embeddings

Title Query Expansion with Locally-Trained Word Embeddings
Authors Fernando Diaz, Bhaskar Mitra, Nick Craswell
Abstract Continuous space word embeddings have received a great deal of attention in the natural language processing and machine learning communities for their ability to model term similarity and other relationships. We study the use of term relatedness in the context of query expansion for ad hoc information retrieval. We demonstrate that word embeddings such as word2vec and GloVe, when trained globally, underperform corpus and query specific embeddings for retrieval tasks. These results suggest that other tasks benefiting from global embeddings may also benefit from local embeddings.
Tasks Ad-Hoc Information Retrieval, Information Retrieval, Word Embeddings
Published 2016-05-25
URL http://arxiv.org/abs/1605.07891v2
PDF http://arxiv.org/pdf/1605.07891v2.pdf
PWC https://paperswithcode.com/paper/query-expansion-with-locally-trained-word
Repo
Framework
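The locally-trained idea can be sketched with gensim (gensim ≥ 4 API assumed): train a small word2vec model only on the top-ranked documents for a query, then expand the query with that model's nearest neighbours. The retrieval step, tokenization, and parameters are placeholders, not the paper's exact setup.

```python
from gensim.models import Word2Vec

def expand_query(query_terms, top_documents, topn=5):
    """query_terms: list of query tokens; top_documents: texts of the
    top-ranked documents from an initial retrieval pass."""
    tokenized = [doc.lower().split() for doc in top_documents]
    local = Word2Vec(sentences=tokenized, vector_size=100, window=5,
                     min_count=1, epochs=20)      # embeddings local to the query
    expansion = []
    for term in query_terms:
        if term in local.wv:
            expansion += [w for w, _ in local.wv.most_similar(term, topn=topn)]
    return list(query_terms) + expansion
```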