July 28, 2019

Paper Group ANR 459

Tensor Regression Meets Gaussian Processes

Title Tensor Regression Meets Gaussian Processes
Authors Rose Yu, Guangyu Li, Yan Liu
Abstract Low-rank tensor regression, a new model class that learns high-order correlation from data, has recently received considerable attention. At the same time, Gaussian processes (GP) are well-studied machine learning models for structure learning. In this paper, we demonstrate interesting connections between the two, especially for multi-way data analysis. We show that low-rank tensor regression is essentially learning a multi-linear kernel in Gaussian processes, and the low-rank assumption translates to the constrained Bayesian inference problem. We prove the oracle inequality and derive the average case learning curve for the equivalent GP model. Our finding implies that low-rank tensor regression, though empirically successful, is highly dependent on the eigenvalues of covariance functions as well as variable correlations.
Tasks Bayesian Inference, Gaussian Processes
Published 2017-10-31
URL http://arxiv.org/abs/1710.11345v1
PDF http://arxiv.org/pdf/1710.11345v1.pdf
PWC https://paperswithcode.com/paper/tensor-regression-meets-gaussian-processes
Repo
Framework

Local Jet Pattern: A Robust Descriptor for Texture Classification

Title Local Jet Pattern: A Robust Descriptor for Texture Classification
Authors Swalpa Kumar Roy, Bhabatosh Chanda, Bidyut B. Chaudhuri, Dipak Kumar Ghosh, Shiv Ram Dubey
Abstract Methods based on local image features have recently shown promise for texture classification tasks, especially in the presence of large intra-class variation due to illumination, scale, and viewpoint changes. Inspired by the theories of image structure analysis, this paper presents a simple, efficient, yet robust descriptor, namely the local jet pattern (LJP), for texture classification. In this approach, a jet space representation of a texture image is derived from a set of derivatives of Gaussian (DtGs) filter responses up to second order, the so-called local jet vectors (LJV), which also satisfy the scale-space properties. The LJP is obtained by utilizing the relationship of the center pixel with the local neighborhood information in jet space. Finally, the feature vector of a texture region is formed by concatenating the histograms of LJP for all elements of the LJV. Together, all DtGs responses up to second order preserve the intrinsic local image structure and achieve invariance to scale, rotation, and reflection. This allows us to develop a texture classification framework which is discriminative and robust. In extensive experiments on five standard texture image databases, employing the nearest subspace classifier (NSC), the proposed descriptor achieves 100%, 99.92%, 99.75%, 99.16%, and 99.65% accuracy on Outex_TC-00010 (Outex_TC10), Outex_TC-00012 (Outex_TC12), KTH-TIPS, Brodatz, and CUReT, respectively, outperforming the state-of-the-art methods.
Tasks Texture Classification
Published 2017-11-26
URL http://arxiv.org/abs/1711.10921v3
PDF http://arxiv.org/pdf/1711.10921v3.pdf
PWC https://paperswithcode.com/paper/local-jet-pattern-a-robust-descriptor-for
Repo
Framework

Using experimental game theory to transit human values to ethical AI

Title Using experimental game theory to transit human values to ethical AI
Authors Yijia Wang, Yan Wan, Zhijian Wang
Abstract Recognizing the connection between game theory and ethics, we develop a mathematical representation to bridge the gap between concepts in moral philosophy (e.g., Kantian and Utilitarian) and AI ethics industry technology standards (e.g., the IEEE P7000 standard series for Ethical AI). As an application, we demonstrate how human values can be obtained from experimental game theory (e.g., the trust game experiment) so as to build an ethical AI. Moreover, an approach to test the ethics (rightness or wrongness) of a given AI algorithm by using an iterated Prisoner’s Dilemma Game experiment is discussed as an example. The advantages of the proposed approach are analyzed in comparison with existing mathematical frameworks and testing methods for AI ethics technology.
Tasks
Published 2017-11-16
URL http://arxiv.org/abs/1711.05905v1
PDF http://arxiv.org/pdf/1711.05905v1.pdf
PWC https://paperswithcode.com/paper/using-experimental-game-theory-to-transit
Repo
Framework

Non-FPT lower bounds for structural restrictions of decision DNNF

Title Non-FPT lower bounds for structural restrictions of decision DNNF
Authors Andrea Calì, Florent Capelli, Igor Razgon
Abstract We give a non-FPT lower bound on the size of structured decision DNNF and OBDD with decomposable AND-nodes representing CNF-formulas of bounded incidence treewidth. Both models are known to be of FPT size for CNFs of bounded primal treewidth. To the best of our knowledge this is the first parameterized separation of primal treewidth and incidence treewidth for knowledge compilation models.
Tasks
Published 2017-08-25
URL http://arxiv.org/abs/1708.07767v1
PDF http://arxiv.org/pdf/1708.07767v1.pdf
PWC https://paperswithcode.com/paper/non-fpt-lower-bounds-for-structural
Repo
Framework

UnDeepVO: Monocular Visual Odometry through Unsupervised Deep Learning

Title UnDeepVO: Monocular Visual Odometry through Unsupervised Deep Learning
Authors Ruihao Li, Sen Wang, Zhiqiang Long, Dongbing Gu
Abstract We propose a novel monocular visual odometry (VO) system called UnDeepVO in this paper. UnDeepVO is able to estimate the 6-DoF pose of a monocular camera and the depth of its view by using deep neural networks. There are two salient features of the proposed UnDeepVO: one is the unsupervised deep learning scheme, and the other is the absolute scale recovery. Specifically, we train UnDeepVO by using stereo image pairs to recover the scale but test it by using consecutive monocular images. Thus, UnDeepVO is a monocular system. The loss function defined for training the networks is based on spatial and temporal dense information. A system overview is shown in Fig. 1. Experiments on the KITTI dataset show that UnDeepVO achieves good performance in terms of pose accuracy.
Tasks Monocular Visual Odometry, Visual Odometry
Published 2017-09-20
URL http://arxiv.org/abs/1709.06841v2
PDF http://arxiv.org/pdf/1709.06841v2.pdf
PWC https://paperswithcode.com/paper/undeepvo-monocular-visual-odometry-through
Repo
Framework

Monocular Visual Odometry for an Unmanned Sea-Surface Vehicle

Title Monocular Visual Odometry for an Unmanned Sea-Surface Vehicle
Authors George Terzakis, Riccardo Polvara, Sanjay Sharma, Phil Culverhouse, Robert Sutton
Abstract We tackle the problem of localizing an autonomous sea-surface vehicle in river estuarine areas using a monocular camera and angular velocity input from an inertial sensor. Our method is challenged by two prominent drawbacks associated with the environment, which are typically not present in standard visual simultaneous localization and mapping (SLAM) applications on land (or air): (a) scene depth varies significantly (from a few meters to several kilometers), and (b) in conjunction with this, there exists no ground plane to provide features with enough disparity on which to reliably detect motion. To that end, we use the IMU orientation feedback in order to re-cast the problem of visual localization without the mapping component, although the map can be implicitly obtained from the camera pose estimates. We find that our method produces reliable odometry estimates for trajectories several hundred meters long in the water. To compare the visual odometry estimates with GPS-based ground truth, we interpolate the trajectory with splines on a common parameter and obtain the position error in meters by recovering an optimal affine transformation between the two splines.
Tasks Monocular Visual Odometry, Simultaneous Localization and Mapping, Visual Localization, Visual Odometry
Published 2017-07-14
URL http://arxiv.org/abs/1707.04444v2
PDF http://arxiv.org/pdf/1707.04444v2.pdf
PWC https://paperswithcode.com/paper/monocular-visual-odometry-for-an-unmanned-sea
Repo
Framework

Fully Automatic Segmentation and Objective Assessment of Atrial Scars for Longstanding Persistent Atrial Fibrillation Patients Using Late Gadolinium-Enhanced MRI

Title Fully Automatic Segmentation and Objective Assessment of Atrial Scars for Longstanding Persistent Atrial Fibrillation Patients Using Late Gadolinium-Enhanced MRI
Authors Guang Yang, Xiahai Zhuang, Habib Khan, Shouvik Haldar, Eva Nyktari, Lei Li, Rick Wage, Xujiong Ye, Greg Slabaugh, Raad Mohiaddin, Tom Wong, Jennifer Keegan, David Firmin
Abstract Purpose: Atrial fibrillation (AF) is the most common cardiac arrhythmia and is correlated with increased morbidity and mortality. It is associated with atrial fibrosis, which may be assessed non-invasively using late gadolinium-enhanced (LGE) magnetic resonance imaging (MRI) where scar tissue is visualised as a region of signal enhancement. In this study, we proposed a novel fully automatic pipeline to achieve an accurate and objective atrial scarring segmentation and assessment of LGE MRI scans for AF patients. Methods: Our fully automatic pipeline uniquely combined: (1) a multi-atlas based whole heart segmentation (MA-WHS) to determine the cardiac anatomy from an MRI Roadmap acquisition which is then mapped to LGE MRI, and (2) a super-pixel and supervised learning based approach to delineate the distribution and extent of atrial scarring in LGE MRI. Results: Both our MA-WHS and atrial scarring segmentation showed accurate delineations of cardiac anatomy (mean Dice = 89%) and atrial scarring (mean Dice = 79%), respectively, compared to the established ground truth from manual segmentation. Compared with previously studied methods with manual interventions, our innovative pipeline demonstrated comparable results, but was computed fully automatically. Conclusion: The proposed segmentation methods allow LGE MRI to be used as an objective assessment tool for localisation, visualisation and quantification of atrial scarring.
Tasks
Published 2017-05-26
URL http://arxiv.org/abs/1705.09529v1
PDF http://arxiv.org/pdf/1705.09529v1.pdf
PWC https://paperswithcode.com/paper/fully-automatic-segmentation-and-objective
Repo
Framework
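The Dice scores quoted in the abstract above measure the overlap between an automatic segmentation and the manual ground truth. A minimal sketch of that metric follows, assuming binary masks flattened into plain lists; the function name `dice_coefficient` and the toy masks are illustrative and do not come from the paper.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |A intersect B| / (|A| + |B|) for two binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of elements")
    # Count voxels labelled positive in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: 4 predicted positives, 4 true positives, 3 in common.
pred = [0, 1, 1, 1, 0, 0, 1, 0]
truth = [0, 1, 1, 0, 0, 0, 1, 1]
print(dice_coefficient(pred, truth))  # 0.75
```

A score of 1.0 means the masks coincide exactly and 0.0 means they share no positive voxel, so the reported 89% and 79% correspond to values of 0.89 and 0.79 on this scale.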

A Data-Driven Sparse-Learning Approach to Model Reduction in Chemical Reaction Networks

Title A Data-Driven Sparse-Learning Approach to Model Reduction in Chemical Reaction Networks
Authors Farshad Harirchi, Omar A. Khalil, Sijia Liu, Paolo Elvati, Angela Violi, Alfred O. Hero
Abstract In this paper, we propose an optimization-based sparse learning approach to identify the set of most influential reactions in a chemical reaction network. This reduced set of reactions is then employed to construct a reduced chemical reaction mechanism, which is relevant to chemical interaction network modeling. The problem of identifying influential reactions is first formulated as a mixed-integer quadratic program, and then a relaxation method is leveraged to reduce the computational complexity of our approach. Qualitative and quantitative validation of the sparse encoding approach demonstrates that the model captures important network structural properties with moderate computational load.
Tasks Sparse Learning
Published 2017-12-12
URL http://arxiv.org/abs/1712.04493v1
PDF http://arxiv.org/pdf/1712.04493v1.pdf
PWC https://paperswithcode.com/paper/a-data-driven-sparse-learning-approach-to
Repo
Framework

ME R-CNN: Multi-Expert R-CNN for Object Detection

Title ME R-CNN: Multi-Expert R-CNN for Object Detection
Authors Hyungtae Lee, Sungmin Eum, Heesung Kwon
Abstract We introduce Multi-Expert Region-based CNN (ME R-CNN), which is equipped with multiple experts and built on top of the R-CNN framework, known to be one of the state-of-the-art object detection methods. ME R-CNN focuses on better capturing the appearance variations caused by different shapes, poses, and viewing angles. The proposed approach consists of three experts, each responsible for objects with particular shapes: horizontally elongated, square-like, and vertically elongated. On top of using selective search, which provides a compact yet effective set of regions of interest (RoIs) for object detection, we augment the set by also employing exhaustive search for training only. Incorporating the exhaustive search provides complementary advantages: i) it captures the multitude of neighboring RoIs missed by the selective search, and thus ii) provides a significantly larger number of training examples. We show that the ME R-CNN architecture provides a considerable performance increase over the baselines on the PASCAL VOC 07, 12, and MS COCO datasets.
Tasks Object Detection
Published 2017-04-04
URL http://arxiv.org/abs/1704.01069v2
PDF http://arxiv.org/pdf/1704.01069v2.pdf
PWC https://paperswithcode.com/paper/me-r-cnn-multi-expert-r-cnn-for-object
Repo
Framework
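The abstract above assigns each object to one of three shape-specific experts. One way to picture that routing is a rule on the aspect ratio of a region of interest. The sketch below is only an illustration under stated assumptions: the function name `assign_expert` and the threshold of 1.5 are hypothetical, and the abstract does not say how the paper actually separates the three shape categories.

```python
def assign_expert(width, height, ratio_threshold=1.5):
    """Route an RoI to a shape category by aspect ratio.

    ratio_threshold is a hypothetical cut-off chosen for illustration;
    it is not a value taken from the paper.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    aspect = width / height
    if aspect >= ratio_threshold:
        return "horizontal"  # horizontally elongated
    if aspect <= 1.0 / ratio_threshold:
        return "vertical"    # vertically elongated
    return "square"          # square-like


for w, h in [(200, 50), (50, 200), (100, 110)]:
    print((w, h), assign_expert(w, h))
```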

Online and Distributed Robust Regressions under Adversarial Data Corruption

Title Online and Distributed Robust Regressions under Adversarial Data Corruption
Authors Xuchao Zhang, Liang Zhao, Arnold P. Boedihardjo, Chang-Tien Lu
Abstract In today’s era of big data, robust least-squares regression becomes a more challenging problem when considering adversarial corruption along with the explosive growth of datasets. Traditional robust methods can handle the noise but suffer from several challenges when applied to huge datasets, including 1) computational infeasibility of handling an entire dataset at once, 2) existence of heterogeneously distributed corruption, and 3) difficulty in corruption estimation when data cannot be entirely loaded. This paper proposes online and distributed robust regression approaches, both of which can concurrently address all the above challenges. Specifically, the distributed algorithm optimizes the regression coefficients of each data block via heuristic hard thresholding and combines all the estimates in a distributed robust consolidation. Furthermore, an online version of the distributed algorithm is proposed to incrementally update the existing estimates with new incoming data. We also prove that our algorithms benefit from strong robustness guarantees in terms of regression coefficient recovery with a constant upper bound on the error of state-of-the-art batch methods. Extensive experiments on synthetic and real datasets demonstrate that our approaches are superior to existing methods in effectiveness, with competitive efficiency.
Tasks
Published 2017-10-02
URL http://arxiv.org/abs/1710.00904v1
PDF http://arxiv.org/pdf/1710.00904v1.pdf
PWC https://paperswithcode.com/paper/online-and-distributed-robust-regressions
Repo
Framework
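The abstract above mentions fitting each data block via hard thresholding. The general idea can be sketched as alternating between a least-squares fit and discarding the points with the largest residuals. The code below is a single-block, single-coefficient illustration, not the authors' distributed or online algorithm; the function names are hypothetical, and it assumes the number of corrupted points (`num_corrupt`) is known in advance.

```python
def fit_slope(xs, ys):
    """Least-squares slope for the model y = w * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def hard_threshold_regression(xs, ys, num_corrupt, iters=20):
    """Alternate between fitting and keeping the smallest-residual points."""
    n = len(xs)
    keep = list(range(n))
    w = fit_slope(xs, ys)
    for _ in range(iters):
        residuals = [abs(y - w * x) for x, y in zip(xs, ys)]
        order = sorted(range(n), key=lambda i: residuals[i])
        # Hard thresholding: drop the num_corrupt largest residuals.
        new_keep = sorted(order[: n - num_corrupt])
        if new_keep == keep:
            break  # the retained set is stable, so stop iterating
        keep = new_keep
        w = fit_slope([xs[i] for i in keep], [ys[i] for i in keep])
    return w, keep


# Toy data: y = 2x, with two responses overwritten by an adversary.
xs = [float(x) for x in range(1, 11)]
ys = [2.0 * x for x in xs]
ys[2] = 30.0    # true value is 6
ys[7] = -20.0   # true value is 16
w, keep = hard_threshold_regression(xs, ys, num_corrupt=2)
print(w, keep)
```

On this toy data the plain least-squares slope is pulled down to roughly 1.44 by the two corrupted responses, while the thresholded fit discards indices 2 and 7 and recovers the slope of 2.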

CUNI System for the WMT17 Multimodal Translation Task

Title CUNI System for the WMT17 Multimodal Translation Task
Authors Jindřich Helcl, Jindřich Libovický
Abstract In this paper, we describe our submissions to the WMT17 Multimodal Translation Task. For Task 1 (multimodal translation), our best scoring system is a purely textual neural translation of the source image caption to the target language. The main feature of the system is the use of additional data that was acquired by selecting similar sentences from parallel corpora and by data synthesis with back-translation. For Task 2 (cross-lingual image captioning), our best submitted system generates an English caption which is then translated by the best system used in Task 1. We also present negative results, which are based on ideas that we believe have the potential to yield improvements, but did not prove to be useful in our particular setup.
Tasks Image Captioning
Published 2017-07-14
URL http://arxiv.org/abs/1707.04550v1
PDF http://arxiv.org/pdf/1707.04550v1.pdf
PWC https://paperswithcode.com/paper/cuni-system-for-the-wmt17-multimodal-1
Repo
Framework

End-to-end Learning for 3D Facial Animation from Raw Waveforms of Speech

Title End-to-end Learning for 3D Facial Animation from Raw Waveforms of Speech
Authors Hai X. Pham, Yuting Wang, Vladimir Pavlovic
Abstract We present a deep learning framework for real-time speech-driven 3D facial animation from just raw waveforms. Our deep neural network directly maps an input sequence of speech audio to a series of micro facial action unit activations and head rotations to drive a 3D blendshape face model. In particular, our deep model is able to learn the latent representations of time-varying contextual information and affective states within the speech. Hence, our model not only activates appropriate facial action units at inference to depict different utterance-generating actions, in the form of lip movements, but also, without any assumption, automatically estimates the emotional intensity of the speaker and reproduces her ever-changing affective states by adjusting the strength of facial unit activations. For example, in happy speech, the mouth opens wider than normal, while other facial units are relaxed; in a surprised state, both eyebrows rise higher. Experiments on a diverse audiovisual corpus of different actors across a wide range of emotional states show interesting and promising results of our approach. Being speaker-independent, our generalized model is readily applicable to various tasks in human-machine interaction and animation.
Tasks
Published 2017-10-02
URL http://arxiv.org/abs/1710.00920v2
PDF http://arxiv.org/pdf/1710.00920v2.pdf
PWC https://paperswithcode.com/paper/end-to-end-learning-for-3d-facial-animation
Repo
Framework

Honey Bee Dance Modeling in Real-time using Machine Learning

Title Honey Bee Dance Modeling in Real-time using Machine Learning
Authors Abolfazl Saghafi, Chris P. Tsokos
Abstract The waggle dance that honeybees perform is an astonishing way of communicating the location of a food source. More than 60 years after its discovery, researchers still rely on manual labeling, watching hours of dance videos to detect the transitions between dance components and thereby extract information about the distance and direction to the food source. We propose an automated process to monitor and segment the different components of the honeybee waggle dance. The process is highly accurate, runs in real-time, and can use shared information between multiple dances.
Tasks
Published 2017-05-20
URL http://arxiv.org/abs/1705.07362v1
PDF http://arxiv.org/pdf/1705.07362v1.pdf
PWC https://paperswithcode.com/paper/honey-bee-dance-modeling-in-real-time-using
Repo
Framework

Phase-Shifting Separable Haar Wavelets and Applications

Title Phase-Shifting Separable Haar Wavelets and Applications
Authors Mais Alnasser, Hassan Foroosh
Abstract This paper presents a new approach for tackling the shift-invariance problem in the discrete Haar domain, without trading off any of its desirable properties, such as compression, separability, orthogonality, and symmetry. The paper presents several key theoretical contributions. First, we derive closed-form expressions for phase shifting in the Haar domain both in partially decimated and fully decimated transforms. Second, it is shown that the wavelet coefficients of the shifted signal can be computed solely by using the coefficients of the original transformed signal. Third, we derive closed-form expressions for non-integer shifts, which have not been previously reported in the literature. Fourth, we establish the complexity of the proposed phase shifting approach using the derived analytic expressions. As an application example of these results, we apply the new formulae to image rotation and interpolation, and evaluate their performance against standard methods.
Tasks
Published 2017-05-20
URL http://arxiv.org/abs/1705.07340v1
PDF http://arxiv.org/pdf/1705.07340v1.pdf
PWC https://paperswithcode.com/paper/phase-shifting-separable-haar-wavelets-and
Repo
Framework
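The shift-invariance problem named in the abstract above can be seen on a very small signal: after a one-sample shift, the fully decimated Haar coefficients look entirely different. The sketch below only demonstrates that problem; it does not implement the paper's closed-form phase-shifting expressions, and the helper names `haar_level1` and `circular_shift` are illustrative.

```python
import math


def haar_level1(signal):
    """One level of the orthonormal, fully decimated Haar transform."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def circular_shift(signal, k):
    """Shift a list to the right by k samples, wrapping around."""
    k %= len(signal)
    return signal[-k:] + signal[:-k] if k else list(signal)


x = [4.0, 4.0, 0.0, 0.0, 4.0, 4.0, 0.0, 0.0]
_, d0 = haar_level1(x)                     # edges fall between sample pairs
_, d1 = haar_level1(circular_shift(x, 1))  # edges now fall inside the pairs

print("detail energy, original:", sum(d * d for d in d0))
print("detail energy, shifted :", sum(d * d for d in d1))
```

For this signal the detail band carries no energy before the shift and all of the edge energy after it, which is why results computed directly on decimated coefficients are sensitive to alignment.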

Emergence of Grounded Compositional Language in Multi-Agent Populations

Title Emergence of Grounded Compositional Language in Multi-Agent Populations
Authors Igor Mordatch, Pieter Abbeel
Abstract By capturing statistical patterns in large corpora, machine learning has enabled significant advances in natural language processing, including in machine translation, question answering, and sentiment analysis. However, for agents to intelligently interact with humans, simply capturing the statistical patterns is insufficient. In this paper we investigate if, and how, grounded compositional language can emerge as a means to achieve goals in multi-agent populations. Towards this end, we propose a multi-agent learning environment and learning methods that bring about emergence of a basic compositional language. This language is represented as streams of abstract discrete symbols uttered by agents over time, but nonetheless has a coherent structure that possesses a defined vocabulary and syntax. We also observe emergence of non-verbal communication such as pointing and guiding when language communication is unavailable.
Tasks Machine Translation, Question Answering, Sentiment Analysis
Published 2017-03-15
URL http://arxiv.org/abs/1703.04908v2
PDF http://arxiv.org/pdf/1703.04908v2.pdf
PWC https://paperswithcode.com/paper/emergence-of-grounded-compositional-language
Repo
Framework