July 29, 2019

Paper Group ANR 80

Intensity Video Guided 4D Fusion for Improved Highly Dynamic 3D Reconstruction. Generalized Convolutional Neural Networks for Point Cloud Data. Dataset for the First Evaluation on Chinese Machine Reading Comprehension. O-CNN: Octree-based Convolutional Neural Networks for 3D Shape Analysis. Random Forests, Decision Trees, and Categorical Predictors …

Intensity Video Guided 4D Fusion for Improved Highly Dynamic 3D Reconstruction

Title Intensity Video Guided 4D Fusion for Improved Highly Dynamic 3D Reconstruction
Authors Jie Zhang, Christos Maniatis, Luis Horna, Robert B. Fisher
Abstract The availability of high-speed 3D video sensors has greatly facilitated 3D shape acquisition of dynamic and deformable objects, but high frame rate 3D reconstruction is always degraded by spatial noise and temporal fluctuations. This paper presents a simple yet powerful intensity video guided multi-frame 4D fusion pipeline. Temporal tracking of intensity image points (of moving and deforming objects) allows registration of the corresponding 3D data points, whose 3D noise and fluctuations are then reduced by spatio-temporal multi-frame 4D fusion. We conducted simulated noise tests and real experiments on four 3D objects using a 1000 fps 3D video sensor. The results demonstrate that the proposed algorithm is effective at reducing 3D noise and is robust against intensity noise. It outperforms existing algorithms with good scalability on both stationary and dynamic objects.
Tasks 3D Reconstruction
Published 2017-08-06
URL http://arxiv.org/abs/1708.01946v1
PDF http://arxiv.org/pdf/1708.01946v1.pdf
PWC https://paperswithcode.com/paper/intensity-video-guided-4d-fusion-for-improved
Repo
Framework

Generalized Convolutional Neural Networks for Point Cloud Data

Title Generalized Convolutional Neural Networks for Point Cloud Data
Authors Aleksandr Savchenkov, Andrew Davis, Xuan Zhao
Abstract The introduction of cheap RGB-D cameras, stereo cameras, and LIDAR devices has given the computer vision community 3D information that conventional RGB cameras cannot provide. This data is often stored as a point cloud. In this paper, we present a novel method to apply the concept of convolutional neural networks to this type of data. By creating a mapping of nearest neighbors in a dataset, and individually applying weights to spatial relationships between points, we achieve an architecture that works directly with point clouds, but closely resembles a convolutional neural net in both design and behavior. Such a method bypasses the need for extensive feature engineering, while proving to be computationally efficient and requiring few parameters.
Tasks Feature Engineering
Published 2017-07-20
URL http://arxiv.org/abs/1707.06719v2
PDF http://arxiv.org/pdf/1707.06719v2.pdf
PWC https://paperswithcode.com/paper/generalized-convolutional-neural-networks-for
Repo
Framework
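
The abstract describes building a nearest-neighbor mapping and applying one weight per spatial relationship. A minimal sketch of that idea follows, assuming scalar features, brute-force neighbor search, and one weight per neighbor rank (nearest first, with each point counted as its own nearest neighbor). The names `knn_indices` and `point_conv` are hypothetical, and this is not the authors' implementation.

```python
def knn_indices(points, k):
    """For each point, return the indices of its k nearest points (itself included)."""
    result = []
    for p in points:
        # Squared Euclidean distance to every point; ties are broken by index.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points)
        )
        result.append([j for _, j in dists[:k]])
    return result


def point_conv(points, features, weights):
    """Weighted sum of neighbor features, one weight per neighbor rank (assumption)."""
    neighbors = knn_indices(points, len(weights))
    return [
        sum(w * features[j] for w, j in zip(weights, nbrs))
        for nbrs in neighbors
    ]


points = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (5, 5, 5)]
features = [1.0, 2.0, 3.0, 4.0]
weights = [1.0, 0.5]  # weight for the point itself, then for its nearest other point
output = point_conv(points, features, weights)
```

Because the same small weight vector is reused at every point, the parameter count does not depend on the size of the cloud, which mirrors the weight sharing of an ordinary convolution.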

Dataset for the First Evaluation on Chinese Machine Reading Comprehension

Title Dataset for the First Evaluation on Chinese Machine Reading Comprehension
Authors Yiming Cui, Ting Liu, Zhipeng Chen, Wentao Ma, Shijin Wang, Guoping Hu
Abstract Machine Reading Comprehension (MRC) has become enormously popular recently and has attracted a lot of attention. However, existing reading comprehension datasets are mostly in English. To add diversity in reading comprehension datasets, in this paper we propose a new Chinese reading comprehension dataset for accelerating related research in the community. The proposed dataset contains two different types: cloze-style reading comprehension and user query reading comprehension, associated with large-scale training data as well as human-annotated validation and hidden test sets. Along with this dataset, we also hosted the first Evaluation on Chinese Machine Reading Comprehension (CMRC-2017) and successfully attracted tens of participants, which suggests the potential impact of this dataset.
Tasks Machine Reading Comprehension, Reading Comprehension
Published 2017-09-25
URL http://arxiv.org/abs/1709.08299v2
PDF http://arxiv.org/pdf/1709.08299v2.pdf
PWC https://paperswithcode.com/paper/dataset-for-the-first-evaluation-on-chinese
Repo
Framework

O-CNN: Octree-based Convolutional Neural Networks for 3D Shape Analysis

Title O-CNN: Octree-based Convolutional Neural Networks for 3D Shape Analysis
Authors Peng-Shuai Wang, Yang Liu, Yu-Xiao Guo, Chun-Yu Sun, Xin Tong
Abstract We present O-CNN, an Octree-based Convolutional Neural Network (CNN) for 3D shape analysis. Built upon the octree representation of 3D shapes, our method takes the average normal vectors of a 3D model sampled in the finest leaf octants as input and performs 3D CNN operations on the octants occupied by the 3D shape surface. We design a novel octree data structure to efficiently store the octant information and CNN features into the graphics memory and execute the entire O-CNN training and evaluation on the GPU. O-CNN supports various CNN structures and works for 3D shapes in different representations. By restraining the computations on the octants occupied by 3D surfaces, the memory and computational costs of the O-CNN grow quadratically as the depth of the octree increases, which makes the 3D CNN feasible for high-resolution 3D models. We compare the performance of the O-CNN with other existing 3D CNN solutions and demonstrate the efficiency and efficacy of O-CNN in three shape analysis tasks, including object classification, shape retrieval, and shape segmentation.
Tasks 3D Shape Analysis, Object Classification
Published 2017-12-05
URL http://arxiv.org/abs/1712.01537v1
PDF http://arxiv.org/pdf/1712.01537v1.pdf
PWC https://paperswithcode.com/paper/o-cnn-octree-based-convolutional-neural
Repo
Framework
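
The cost argument in the abstract rests on storing only the octants that the surface occupies. The sketch below illustrates that with a toy case, assuming a flat surface patch inside the unit cube and a uniform grid of 2^depth cells per axis standing in for the finest octree level. The helper names are hypothetical and the code is not taken from O-CNN.

```python
def plane_samples(resolution, z=0.3):
    """Sample a flat surface patch at height z inside the unit cube."""
    step = 1.0 / resolution
    return [
        ((i + 0.5) * step, (j + 0.5) * step, z)
        for i in range(resolution)
        for j in range(resolution)
    ]


def occupied_octants(points, depth):
    """Count finest-level cells (2**depth per axis) that contain at least one sample."""
    n = 2 ** depth
    cells = set()
    for x, y, z in points:
        cells.add((min(int(x * n), n - 1), min(int(y * n), n - 1), min(int(z * n), n - 1)))
    return len(cells)


points = plane_samples(64)
table = []
for depth in range(1, 6):
    surface_cells = occupied_octants(points, depth)
    dense_cells = 8 ** depth  # a full voxel grid at the same resolution
    table.append((depth, surface_cells, dense_cells))
```

For this toy surface the occupied count is 4^depth, that is, quadratic in the per-axis resolution 2^depth, while a dense grid needs 8^depth cells. Real shapes will differ in the constant but follow the same surface-versus-volume trend.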

Random Forests, Decision Trees, and Categorical Predictors: The “Absent Levels” Problem

Title Random Forests, Decision Trees, and Categorical Predictors: The “Absent Levels” Problem
Authors Timothy C. Au
Abstract One advantage of decision tree based methods like random forests is their ability to natively handle categorical predictors without having to first transform them (e.g., by using feature engineering techniques). However, in this paper, we show how this capability can lead to an inherent “absent levels” problem for decision tree based methods that has never been thoroughly discussed, and whose consequences have never been carefully explored. This problem occurs whenever there is an indeterminacy over how to handle an observation that has reached a categorical split which was determined when the observation in question’s level was absent during training. Although these incidents may appear to be innocuous, by using Leo Breiman and Adele Cutler’s random forests FORTRAN code and the randomForest R package (Liaw and Wiener, 2002) as motivating case studies, we examine how overlooking the absent levels problem can systematically bias a model. Furthermore, by using three real data examples, we illustrate how absent levels can dramatically alter a model’s performance in practice, and we empirically demonstrate how some simple heuristics can be used to help mitigate the effects of the absent levels problem until a more robust theoretical solution is found.
Tasks Feature Engineering
Published 2017-06-12
URL http://arxiv.org/abs/1706.03492v2
PDF http://arxiv.org/pdf/1706.03492v2.pdf
PWC https://paperswithcode.com/paper/random-forests-decision-trees-and-categorical
Repo
Framework
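
The indeterminacy described in the abstract can be made concrete with a single categorical split. The sketch below is a toy illustration only: the split, the leaf values, and the "send left" and "send right" policies are assumptions chosen for the example, not the behavior of any particular random forest implementation.

```python
def make_categorical_split(left_levels, right_levels):
    """Build a router for one categorical split learned from training data."""
    left, right = set(left_levels), set(right_levels)

    def route(level, absent_policy):
        if level in left:
            return "left"
        if level in right:
            return "right"
        # The level never reached this node during training, so the split
        # says nothing about it; the outcome depends entirely on the policy.
        if absent_policy in ("left", "right"):
            return absent_policy
        raise ValueError(f"unknown policy: {absent_policy!r}")

    return route


# Hypothetical leaf predictions for the two children of the split.
LEAF_PREDICTION = {"left": 0.9, "right": 0.1}

route = make_categorical_split(left_levels={"red", "green"}, right_levels={"blue"})

seen = LEAF_PREDICTION[route("red", absent_policy="right")]
absent_sent_left = LEAF_PREDICTION[route("purple", absent_policy="left")]
absent_sent_right = LEAF_PREDICTION[route("purple", absent_policy="right")]
```

A level seen in training is routed the same way under either policy, whereas the absent level "purple" receives 0.9 or 0.1 depending only on an arbitrary convention, which is how a fixed convention can bias predictions.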

Exploit imaging through opaque wall via deep learning

Title Exploit imaging through opaque wall via deep learning
Authors Meng Lyu, Hao Wang, Guowei Li, Guohai Situ
Abstract Imaging through scattering media is encountered in many disciplines, ranging from biology and mesoscopic physics to astronomy. It remains a big challenge, however, because light undergoes multiple scattering in such media and can become totally decorrelated. Here, we propose a deep-learning-based method that can retrieve the image of a target behind a thick scattering medium. The method uses a trained deep neural network to fit the mapping from objects on one side of a thick scattering medium to the corresponding speckle patterns observed on the other side. For demonstration, we retrieve the images of a set of objects hidden behind a 3 mm thick white polystyrene slab, the optical depth of which is 13.4 times the scattering mean free path. Our work opens up a new way to tackle this longstanding challenge by using deep learning.
Tasks
Published 2017-08-09
URL http://arxiv.org/abs/1708.07881v1
PDF http://arxiv.org/pdf/1708.07881v1.pdf
PWC https://paperswithcode.com/paper/exploit-imaging-through-opaque-wall-via-deep
Repo
Framework

A Simple and Realistic Pedestrian Model for Crowd Simulation and Application

Title A Simple and Realistic Pedestrian Model for Crowd Simulation and Application
Authors Wonho Kang, Youngnam Han
Abstract Simulating a pedestrian crowd in a way that reflects reality is a major challenge for researchers. Several crowd simulation models have been proposed, such as cellular automata models, agent-based models, and fluid dynamic models. It is important to note that the agent-based model is able, more than other approaches, to provide a natural description of the system and thus to capture complex human behaviors. In this paper, we propose a multi-agent simulation model in which pedestrian positions are updated at discrete time intervals. It takes into account the major normal conditions of a simple pedestrian situated in a crowd, such as preferences and realistic perception of the environment. Our objective is to simulate the pedestrian crowd realistically, producing believable pedestrian behaviors. Typical pedestrian phenomena, including unidirectional and bidirectional movement in a corridor as well as flow through a bottleneck, are simulated. The conducted simulations show that our model is able to produce realistic pedestrian behaviors. The obtained fundamental diagram and flow rate at the bottleneck agree very well with classic conclusions and empirical study results. It is hoped that the idea of this study may be helpful in promoting the modeling and simulation of pedestrian crowds in a simple way.
Tasks
Published 2017-08-10
URL http://arxiv.org/abs/1708.03080v2
PDF http://arxiv.org/pdf/1708.03080v2.pdf
PWC https://paperswithcode.com/paper/a-simple-and-realistic-pedestrian-model-for
Repo
Framework

Multi-modal Conditional Attention Fusion for Dimensional Emotion Prediction

Title Multi-modal Conditional Attention Fusion for Dimensional Emotion Prediction
Authors Shizhe Chen, Qin Jin
Abstract Continuous dimensional emotion prediction is a challenging task where fusing multiple modalities, for example through early fusion or late fusion, usually achieves state-of-the-art performance. In this paper, we propose a novel multi-modal fusion strategy named conditional attention fusion, which can dynamically pay attention to different modalities at each time step. A long short-term memory recurrent neural network (LSTM-RNN) is applied as the basic uni-modality model to capture long time dependencies. The weights assigned to different modalities are automatically decided by the current input features and recent history information rather than being fixed across all situations. Our experimental results on the benchmark dataset AVEC2015 show the effectiveness of our method, which outperforms several common fusion strategies for valence prediction.
Tasks
Published 2017-09-04
URL http://arxiv.org/abs/1709.02251v1
PDF http://arxiv.org/pdf/1709.02251v1.pdf
PWC https://paperswithcode.com/paper/multi-modal-conditional-attention-fusion-for
Repo
Framework
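
The key idea in the abstract is that modality weights change at every time step instead of being fixed. The sketch below illustrates that idea only, assuming the per-modality scores are already given (in the paper they depend on current input features and recent history) and using a softmax to turn them into weights, which may differ from the authors' actual formulation. All names and numbers are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(predictions, scores):
    """Fuse per-modality predictions with time-varying weights.

    predictions[t][m] is modality m's prediction at time step t;
    scores[t][m] is an (assumed given) relevance score for that modality.
    """
    fused = []
    for preds_t, scores_t in zip(predictions, scores):
        weights = softmax(scores_t)
        fused.append(sum(w * p for w, p in zip(weights, preds_t)))
    return fused


# Two time steps, two modalities (say, audio and video).
predictions = [[0.2, 0.6], [0.4, 0.0]]
scores = [[0.0, 0.0], [5.0, -5.0]]
fused = fuse(predictions, scores)
```

At the first step both modalities are weighted equally, so the fused value is their mean; at the second step the first modality dominates, so the fused value stays close to its prediction. A fixed late-fusion scheme would apply the same weights at both steps.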

A Deep Reinforcement Learning Chatbot

Title A Deep Reinforcement Learning Chatbot
Authors Iulian V. Serban, Chinnadhurai Sankar, Mathieu Germain, Saizheng Zhang, Zhouhan Lin, Sandeep Subramanian, Taesup Kim, Michael Pieper, Sarath Chandar, Nan Rosemary Ke, Sai Rajeshwar, Alexandre de Brebisson, Jose M. R. Sotelo, Dendi Suhubdy, Vincent Michalski, Alexandre Nguyen, Joelle Pineau, Yoshua Bengio
Abstract We present MILABOT: a deep reinforcement learning chatbot developed by the Montreal Institute for Learning Algorithms (MILA) for the Amazon Alexa Prize competition. MILABOT is capable of conversing with humans on popular small talk topics through both speech and text. The system consists of an ensemble of natural language generation and retrieval models, including template-based models, bag-of-words models, sequence-to-sequence neural network and latent variable neural network models. By applying reinforcement learning to crowdsourced data and real-world user interactions, the system has been trained to select an appropriate response from the models in its ensemble. The system has been evaluated through A/B testing with real-world users, where it performed significantly better than many competing systems. Due to its machine learning architecture, the system is likely to improve with additional data.
Tasks Chatbot, Text Generation
Published 2017-09-07
URL http://arxiv.org/abs/1709.02349v2
PDF http://arxiv.org/pdf/1709.02349v2.pdf
PWC https://paperswithcode.com/paper/a-deep-reinforcement-learning-chatbot
Repo
Framework

Imposing Hard Constraints on Deep Networks: Promises and Limitations

Title Imposing Hard Constraints on Deep Networks: Promises and Limitations
Authors Pablo Márquez-Neila, Mathieu Salzmann, Pascal Fua
Abstract Imposing constraints on the output of a Deep Neural Net is one way to improve the quality of its predictions while loosening the requirements for labeled training data. Such constraints are usually imposed as soft constraints by adding new terms to the loss function that is minimized during training. An alternative is to impose them as hard constraints, which has a number of theoretical benefits but has not been explored so far due to the perceived intractability of the problem. In this paper, we show that imposing hard constraints can in fact be done in a computationally feasible way and delivers reasonable results. However, the theoretical benefits do not materialize and the resulting technique is no better than existing ones relying on soft constraints. We analyze the reasons for this and hope to spur other researchers into proposing better solutions.
Tasks
Published 2017-06-07
URL http://arxiv.org/abs/1706.02025v1
PDF http://arxiv.org/pdf/1706.02025v1.pdf
PWC https://paperswithcode.com/paper/imposing-hard-constraints-on-deep-networks
Repo
Framework

Local Monotonic Attention Mechanism for End-to-End Speech and Language Processing

Title Local Monotonic Attention Mechanism for End-to-End Speech and Language Processing
Authors Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
Abstract Recently, encoder-decoder neural networks have shown impressive performance on many sequence-related tasks. The architecture commonly uses an attentional mechanism which allows the model to learn alignments between the source and the target sequence. Most attentional mechanisms used today are based on a global attention property which requires computing a weighted summarization of the whole input sequence generated by encoder states. However, this is computationally expensive and often produces misalignment on longer input sequences. Furthermore, it does not fit the monotonic or left-to-right nature of several tasks, such as automatic speech recognition (ASR), grapheme-to-phoneme (G2P), etc. In this paper, we propose a novel attention mechanism that has local and monotonic properties. Various ways to control those properties are also explored. Experimental results on ASR, G2P and machine translation between two languages with similar sentence structures demonstrate that the proposed encoder-decoder model with local monotonic attention could achieve significant performance improvements and reduce the computational complexity in comparison with the one that used the standard global attention architecture.
Tasks Machine Translation, Speech Recognition
Published 2017-05-23
URL http://arxiv.org/abs/1705.08091v2
PDF http://arxiv.org/pdf/1705.08091v2.pdf
PWC https://paperswithcode.com/paper/local-monotonic-attention-mechanism-for-end
Repo
Framework
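
The two properties named in the abstract, locality and monotonicity, can be sketched as follows. This assumes a Gaussian-shaped window around a center position that only moves forward; the function name, the window shape, and the parameters `sigma` and `half_window` are illustrative assumptions, not the paper's exact parameterization.

```python
import math


def local_monotonic_weights(prev_center, step, source_len, sigma=1.0, half_window=2):
    """Return the new attention center and normalized weights over a local window.

    Monotonic: the center never moves backwards (negative steps are clipped to 0).
    Local: only positions within half_window of the center receive weight.
    """
    center = min(prev_center + max(step, 0.0), source_len - 1)
    mid = int(round(center))
    lo = max(0, mid - half_window)
    hi = min(source_len - 1, mid + half_window)
    raw = {i: math.exp(-((i - center) ** 2) / (2 * sigma ** 2)) for i in range(lo, hi + 1)}
    total = sum(raw.values())
    return center, {i: w / total for i, w in raw.items()}


center = 0.0
centers, windows = [], []
for step in [1.0, 0.5, 2.0]:  # hypothetical per-step forward movements
    center, weights = local_monotonic_weights(center, step, source_len=10)
    centers.append(center)
    windows.append(weights)
```

Each decoding step touches at most `2 * half_window + 1` encoder states instead of all of them, which is where the reduction in computation relative to global attention comes from.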

An All-Pair Quantum SVM Approach for Big Data Multiclass Classification

Title An All-Pair Quantum SVM Approach for Big Data Multiclass Classification
Authors Arit Kumar Bishwas, Ashish Mani, Vasile Palade
Abstract In this paper, we discuss a quantum approach for the all-pair multiclass classification problem. We show that the multiclass support vector machine for big data classification with a quantum all-pair approach can be implemented in logarithmic runtime complexity on a quantum computer. In an all-pair approach, there is one binary classification problem for each pair of classes, and so there are k(k-1)/2 classifiers for a k-class problem. Compared to the classical multiclass support vector machine, which can be implemented with polynomial runtime complexity, our approach exhibits an exponential speedup in the quantum version. The quantum all-pair algorithm can be used with other classification algorithms, and a speedup can be achieved compared to their classical counterparts.
Tasks
Published 2017-04-25
URL http://arxiv.org/abs/1704.07664v2
PDF http://arxiv.org/pdf/1704.07664v2.pdf
PWC https://paperswithcode.com/paper/an-all-pair-quantum-svm-approach-for-big-data
Repo
Framework
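
The all-pair (one-vs-one) decomposition mentioned in the abstract can be sketched classically. The example below is not a quantum algorithm and is not taken from the paper: it only enumerates the k(k-1)/2 pairwise tasks and combines them by majority vote, using hypothetical one-dimensional nearest-center classifiers as stand-ins for trained SVMs.

```python
from itertools import combinations


def all_pair_tasks(classes):
    """Return one (class_a, class_b) binary task per unordered pair of classes."""
    return list(combinations(classes, 2))


def predict_by_vote(x, pairwise_classifiers):
    """Majority vote over pairwise classifiers.

    pairwise_classifiers maps (a, b) to a function that returns a or b for input x.
    """
    votes = {}
    for clf in pairwise_classifiers.values():
        winner = clf(x)
        votes[winner] = votes.get(winner, 0) + 1
    return max(votes, key=votes.get)


classes = ["a", "b", "c", "d"]
tasks = all_pair_tasks(classes)

# Toy stand-in for trained binary classifiers: pick the nearer class center.
centers = {"a": 0.0, "b": 1.0, "c": 2.0, "d": 3.0}


def make_nearest(a, b):
    return lambda x: a if abs(x - centers[a]) <= abs(x - centers[b]) else b


classifiers = {(a, b): make_nearest(a, b) for a, b in tasks}
prediction = predict_by_vote(2.2, classifiers)
```

With k = 4 classes there are 6 pairwise tasks, and the input 2.2 wins all three comparisons involving class "c", so the vote selects "c".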

Nonparametric weighted stochastic block models

Title Nonparametric weighted stochastic block models
Authors Tiago P. Peixoto
Abstract We present a Bayesian formulation of weighted stochastic block models that can be used to infer the large-scale modular structure of weighted networks, including their hierarchical organization. Our method is nonparametric, and thus does not require the prior knowledge of the number of groups or other dimensions of the model, which are instead inferred from data. We give a comprehensive treatment of different kinds of edge weights (i.e. continuous or discrete, signed or unsigned, bounded or unbounded), as well as arbitrary weight transformations, and describe an unsupervised model selection approach to choose the best network description. We illustrate the application of our method to a variety of empirical weighted networks, such as global migrations, voting patterns in congress, and neural connections in the human brain.
Tasks Model Selection
Published 2017-08-04
URL http://arxiv.org/abs/1708.01432v4
PDF http://arxiv.org/pdf/1708.01432v4.pdf
PWC https://paperswithcode.com/paper/nonparametric-weighted-stochastic-block
Repo
Framework

Bidirectional Conditional Generative Adversarial Networks

Title Bidirectional Conditional Generative Adversarial Networks
Authors Ayush Jaiswal, Wael AbdAlmageed, Yue Wu, Premkumar Natarajan
Abstract Conditional Generative Adversarial Networks (cGANs) are generative models that can produce data samples ($x$) conditioned on both latent variables ($z$) and known auxiliary information ($c$). We propose the Bidirectional cGAN (BiCoGAN), which effectively disentangles $z$ and $c$ in the generation process and provides an encoder that learns inverse mappings from $x$ to both $z$ and $c$, trained jointly with the generator and the discriminator. We present crucial techniques for training BiCoGANs, which involve an extrinsic factor loss along with an associated dynamically-tuned importance weight. As compared to other encoder-based cGANs, BiCoGANs encode $c$ more accurately, and utilize $z$ and $c$ more effectively and in a more disentangled way to generate samples.
Tasks
Published 2017-11-20
URL http://arxiv.org/abs/1711.07461v4
PDF http://arxiv.org/pdf/1711.07461v4.pdf
PWC https://paperswithcode.com/paper/bidirectional-conditional-generative
Repo
Framework

A comparative study of breast surface reconstruction for aesthetic outcome assessment

Title A comparative study of breast surface reconstruction for aesthetic outcome assessment
Authors Rene Lacher, Francisco Vasconcelos, David Bishop, Norman Williams, Mohammed Keshtgar, David Hawkes, John Hipwell, Danail Stoyanov
Abstract Breast cancer is the most prevalent cancer type in women, and while its survival rate is generally high, the aesthetic outcome is an increasingly important factor when evaluating different treatment alternatives. 3D scanning and reconstruction techniques offer a flexible tool for building detailed and accurate 3D breast models that can be used both pre-operatively for surgical planning and post-operatively for aesthetic evaluation. This paper aims at comparing the accuracy of low-cost 3D scanning technologies with the significantly more expensive state-of-the-art 3D commercial scanners in the context of breast 3D reconstruction. We present results from 28 synthetic and clinical RGBD sequences, including 12 unique patients and an anthropomorphic phantom, demonstrating the applicability of low-cost RGBD sensors to real clinical cases. Body deformation and homogeneous skin texture pose challenges to the studied reconstruction systems. Although these should be addressed appropriately if higher model quality is warranted, we observe that low-cost sensors are able to obtain valuable reconstructions comparable to the state-of-the-art within an error margin of 3 mm.
Tasks 3D Reconstruction
Published 2017-06-20
URL http://arxiv.org/abs/1706.06531v1
PDF http://arxiv.org/pdf/1706.06531v1.pdf
PWC https://paperswithcode.com/paper/a-comparative-study-of-breast-surface
Repo
Framework