July 28, 2019

Paper Group ANR 230

Gradient-based Camera Exposure Control for Outdoor Mobile Platforms. A Learning-based Framework for Hybrid Depth-from-Defocus and Stereo Matching. Generalization and Equilibrium in Generative Adversarial Nets (GANs). Occlusion Handling using Semantic Segmentation and Visibility-Based Rendering for Mixed Reality. Shallow Discourse Parsing with Maxim …

Gradient-based Camera Exposure Control for Outdoor Mobile Platforms

Title Gradient-based Camera Exposure Control for Outdoor Mobile Platforms
Authors Inwook Shim, Tae-Hyun Oh, Joon-Young Lee, Jinwook Choi, Dong-Geol Choi, In So Kweon
Abstract We introduce a novel method to automatically adjust camera exposure for image processing and computer vision applications on mobile robot platforms. Because most image processing algorithms rely heavily on low-level image features that are based mainly on local gradient information, we consider that gradient quantity can determine the proper exposure level, allowing a camera to capture the important image features in a manner robust to illumination conditions. We then extend this concept to a multi-camera system and present a new control algorithm to achieve both brightness consistency between adjacent cameras and a proper exposure level for each camera. We implement our prototype system with off-the-shelf machine-vision cameras and demonstrate the effectiveness of the proposed algorithms on practical applications, including pedestrian detection, visual odometry, surround-view imaging, panoramic imaging and stereo matching.
Tasks Pedestrian Detection, Stereo Matching, Stereo Matching Hand, Visual Odometry
Published 2017-08-24
URL http://arxiv.org/abs/1708.07338v3
PDF http://arxiv.org/pdf/1708.07338v3.pdf
PWC https://paperswithcode.com/paper/gradient-based-camera-exposure-control-for
Repo
Framework
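
The idea in the abstract above, that the amount of gradient in an image can indicate a proper exposure level, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the sensor model (a gain followed by clipping to 8 bits), the gradient measure (a sum of absolute forward differences), the brute-force search over candidate gains, and the names `simulate_exposure`, `gradient_quantity`, `best_gain` and `SCENE`. It is not the paper's metric or its control algorithm.

```python
def simulate_exposure(scene, gain):
    """Toy sensor model (assumption): scale radiance by a gain, clip to [0, 255]."""
    return [[min(255.0, max(0.0, v * gain)) for v in row] for row in scene]


def gradient_quantity(image):
    """Sum of absolute horizontal and vertical forward differences."""
    rows, cols = len(image), len(image[0])
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                total += abs(image[r][c + 1] - image[r][c])
            if r + 1 < rows:
                total += abs(image[r + 1][c] - image[r][c])
    return total


def best_gain(scene, gains):
    """Pick the candidate gain whose simulated image has the most gradient."""
    return max(gains, key=lambda g: gradient_quantity(simulate_exposure(scene, g)))


# Hypothetical scene: a smooth intensity ramp.
SCENE = [
    [10, 20, 30, 40],
    [20, 30, 40, 50],
    [30, 40, 50, 60],
]

if __name__ == "__main__":
    print(best_gain(SCENE, [1, 2, 4, 8, 16]))
```

On this ramp the gradient grows with the gain until pixels start to saturate; beyond that point clipping flattens the bright region and the gradient quantity drops, so the search settles on the largest gain that does not clip.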

A Learning-based Framework for Hybrid Depth-from-Defocus and Stereo Matching

Title A Learning-based Framework for Hybrid Depth-from-Defocus and Stereo Matching
Authors Zhang Chen, Xinqing Guo, Siyuan Li, Xuan Cao, Jingyi Yu
Abstract Depth from defocus (DfD) and stereo matching are the two most studied passive depth sensing schemes. The techniques are essentially complementary: DfD can robustly handle repetitive textures that are problematic for stereo matching, whereas stereo matching is insensitive to defocus blurs and can handle a large depth range. In this paper, we present a unified learning-based technique to conduct hybrid DfD and stereo matching. Our input is image triplets: a stereo pair and a defocused image of one of the stereo views. We first apply depth-guided light field rendering to construct a comprehensive training dataset for such hybrid sensing setups. Next, we adopt the hourglass network architecture to separately conduct depth inference from DfD and stereo. Finally, we exploit different connection methods between the two separate networks for integrating them into a unified solution to produce high fidelity 3D disparity maps. Comprehensive experiments on real and synthetic data show that our new learning-based hybrid 3D sensing technique can significantly improve accuracy and robustness in 3D reconstruction.
Tasks 3D Reconstruction, Stereo Matching, Stereo Matching Hand
Published 2017-08-02
URL http://arxiv.org/abs/1708.00583v3
PDF http://arxiv.org/pdf/1708.00583v3.pdf
PWC https://paperswithcode.com/paper/a-learning-based-framework-for-hybrid-depth
Repo
Framework

Generalization and Equilibrium in Generative Adversarial Nets (GANs)

Title Generalization and Equilibrium in Generative Adversarial Nets (GANs)
Authors Sanjeev Arora, Rong Ge, Yingyu Liang, Tengyu Ma, Yi Zhang
Abstract We show that training of generative adversarial networks (GANs) may not have good generalization properties; e.g., training may appear successful but the trained distribution may be far from the target distribution in standard metrics. However, generalization does occur for a weaker metric called neural net distance. It is also shown that an approximate pure equilibrium exists in the discriminator/generator game for a special class of generators with natural training objectives when generator capacity and training set sizes are moderate. This existence of equilibrium inspires the MIX+GAN protocol, which can be combined with any existing GAN training and is empirically shown to improve some of them.
Tasks
Published 2017-03-02
URL http://arxiv.org/abs/1703.00573v5
PDF http://arxiv.org/pdf/1703.00573v5.pdf
PWC https://paperswithcode.com/paper/generalization-and-equilibrium-in-generative
Repo
Framework

Occlusion Handling using Semantic Segmentation and Visibility-Based Rendering for Mixed Reality

Title Occlusion Handling using Semantic Segmentation and Visibility-Based Rendering for Mixed Reality
Authors Menandro Roxas, Tomoki Hori, Taiki Fukiage, Yasuhide Okamoto, Takeshi Oishi
Abstract Real-time occlusion handling is a major problem in outdoor mixed reality systems because it requires great computational cost, mainly due to the complexity of the scene. Using only segmentation, it is difficult to accurately render a virtual object occluded by complex objects such as trees and bushes. In this paper, we propose a novel occlusion handling method for real-time, outdoor, and omni-directional mixed reality systems using only the information from a monocular image sequence. We first present a semantic segmentation scheme for predicting the amount of visibility for different types of objects in the scene. We also simultaneously calculate a foreground probability map using depth estimation derived from optical flow. Finally, we combine the segmentation result and the probability map to render the computer generated object and the real scene using a visibility-based rendering method. Our results show great improvement in handling occlusions compared to existing blending based methods.
Tasks Depth Estimation, Optical Flow Estimation, Semantic Segmentation
Published 2017-07-30
URL http://arxiv.org/abs/1707.09603v1
PDF http://arxiv.org/pdf/1707.09603v1.pdf
PWC https://paperswithcode.com/paper/occlusion-handling-using-semantic
Repo
Framework

Shallow Discourse Parsing with Maximum Entropy Model

Title Shallow Discourse Parsing with Maximum Entropy Model
Authors Jingjing Xu
Abstract In recent years, more research has been devoted to studying the subtasks of complete shallow discourse parsing, such as identifying discourse connectives and the arguments of connectives. There is a need to design a full discourse parser to pull these subtasks together. So we develop a discourse parser that turns free text into discourse relations. The parser includes a connective identifier, an arguments identifier, a sense classifier and a non-explicit identifier, which are connected with each other in a pipeline. Each component applies the maximum entropy model with abundant lexical and syntax features extracted from the Penn Discourse Tree-bank. The head-based representation of the PDTB is adopted in the arguments identifier, which turns the problem of identifying the arguments of a discourse connective into finding the head and end of the arguments. In the non-explicit identifier, contextual type features, such as words which have high frequency and can reflect the discourse relation, are introduced to improve the performance of the non-explicit identifier. Compared with other methods, the experimental results show considerable performance.
Tasks
Published 2017-10-31
URL http://arxiv.org/abs/1710.11334v1
PDF http://arxiv.org/pdf/1710.11334v1.pdf
PWC https://paperswithcode.com/paper/shallow-discourse-parsing-with-maximum
Repo
Framework

Efficient Dense Labeling of Human Activity Sequences from Wearables using Fully Convolutional Networks

Title Efficient Dense Labeling of Human Activity Sequences from Wearables using Fully Convolutional Networks
Authors Rui Yao, Guosheng Lin, Qinfeng Shi, Damith Ranasinghe
Abstract Recognizing human activities in a sequence is a challenging area of research in ubiquitous computing. Most approaches use a fixed size sliding window over consecutive samples to extract features (either handcrafted or learned) and predict a single label for all samples in the window. Two key problems emanate from this approach: i) the samples in one window may not always share the same label, so using one label for all samples within a window inevitably leads to loss of information; ii) the testing phase is constrained by the window size selected during training, while the best window size is difficult to tune in practice. We propose an efficient algorithm that can predict the label of each sample, which we call dense labeling, in a sequence of human activities of arbitrary length using a fully convolutional network. In particular, our approach overcomes the problems posed by the sliding window step. Additionally, our algorithm learns both the features and classifier automatically. We release a new daily activity dataset based on a wearable sensor with hospitalized patients. We conduct extensive experiments and demonstrate that our proposed approach is able to outperform the state of the art in terms of classification and label misalignment measures on three challenging datasets: Opportunity, Hand Gesture, and our new dataset.
Tasks
Published 2017-02-20
URL http://arxiv.org/abs/1702.06212v1
PDF http://arxiv.org/pdf/1702.06212v1.pdf
PWC https://paperswithcode.com/paper/efficient-dense-labeling-of-human-activity
Repo
Framework
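
Problem (i) in the abstract above, that one label per window loses information when an activity changes inside the window, can be made concrete with a toy sketch. The label sequence, the non-overlapping windows, the majority vote and the function names are all assumptions chosen for illustration; the sketch shows the limitation of window-level labels, not the paper's fully convolutional network.

```python
from collections import Counter


def window_labels(labels, window):
    """Assign every sample the majority label of its non-overlapping window."""
    out = []
    for start in range(0, len(labels), window):
        chunk = labels[start:start + window]
        majority = Counter(chunk).most_common(1)[0][0]
        out.extend([majority] * len(chunk))
    return out


def mismatches(predicted, truth):
    """Count samples whose predicted label differs from the true label."""
    return sum(p != t for p, t in zip(predicted, truth))


# Hypothetical per-sample ground truth: the activity changes mid-window.
TRUTH = ["walk"] * 5 + ["sit"] * 3 + ["walk"] * 4

if __name__ == "__main__":
    for w in (1, 4, 6):
        print(w, mismatches(window_labels(TRUTH, w), TRUTH))
```

With a window of one sample the labeling is dense and nothing is lost; with larger windows, samples near an activity boundary inherit the wrong label, and the error depends on how the window size happens to align with the boundaries.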

Expected Policy Gradients

Title Expected Policy Gradients
Authors Kamil Ciosek, Shimon Whiteson
Abstract We propose expected policy gradients (EPG), which unify stochastic policy gradients (SPG) and deterministic policy gradients (DPG) for reinforcement learning. Inspired by expected sarsa, EPG integrates across the action when estimating the gradient, instead of relying only on the action in the sampled trajectory. We establish a new general policy gradient theorem, of which the stochastic and deterministic policy gradient theorems are special cases. We also prove that EPG reduces the variance of the gradient estimates without requiring deterministic policies and, for the Gaussian case, with no computational overhead. Finally, we show that it is optimal in a certain sense to explore with a Gaussian policy such that the covariance is proportional to the exponential of the scaled Hessian of the critic with respect to the actions. We present empirical results confirming that this new form of exploration substantially outperforms DPG with the Ornstein-Uhlenbeck heuristic in four challenging MuJoCo domains.
Tasks
Published 2017-06-15
URL http://arxiv.org/abs/1706.05374v6
PDF http://arxiv.org/pdf/1706.05374v6.pdf
PWC https://paperswithcode.com/paper/expected-policy-gradients
Repo
Framework
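
The contrast drawn in the abstract above, between integrating over actions and relying only on the sampled action, can be sketched for a single state with a softmax policy over a few discrete actions. The discrete setting, the fixed critic values and the function names are assumptions made to keep the example small; the paper's results for Gaussian policies and its exploration scheme are not reproduced here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def sampled_gradient(logits, q_values, action):
    """Score-function estimate Q(a) * grad_logits log pi(a) for one sampled action."""
    pi = softmax(logits)
    return [
        q_values[action] * ((1.0 if k == action else 0.0) - pi[k])
        for k in range(len(pi))
    ]


def expected_gradient(logits, q_values):
    """Sum over all actions of pi(a) * Q(a) * grad_logits log pi(a).

    Because every action is weighted by its probability, no sampling
    noise from the action choice enters the estimate.
    """
    pi = softmax(logits)
    grad = [0.0] * len(pi)
    for action, prob in enumerate(pi):
        g = sampled_gradient(logits, q_values, action)
        grad = [acc + prob * gk for acc, gk in zip(grad, g)]
    return grad


if __name__ == "__main__":
    logits, q_values = [0.0, 0.0], [1.0, 3.0]
    print(sampled_gradient(logits, q_values, 0))
    print(sampled_gradient(logits, q_values, 1))
    print(expected_gradient(logits, q_values))
```

The two single-sample estimates differ from each other, while the summed version returns their probability-weighted mean directly, which is the variance reduction the abstract refers to in this simplified setting.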

The Deep Ritz method: A deep learning-based numerical algorithm for solving variational problems

Title The Deep Ritz method: A deep learning-based numerical algorithm for solving variational problems
Authors Weinan E, Bing Yu
Abstract We propose a deep learning based method, the Deep Ritz Method, for numerically solving variational problems, particularly the ones that arise from partial differential equations. The Deep Ritz method is naturally nonlinear, naturally adaptive and has the potential to work in rather high dimensions. The framework is quite simple and fits well with the stochastic gradient descent method used in deep learning. We illustrate the method on several problems including some eigenvalue problems.
Tasks
Published 2017-09-30
URL http://arxiv.org/abs/1710.00211v1
PDF http://arxiv.org/pdf/1710.00211v1.pdf
PWC https://paperswithcode.com/paper/the-deep-ritz-method-a-deep-learning-based
Repo
Framework
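
The variational idea in the abstract above can be sketched on a one-dimensional problem: minimize the energy I(u) = integral over [0, 1] of 0.5 * u'(x)^2 - f(x) * u(x) with stochastic gradient descent on randomly sampled points. As loud assumptions: the deep network is replaced by a one-parameter trial function u(x) = c * x * (1 - x), the right-hand side is f(x) = 2, and the step size, batch size and names are arbitrary choices. For this setup the exact solution of -u'' = 2 with u(0) = u(1) = 0 is u(x) = x * (1 - x), so the minimizer is c = 1.

```python
import random


def ritz_energy_gradient(c, xs):
    """Monte Carlo estimate of dI/dc for the trial function u(x) = c*x*(1-x), f = 2.

    The integrand is 0.5*u'(x)^2 - 2*u(x); its derivative with respect to c
    is c*(1-2x)^2 - 2*x*(1-x).
    """
    total = 0.0
    for x in xs:
        du_dx_per_c = 1.0 - 2.0 * x  # u'(x) / c
        u_per_c = x * (1.0 - x)      # u(x) / c
        total += c * du_dx_per_c ** 2 - 2.0 * u_per_c
    return total / len(xs)


def fit(steps=1000, batch=64, lr=0.05, seed=0):
    """Plain SGD on the sampled energy, starting from c = 0."""
    rng = random.Random(seed)
    c = 0.0
    for _ in range(steps):
        xs = [rng.random() for _ in range(batch)]  # uniform samples on [0, 1]
        c -= lr * ritz_energy_gradient(c, xs)
    return c


if __name__ == "__main__":
    print(fit())
```

The trial function satisfies the boundary conditions by construction, which sidesteps the question of how boundary conditions are enforced; the printed value should land close to 1, with some residual noise from the sampling.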

Iterative Deep Learning for Network Topology Extraction

Title Iterative Deep Learning for Network Topology Extraction
Authors Carles Ventura, Jordi Pont-Tuset, Sergi Caelles, Kevis-Kokitsi Maninis, Luc Van Gool
Abstract This paper tackles the task of estimating the topology of filamentary networks such as retinal vessels and road networks. Building on top of a global model that performs a dense semantical classification of the pixels of the image, we design a Convolutional Neural Network (CNN) that predicts the local connectivity between the central pixel of an input patch and its border points. By iterating this local connectivity we sweep the whole image and infer the global topology of the filamentary network, inspired by a human delineating a complex network with the tip of their finger. We perform an extensive and comprehensive qualitative and quantitative evaluation on two tasks: retinal veins and arteries topology extraction and road network estimation. In both cases, represented by two publicly available datasets (DRIVE and Massachusetts Roads), we show superior performance to very strong baselines.
Tasks
Published 2017-12-04
URL http://arxiv.org/abs/1712.01217v1
PDF http://arxiv.org/pdf/1712.01217v1.pdf
PWC https://paperswithcode.com/paper/iterative-deep-learning-for-network-topology
Repo
Framework

Revisiting Deep Intrinsic Image Decompositions

Title Revisiting Deep Intrinsic Image Decompositions
Authors Qingnan Fan, Jiaolong Yang, Gang Hua, Baoquan Chen, David Wipf
Abstract While invaluable for many computer vision applications, decomposing a natural image into intrinsic reflectance and shading layers represents a challenging, underdetermined inverse problem. As opposed to strict reliance on conventional optimization or filtering solutions with strong prior assumptions, deep learning based approaches have also been proposed to compute intrinsic image decompositions when granted access to sufficient labeled training data. The downside is that current data sources are quite limited, and broadly speaking fall into one of two categories: either dense fully-labeled images in synthetic/narrow settings, or weakly-labeled data from relatively diverse natural scenes. In contrast to many previous learning-based approaches, which are often tailored to the structure of a particular dataset (and may not work well on others), we adopt core network structures that universally reflect loose prior knowledge regarding the intrinsic image formation process and can be largely shared across datasets. We then apply flexibly supervised loss layers that are customized for each source of ground truth labels. The resulting deep architecture achieves state-of-the-art results on all of the major intrinsic image benchmarks, and runs considerably faster than most at test time.
Tasks
Published 2017-01-11
URL http://arxiv.org/abs/1701.02965v8
PDF http://arxiv.org/pdf/1701.02965v8.pdf
PWC https://paperswithcode.com/paper/revisiting-deep-intrinsic-image
Repo
Framework

Modeling Past and Future for Neural Machine Translation

Title Modeling Past and Future for Neural Machine Translation
Authors Zaixiang Zheng, Hao Zhou, Shujian Huang, Lili Mou, Xinyu Dai, Jiajun Chen, Zhaopeng Tu
Abstract Existing neural machine translation systems do not explicitly model what has been translated and what has not during the decoding phase. To address this problem, we propose a novel mechanism that separates the source information into two parts: translated Past contents and untranslated Future contents, which are modeled by two additional recurrent layers. The Past and Future contents are fed to both the attention model and the decoder states, which offers NMT systems the knowledge of translated and untranslated contents. Experimental results show that the proposed approach significantly improves translation performance in Chinese-English, German-English and English-German translation tasks. Specifically, the proposed model outperforms the conventional coverage model in both translation quality and alignment error rate.
Tasks Machine Translation
Published 2017-11-27
URL http://arxiv.org/abs/1711.09502v2
PDF http://arxiv.org/pdf/1711.09502v2.pdf
PWC https://paperswithcode.com/paper/modeling-past-and-future-for-neural-machine
Repo
Framework

A statistical physics approach to learning curves for the Inverse Ising problem

Title A statistical physics approach to learning curves for the Inverse Ising problem
Authors Ludovica Bachschmid-Romano, Manfred Opper
Abstract Using methods of statistical physics, we analyse the error of learning couplings in large Ising models from independent data (the inverse Ising problem). We concentrate on learning based on local cost functions, such as the pseudo-likelihood method for which the couplings are inferred independently for each spin. Assuming that the data are generated from a true Ising model, we compute the reconstruction error of the couplings using a combination of the replica method with the cavity approach for densely connected systems. We show that an explicit estimator based on a quadratic cost function achieves minimal reconstruction error, but requires the length of the true coupling vector as prior knowledge. A simple mean field estimator of the couplings which does not need such knowledge is asymptotically optimal, i.e. when the number of observations is much larger than the number of spins. Comparison of the theory with numerical simulations shows excellent agreement for data generated from two models with random couplings in the high temperature region: a model with independent couplings (Sherrington-Kirkpatrick model), and a model where the matrix of couplings has a Wishart distribution.
Tasks
Published 2017-05-15
URL http://arxiv.org/abs/1705.05403v1
PDF http://arxiv.org/pdf/1705.05403v1.pdf
PWC https://paperswithcode.com/paper/a-statistical-physics-approach-to-learning
Repo
Framework

On Coordinate Minimization of Convex Piecewise-Affine Functions

Title On Coordinate Minimization of Convex Piecewise-Affine Functions
Authors Tomas Werner
Abstract A popular class of algorithms to optimize the dual LP relaxation of the discrete energy minimization problem (a.k.a. MAP inference in graphical models or valued constraint satisfaction) are convergent message-passing algorithms, such as max-sum diffusion, TRW-S, MPLP and SRMP. These algorithms are successful in practice, despite the fact that they are a version of coordinate minimization applied to a convex piecewise-affine function, which is not guaranteed to converge to a global minimizer. These algorithms converge only to a local minimizer, characterized by local consistency known from constraint programming. We generalize max-sum diffusion to a version of coordinate minimization applicable to an arbitrary convex piecewise-affine function, which converges to a local consistency condition. This condition can be seen as the sign relaxation of the global optimality condition.
Tasks
Published 2017-09-14
URL http://arxiv.org/abs/1709.04989v1
PDF http://arxiv.org/pdf/1709.04989v1.pdf
PWC https://paperswithcode.com/paper/on-coordinate-minimization-of-convex
Repo
Framework
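
The claim in the abstract above, that coordinate minimization of a convex piecewise-affine function is not guaranteed to reach a global minimizer, can be checked on a two-variable toy function. The function f(x, y) = |x - y| + 0.5 * |x + y| and the helper names are assumptions chosen for illustration; this is plain coordinate minimization, not max-sum diffusion or the generalization the paper proposes.

```python
def f(x, y):
    """Convex piecewise-affine toy function with global minimum f(0, 0) = 0."""
    return abs(x - y) + 0.5 * abs(x + y)


def minimize_coordinate(other):
    """Exact minimizer of t -> f(t, other).

    The one-dimensional restriction is convex and piecewise-affine with
    breakpoints at t = other and t = -other, so its minimum is attained at
    one of them. f is symmetric in its arguments, so the same routine
    serves for both coordinates.
    """
    candidates = [other, -other]
    return min(candidates, key=lambda t: f(t, other))


def coordinate_minimization(x, y, sweeps=10):
    """Alternate exact minimization over x and over y."""
    for _ in range(sweeps):
        x = minimize_coordinate(y)
        y = minimize_coordinate(x)
    return x, y


if __name__ == "__main__":
    x, y = coordinate_minimization(1.0, 1.0)
    print((x, y), f(x, y), f(0.0, 0.0))
```

Started at (1, 1), the iteration never moves: changing either coordinate alone increases f, even though moving both together along the diagonal toward the origin would decrease it. The point is a coordinate-wise minimum with value 1, while the global minimum is 0.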

Incremental Learning for Robot Perception through HRI

Title Incremental Learning for Robot Perception through HRI
Authors Sepehr Valipour, Camilo Perez, Martin Jagersand
Abstract Scene understanding and object recognition are difficult to achieve yet crucial skills for robots. Recently, Convolutional Neural Networks (CNNs) have shown success in this task. However, there is still a gap between their performance on image datasets and real-world robotics scenarios. We present a novel paradigm for incrementally improving a robot’s visual perception through active human interaction. In this paradigm, the user introduces novel objects to the robot by means of pointing and voice commands. Given this information, the robot visually explores the object and adds images from it to re-train the perception module. Our base perception module is based on recent developments in object detection and recognition using deep learning. Our method leverages state of the art CNNs from off-line batch learning, human guidance, robot exploration and incremental on-line learning.
Tasks Object Detection, Object Recognition, Scene Understanding
Published 2017-01-17
URL http://arxiv.org/abs/1701.04693v1
PDF http://arxiv.org/pdf/1701.04693v1.pdf
PWC https://paperswithcode.com/paper/incremental-learning-for-robot-perception
Repo
Framework

Design Automation for Binarized Neural Networks: A Quantum Leap Opportunity?

Title Design Automation for Binarized Neural Networks: A Quantum Leap Opportunity?
Authors Manuele Rusci, Lukas Cavigelli, Luca Benini
Abstract Design automation in general, and in particular logic synthesis, can play a key role in enabling the design of application-specific Binarized Neural Networks (BNN). This paper presents the hardware design and synthesis of a purely combinational BNN for ultra-low power near-sensor processing. We leverage the major opportunities raised by BNN models, which consist mostly of logical bit-wise operations and integer counting and comparisons, for pushing ultra-low power deep learning circuits close to the sensor and coupling them with binarized mixed-signal image sensor data. We analyze area, power and energy metrics of BNNs synthesized as combinational networks. Our synthesis results in GlobalFoundries 22nm SOI technology show a silicon area of 2.61 mm² for implementing a combinational BNN with 32x32 binary input sensor receptive field and weight parameters fixed at design time. This is 2.2x smaller than a synthesized network with re-configurable parameters. With respect to other comparable techniques for deep learning near-sensor processing, our approach features a 10x higher energy efficiency.
Tasks
Published 2017-11-21
URL http://arxiv.org/abs/1712.01743v1
PDF http://arxiv.org/pdf/1712.01743v1.pdf
PWC https://paperswithcode.com/paper/design-automation-for-binarized-neural
Repo
Framework
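
The abstract above notes that BNN models consist mostly of bit-wise operations, integer counting and comparisons. A minimal software sketch of one binarized neuron under a common encoding (bit 1 for +1, bit 0 for -1) is given below; the encoding, the threshold form and the function name are assumptions for illustration, and the sketch says nothing about the paper's circuit design.

```python
def binarized_neuron(inputs, weights, n_bits, threshold):
    """One binarized neuron: XNOR, popcount, then an integer comparison.

    `inputs` and `weights` are integers whose low `n_bits` bits encode
    +1 as bit 1 and -1 as bit 0 (an assumed encoding). The popcount of
    the XNOR is the number of positions where input and weight agree.
    """
    mask = (1 << n_bits) - 1
    agreements = bin(~(inputs ^ weights) & mask).count("1")
    return 1 if agreements >= threshold else 0


if __name__ == "__main__":
    # Input and weight agree in three of four positions.
    print(binarized_neuron(0b1011, 0b1001, n_bits=4, threshold=3))
    print(binarized_neuron(0b1011, 0b1001, n_bits=4, threshold=4))
```

With this encoding the dot product of the two ±1 vectors equals twice the number of agreements minus `n_bits`, so comparing the agreement count with a threshold is equivalent to thresholding the dot product.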