May 5, 2019


Paper Group ANR 457

Learning recurrent representations for hierarchical behavior modeling. A framework for mining process models from emails logs. Nighttime Haze Removal with Illumination Correction. ReasoNet: Learning to Stop Reading in Machine Comprehension. Fine-scale Surface Normal Estimation using a Single NIR Image. A System for Probabilistic Linking of Thesauri …

Learning recurrent representations for hierarchical behavior modeling


Title	Learning recurrent representations for hierarchical behavior modeling
Authors	Eyrun Eyjolfsdottir, Kristin Branson, Yisong Yue, Pietro Perona
Abstract	We propose a framework for detecting action patterns from motion sequences and modeling the sensory-motor relationship of animals, using a generative recurrent neural network. The network has a discriminative part (classifying actions) and a generative part (predicting motion), whose recurrent cells are laterally connected, allowing higher levels of the network to represent high level phenomena. We test our framework on two types of data, fruit fly behavior and online handwriting. Our results show that 1) taking advantage of unlabeled sequences, by predicting future motion, significantly improves action detection performance when training labels are scarce, 2) the network learns to represent high level phenomena such as writer identity and fly gender, without supervision, and 3) simulated motion trajectories, generated by treating motion prediction as input to the network, look realistic and may be used to qualitatively evaluate whether the model has learnt generative control rules.
Tasks	Action Detection, Motion Prediction
Published	2016-11-01
URL	http://arxiv.org/abs/1611.00094v3
PDF	http://arxiv.org/pdf/1611.00094v3.pdf
PWC	https://paperswithcode.com/paper/learning-recurrent-representations-for
Repo
Framework

A framework for mining process models from emails logs


Title	A framework for mining process models from emails logs
Authors	Diana Jlailaty, Daniela Grigori, Khalid Belhajjame
Abstract	Due to its wide use in personal, but most importantly, professional contexts, email represents a valuable source of information that can be harvested for understanding, reengineering and repurposing undocumented business processes of companies and institutions. Towards this aim, a few researchers have investigated the problem of extracting process-oriented information from email logs in order to take advantage of the many available process mining techniques and tools. In this paper we go further in this direction by proposing a new method for mining process models from email logs that leverages unsupervised machine learning techniques with little human involvement. Moreover, our method allows emails to be semi-automatically labeled with activity names, which can then be used for activity recognition in new incoming emails. A use case demonstrates the usefulness of the proposed solution using a modestly sized, yet real-world, dataset containing emails that belong to two different process models.
Tasks	Activity Recognition
Published	2016-09-20
URL	http://arxiv.org/abs/1609.06127v1
PDF	http://arxiv.org/pdf/1609.06127v1.pdf
PWC	https://paperswithcode.com/paper/a-framework-for-mining-process-models-from
Repo
Framework

Nighttime Haze Removal with Illumination Correction


Title	Nighttime Haze Removal with Illumination Correction
Authors	Jing Zhang, Yang Cao, Zengfu Wang
Abstract	Haze removal is important for computational photography and computer vision applications. However, most existing dehazing methods are designed for daytime images and do not always work well at nighttime. Unlike daytime imaging conditions, images captured in nighttime haze conditions may suffer from non-uniform illumination due to artificial light sources, and exhibit low brightness/contrast and color distortion. In this paper, we present a new nighttime hazy imaging model that takes into account both the non-uniform illumination from artificial light sources and the scattering and attenuation effects of haze. Accordingly, we propose an efficient dehazing algorithm for nighttime hazy images. The proposed algorithm includes three sequential steps. i) It enhances the overall brightness by performing a gamma correction step after estimating the illumination from the original image. ii) Then it achieves a color-balanced result by performing a color correction step after estimating the color characteristics of the incident light. iii) Finally, it removes the haze effect by applying the dark channel prior and estimating the point-wise environmental light based on the previous illumination-balanced result. Experimental results show that the proposed algorithm can achieve illumination-balanced and haze-free results with good color rendition ability.
Tasks
Published	2016-06-05
URL	http://arxiv.org/abs/1606.01460v1
PDF	http://arxiv.org/pdf/1606.01460v1.pdf
PWC	https://paperswithcode.com/paper/nighttime-haze-removal-with-illumination
Repo
Framework
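
The abstract above names two standard building blocks, gamma correction and the dark channel prior. The following is a minimal sketch of those two generic operations only, not the authors' implementation: the illumination estimation, color correction, and environmental light estimation steps are omitted, and the function names, the window radius, and the nested-list image representation are assumptions made for illustration.

```python
def gamma_correct(value, gamma):
    """Apply gamma correction to a single intensity in [0, 1]."""
    return value ** gamma


def dark_channel(image, radius=1):
    """Compute the dark channel of an RGB image.

    image: list of rows, each row a list of (r, g, b) tuples in [0, 1].
    For every pixel, take the minimum over the three color channels and
    over a square window of side 2 * radius + 1 (clipped at the borders).
    """
    height, width = len(image), len(image[0])
    # Minimum over the color channels at each pixel.
    min_rgb = [[min(pixel) for pixel in row] for row in image]
    result = []
    for y in range(height):
        out_row = []
        for x in range(width):
            # Minimum over the local window around (x, y).
            window = [
                min_rgb[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            out_row.append(min(window))
        result.append(out_row)
    return result
```

A gamma below 1 brightens dark values, which is why gamma correction is a natural fit for the brightness enhancement step; the dark channel tends to be close to zero in haze-free regions, which is the observation the dark channel prior relies on.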

ReasoNet: Learning to Stop Reading in Machine Comprehension


Title	ReasoNet: Learning to Stop Reading in Machine Comprehension
Authors	Yelong Shen, Po-Sen Huang, Jianfeng Gao, Weizhu Chen
Abstract	Teaching a computer to read and answer general questions pertaining to a document is a challenging yet unsolved problem. In this paper, we describe a novel neural network architecture called the Reasoning Network (ReasoNet) for machine comprehension tasks. ReasoNets make use of multiple turns to effectively exploit and then reason over the relation among queries, documents, and answers. Different from previous approaches using a fixed number of turns during inference, ReasoNets introduce a termination state to relax this constraint on the reasoning depth. With the use of reinforcement learning, ReasoNets can dynamically determine whether to continue the comprehension process after digesting intermediate results, or to terminate reading when it concludes that existing information is adequate to produce an answer. ReasoNets have achieved exceptional performance in machine comprehension datasets, including unstructured CNN and Daily Mail datasets, the Stanford SQuAD dataset, and a structured Graph Reachability dataset.
Tasks	Question Answering, Reading Comprehension
Published	2016-09-17
URL	http://arxiv.org/abs/1609.05284v3
PDF	http://arxiv.org/pdf/1609.05284v3.pdf
PWC	https://paperswithcode.com/paper/reasonet-learning-to-stop-reading-in-machine
Repo
Framework

Fine-scale Surface Normal Estimation using a Single NIR Image


Title	Fine-scale Surface Normal Estimation using a Single NIR Image
Authors	Youngjin Yoon, Gyeongmin Choe, Namil Kim, Joon-Young Lee, In So Kweon
Abstract	We present surface normal estimation using a single near infrared (NIR) image. We focus on fine-scale surface geometry captured with an uncalibrated light source. To tackle this ill-posed problem, we adopt a generative adversarial network, which is effective in recovering the sharp output that is essential for fine-scale surface normal estimation. We incorporate angular error and an integrability constraint into the objective function of the network to make the estimated normals physically meaningful. We train and validate our network on a recent NIR dataset, and also evaluate the generality of our trained model by using new external datasets which are captured with a different camera in a different environment.
Tasks
Published	2016-03-24
URL	http://arxiv.org/abs/1603.07475v1
PDF	http://arxiv.org/pdf/1603.07475v1.pdf
PWC	https://paperswithcode.com/paper/fine-scale-surface-normal-estimation-using-a
Repo
Framework
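
The abstract above mentions an angular error term between estimated and reference normals. As a rough illustration of that quantity, here is a minimal sketch of the angle between two 3D normal vectors. The exact form of the loss used in the paper is not given in the abstract, so the function name and the use of the plain arccosine of the normalized dot product are assumptions.

```python
import math


def angular_error(normal_a, normal_b):
    """Return the angle in radians between two nonzero 3D vectors."""
    dot = sum(a * b for a, b in zip(normal_a, normal_b))
    norm_a = math.sqrt(sum(a * a for a in normal_a))
    norm_b = math.sqrt(sum(b * b for b in normal_b))
    cosine = dot / (norm_a * norm_b)
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, cosine))
    return math.acos(cosine)
```

Because both vectors are normalized inside the function, the result depends only on direction, so unnormalized network outputs can be compared directly.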

A System for Probabilistic Linking of Thesauri and Classification Systems


Title	A System for Probabilistic Linking of Thesauri and Classification Systems
Authors	Lisa Posch, Philipp Schaer, Arnim Bleier, Markus Strohmaier
Abstract	This paper presents a system which creates and visualizes probabilistic semantic links between concepts in a thesaurus and classes in a classification system. For creating the links, we build on the Polylingual Labeled Topic Model (PLL-TM). PLL-TM identifies probable thesaurus descriptors for each class in the classification system by using information from the natural language text of documents, their assigned thesaurus descriptors and their designated classes. The links are then presented to users of the system in an interactive visualization, providing them with an automatically generated overview of the relations between the thesaurus and the classification system.
Tasks
Published	2016-03-21
URL	http://arxiv.org/abs/1603.06485v1
PDF	http://arxiv.org/pdf/1603.06485v1.pdf
PWC	https://paperswithcode.com/paper/a-system-for-probabilistic-linking-of
Repo
Framework

Resource Planning For Rescue Operations


Title	Resource Planning For Rescue Operations
Authors	Mona Khaffaf, Arshia Khaffaf
Abstract	After an earthquake, disaster sites pose a multitude of health and safety concerns. Rescuing people trapped in the ruins after an earthquake requires a series of intelligent behaviors, including planning. Given a limited number of available actions and regulations, planning is crucial to a successful rescue operation. Fortunately, recent developments in automated planning by the artificial intelligence community can help different organizations in this crucial task. Due to the number of rules and regulations, we believe that a rule-based system for planning can be helpful for this specific planning problem. In this research work, we use logic rules to represent rescue and related regular regulations, together with a logic-based planner to solve this complicated problem. Although this research is still in the prototyping and modeling stage, it clearly shows that rule-based languages can be a good infrastructure for this computational task. The results of this research can be used by different organizations, such as the Iranian Red Crescent Society and the International Institute of Seismology and Earthquake Engineering (IISEE).
Tasks
Published	2016-07-14
URL	http://arxiv.org/abs/1607.03979v1
PDF	http://arxiv.org/pdf/1607.03979v1.pdf
PWC	https://paperswithcode.com/paper/resource-planning-for-rescue-operations
Repo
Framework

Transitive Hashing Network for Heterogeneous Multimedia Retrieval


Title	Transitive Hashing Network for Heterogeneous Multimedia Retrieval
Authors	Zhangjie Cao, Mingsheng Long, Qiang Yang
Abstract	Hashing has been widely applied to large-scale multimedia retrieval due to its storage and retrieval efficiency. Cross-modal hashing enables efficient retrieval from a database of one modality in response to a query of another modality. Existing work on cross-modal hashing assumes a heterogeneous relationship across modalities for hash function learning. In this paper, we relax this strong assumption by only requiring such a heterogeneous relationship in an auxiliary dataset different from the query/database domain. We craft a hybrid deep architecture to simultaneously learn the cross-modal correlation from the auxiliary dataset and align the dataset distributions between the auxiliary dataset and the query/database domain, which generates transitive hash codes for heterogeneous multimedia retrieval. Extensive experiments show that the proposed approach yields state-of-the-art multimedia retrieval performance on public datasets, i.e. NUS-WIDE, ImageNet-YahooQA.
Tasks
Published	2016-08-15
URL	http://arxiv.org/abs/1608.04307v1
PDF	http://arxiv.org/pdf/1608.04307v1.pdf
PWC	https://paperswithcode.com/paper/transitive-hashing-network-for-heterogeneous
Repo
Framework

Grounded Lexicon Acquisition - Case Studies in Spatial Language


Title	Grounded Lexicon Acquisition - Case Studies in Spatial Language
Authors	Michael Spranger
Abstract	This paper discusses grounded acquisition experiments of increasing complexity. Humanoid robots acquire English spatial lexicons from robot tutors. We identify how various spatial language systems, such as projective, absolute and proximal can be learned. The proposed learning mechanisms do not rely on direct meaning transfer or direct access to world models of interlocutors. Finally, we show how multiple systems can be acquired at the same time.
Tasks
Published	2016-07-26
URL	http://arxiv.org/abs/1607.07630v1
PDF	http://arxiv.org/pdf/1607.07630v1.pdf
PWC	https://paperswithcode.com/paper/grounded-lexicon-acquisition-case-studies-in
Repo
Framework

One-Class Slab Support Vector Machine


Title	One-Class Slab Support Vector Machine
Authors	Victor Fragoso, Walter Scheirer, Joao Hespanha, Matthew Turk
Abstract	This work introduces the one-class slab SVM (OCSSVM), a one-class classifier that aims at improving the performance of the one-class SVM. The proposed strategy reduces the false positive rate and increases the accuracy of detecting instances from novel classes. To this end, it uses two parallel hyperplanes to learn the normal region of the decision scores of the target class. OCSSVM extends one-class SVM since it can scale and learn non-linear decision functions via kernel methods. The experiments on two publicly available datasets show that OCSSVM can consistently outperform the one-class SVM and perform comparably to or better than other state-of-the-art one-class classifiers.
Tasks	One-class classifier
Published	2016-08-02
URL	http://arxiv.org/abs/1608.01026v1
PDF	http://arxiv.org/pdf/1608.01026v1.pdf
PWC	https://paperswithcode.com/paper/one-class-slab-support-vector-machine
Repo
Framework
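
To make the "two parallel hyperplanes" idea from the abstract above concrete, here is a minimal sketch of a linear slab decision rule compared with a one-sided rule. It only illustrates the geometry; it does not show how the paper learns the hyperplanes, and the names `weights`, `lower`, and `upper` are hypothetical parameters assumed to be already fitted.

```python
def decision_score(weights, point):
    """Linear decision score: the dot product of weights and point."""
    return sum(w * x for w, x in zip(weights, point))


def one_sided_decision(weights, point, lower):
    """Accept the point if its score is at least the single threshold."""
    return decision_score(weights, point) >= lower


def slab_decision(weights, point, lower, upper):
    """Accept the point only if its score lies between two thresholds.

    The thresholds correspond to two parallel hyperplanes that share
    the same normal vector, so the accepted region is a slab.
    """
    return lower <= decision_score(weights, point) <= upper
```

With weights `[1.0, 1.0]` and thresholds 0.5 and 1.5, the point `[2.0, 2.0]` has score 4.0: the one-sided rule accepts it, while the slab rule rejects it because the score exceeds the upper threshold. That extra bound is what lets a slab reject outliers with unusually high scores.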

Generalized Mirror Descents in Congestion Games


Title	Generalized Mirror Descents in Congestion Games
Authors	Po-An Chen, Chi-Jen Lu
Abstract	Different types of dynamics have been studied in repeated game play, and one of them which has received much attention recently consists of those based on “no-regret” algorithms from the area of machine learning. It is known that dynamics based on generic no-regret algorithms may not converge to Nash equilibria in general, but to a larger set of outcomes, namely coarse correlated equilibria. Moreover, convergence results based on generic no-regret algorithms typically use a weaker notion of convergence: the convergence of the average plays instead of the actual plays. Some work has been done showing that when using a specific no-regret algorithm, the well-known multiplicative updates algorithm, convergence of actual plays to equilibria can be shown and better quality of outcomes in terms of the price of anarchy can be reached for atomic congestion games and load balancing games. Are there more cases of natural no-regret dynamics that perform well in suitable classes of games in terms of convergence and quality of outcomes that the dynamics converge to? We answer this question positively in the bulletin-board model by showing that when employing the mirror-descent algorithm, a well-known generic no-regret algorithm, the actual plays converge quickly to equilibria in nonatomic congestion games. Furthermore, the bandit model considers a probably more realistic and prevalent setting with only partial information, in which at each time step each player only knows the cost of her own currently played strategy, but not any costs of unplayed strategies. For the class of atomic congestion games, we propose a family of bandit algorithms based on the mirror-descent algorithms previously presented, and show that when each player individually adopts such a bandit algorithm, their joint (mixed) strategy profile quickly converges with implications.
Tasks
Published	2016-05-25
URL	http://arxiv.org/abs/1605.07774v2
PDF	http://arxiv.org/pdf/1605.07774v2.pdf
PWC	https://paperswithcode.com/paper/generalized-mirror-descents-in-congestion
Repo
Framework
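
As an illustration of the kind of dynamics discussed in the abstract above, the sketch below runs mirror descent with an entropy regularizer, which reduces to the multiplicative updates rule, on a toy nonatomic congestion game with two parallel links. This is one special case chosen for brevity, not the generalized family analyzed in the paper; the link cost functions, step size, and iteration count are assumptions picked for the example.

```python
import math


def mirror_descent_step(flow, costs, step_size):
    """One entropic mirror descent step on the probability simplex.

    Each coordinate is scaled by exp(-step_size * cost) and the result
    is renormalized so that the flow still sums to one.
    """
    weights = [x * math.exp(-step_size * c) for x, c in zip(flow, costs)]
    total = sum(weights)
    return [w / total for w in weights]


def link_costs(flow):
    """Hypothetical link latencies: c1(x) = x and c2(x) = 2x."""
    return [flow[0], 2.0 * flow[1]]


def run_dynamics(flow, steps=200, step_size=0.5):
    """Iterate the update, recomputing costs from the current flow."""
    for _ in range(steps):
        flow = mirror_descent_step(flow, link_costs(flow), step_size)
    return flow
```

In this toy game the equilibrium splits the flow so that both links have equal cost, which happens at a flow of 2/3 on the first link and 1/3 on the second. Starting from an even split, `run_dynamics([0.5, 0.5])` approaches that split, and it is the actual iterate, not a time average, that converges here.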

Joint Dimensionality Reduction for Two Feature Vectors


Title	Joint Dimensionality Reduction for Two Feature Vectors
Authors	Yanjun Li, Yoram Bresler
Abstract	Many machine learning problems, especially multi-modal learning problems, have two sets of distinct features (e.g., image and text features in news story classification, or neuroimaging data and neurocognitive data in cognitive science research). This paper addresses the joint dimensionality reduction of two feature vectors in supervised learning problems. In particular, we assume a discriminative model where low-dimensional linear embeddings of the two feature vectors are sufficient statistics for predicting a dependent variable. We show that a simple algorithm involving singular value decomposition can accurately estimate the embeddings provided that certain sample complexities are satisfied, without specifying the nonlinear link function (regressor or classifier). The main results establish sample complexities under multiple settings. Sample complexities for different link functions only differ by constant factors.
Tasks	Dimensionality Reduction
Published	2016-02-13
URL	http://arxiv.org/abs/1602.04398v3
PDF	http://arxiv.org/pdf/1602.04398v3.pdf
PWC	https://paperswithcode.com/paper/joint-dimensionality-reduction-for-two
Repo
Framework

Real-Time Facial Segmentation and Performance Capture from RGB Input


Title	Real-Time Facial Segmentation and Performance Capture from RGB Input
Authors	Shunsuke Saito, Tianye Li, Hao Li
Abstract	We introduce the concept of unconstrained real-time 3D facial performance capture through explicit semantic segmentation in the RGB input. To ensure robustness, cutting edge supervised learning approaches rely on large training datasets of face images captured in the wild. While impressive tracking quality has been demonstrated for faces that are largely visible, any occlusion due to hair, accessories, or hand-to-face gestures would result in significant visual artifacts and loss of tracking accuracy. The modeling of occlusions has been mostly avoided due to its immense space of appearance variability. To address this curse of high dimensionality, we perform tracking in unconstrained images assuming non-face regions can be fully masked out. Along with recent breakthroughs in deep learning, we demonstrate that pixel-level facial segmentation is possible in real-time by repurposing convolutional neural networks designed originally for general semantic segmentation. We develop an efficient architecture based on a two-stream deconvolution network with complementary characteristics, and introduce carefully designed training samples and data augmentation strategies for improved segmentation accuracy and robustness. We adopt a state-of-the-art regression-based facial tracking framework with segmented face images as training, and demonstrate accurate and uninterrupted facial performance capture in the presence of extreme occlusion and even side views. Furthermore, the resulting segmentation can be directly used to composite partial 3D face models on the input images and enable seamless facial manipulation tasks, such as virtual make-up or face replacement.
Tasks	Data Augmentation, Semantic Segmentation
Published	2016-04-10
URL	http://arxiv.org/abs/1604.02647v1
PDF	http://arxiv.org/pdf/1604.02647v1.pdf
PWC	https://paperswithcode.com/paper/real-time-facial-segmentation-and-performance
Repo
Framework

Urban MV and LV Distribution Grid Topology Estimation via Group Lasso


Title	Urban MV and LV Distribution Grid Topology Estimation via Group Lasso
Authors	Yizheng Liao, Yang Weng, Guangyi Liu, Ram Rajagopal
Abstract	The increasing penetration of distributed energy resources poses numerous reliability issues to the urban distribution grid. Topology estimation is a critical step to ensure the robustness of distribution grid operation. However, bus connectivity and grid topology estimation are usually hard in distribution grids. For example, it is technically challenging and costly to monitor the bus connectivity in urban grids, e.g., underground lines. It is also inappropriate to use the radial topology assumption exclusively, because the grids of metropolitan cities and regions with dense loads may contain many mesh structures. To resolve these drawbacks, we propose a data-driven topology estimation method for MV and LV distribution grids that utilizes only historical smart meter measurements. In particular, a probabilistic graphical model is utilized to capture the statistical dependencies amongst bus voltages. We prove that the bus connectivity and grid topology estimation problems, in radial and mesh structures, can be formulated as a linear regression with a least absolute shrinkage regularization on grouped variables (group lasso). Simulations show highly accurate results in eight MV and LV distribution networks of different sizes and 22 topology configurations using PG&E residential smart meter data.
Tasks
Published	2016-11-06
URL	http://arxiv.org/abs/1611.01845v2
PDF	http://arxiv.org/pdf/1611.01845v2.pdf
PWC	https://paperswithcode.com/paper/urban-mv-and-lv-distribution-grid-topology
Repo
Framework
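
The abstract above formulates topology estimation as a linear regression with a group lasso penalty. A standard building block for solving such problems with proximal methods is block soft-thresholding, the proximal operator of the group norm. The sketch below shows that operator for a single group; it is a generic illustration under that assumption, not the solver used in the paper, and the function name and the threshold parameter are hypothetical.

```python
import math


def group_soft_threshold(group, threshold):
    """Proximal operator of threshold * ||group||_2 for one group.

    The whole group is shrunk toward zero by a common factor, and it is
    set exactly to zero when its Euclidean norm is at most the threshold.
    """
    norm = math.sqrt(sum(v * v for v in group))
    if norm <= threshold:
        return [0.0 for _ in group]
    scale = 1.0 - threshold / norm
    return [scale * v for v in group]
```

Because an entire group is either kept or zeroed together, the penalty selects variables group by group. In a topology setting that behavior is what allows all coefficients tied to one candidate connection to be dropped at once.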

Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection


Title	Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection
Authors	Sergey Levine, Peter Pastor, Alex Krizhevsky, Deirdre Quillen
Abstract	We describe a learning-based approach to hand-eye coordination for robotic grasping from monocular images. To learn hand-eye coordination for grasping, we trained a large convolutional neural network to predict the probability that task-space motion of the gripper will result in successful grasps, using only monocular camera images and independently of camera calibration or the current robot pose. This requires the network to observe the spatial relationship between the gripper and objects in the scene, thus learning hand-eye coordination. We then use this network to servo the gripper in real time to achieve successful grasps. To train our network, we collected over 800,000 grasp attempts over the course of two months, using between 6 and 14 robotic manipulators at any given time, with differences in camera placement and hardware. Our experimental evaluation demonstrates that our method achieves effective real-time control, can successfully grasp novel objects, and corrects mistakes by continuous servoing.
Tasks	Calibration, Robotic Grasping
Published	2016-03-07
URL	http://arxiv.org/abs/1603.02199v4
PDF	http://arxiv.org/pdf/1603.02199v4.pdf
PWC	https://paperswithcode.com/paper/learning-hand-eye-coordination-for-robotic
Repo
Framework