Paper Group ANR 457
Learning recurrent representations for hierarchical behavior modeling
Title | Learning recurrent representations for hierarchical behavior modeling |
Authors | Eyrun Eyjolfsdottir, Kristin Branson, Yisong Yue, Pietro Perona |
Abstract | We propose a framework for detecting action patterns from motion sequences and modeling the sensory-motor relationship of animals, using a generative recurrent neural network. The network has a discriminative part (classifying actions) and a generative part (predicting motion), whose recurrent cells are laterally connected, allowing higher levels of the network to represent high level phenomena. We test our framework on two types of data, fruit fly behavior and online handwriting. Our results show that 1) taking advantage of unlabeled sequences, by predicting future motion, significantly improves action detection performance when training labels are scarce, 2) the network learns to represent high level phenomena such as writer identity and fly gender, without supervision, and 3) simulated motion trajectories, generated by treating motion prediction as input to the network, look realistic and may be used to qualitatively evaluate whether the model has learnt generative control rules. |
Tasks | Action Detection, motion prediction |
Published | 2016-11-01 |
URL | http://arxiv.org/abs/1611.00094v3 |
PDF | http://arxiv.org/pdf/1611.00094v3.pdf |
PWC | https://paperswithcode.com/paper/learning-recurrent-representations-for |
Repo | |
Framework | |
A framework for mining process models from emails logs
Title | A framework for mining process models from emails logs |
Authors | Diana Jlailaty, Daniela Grigori, Khalid Belhajjame |
Abstract | Due to its wide use in personal, but most importantly, professional contexts, email represents a valuable source of information that can be harvested for understanding, reengineering and repurposing undocumented business processes of companies and institutions. Towards this aim, a few researchers investigated the problem of extracting process oriented information from email logs in order to take advantage of the many available process mining techniques and tools. In this paper we go further in this direction, by proposing a new method for mining process models from email logs that leverages unsupervised machine learning techniques with little human involvement. Moreover, our method allows emails to be semi-automatically labeled with activity names, which can be used for activity recognition in new incoming emails. A use case demonstrates the usefulness of the proposed solution using a modest in size, yet real-world, dataset containing emails that belong to two different process models. |
Tasks | Activity Recognition |
Published | 2016-09-20 |
URL | http://arxiv.org/abs/1609.06127v1 |
PDF | http://arxiv.org/pdf/1609.06127v1.pdf |
PWC | https://paperswithcode.com/paper/a-framework-for-mining-process-models-from |
Repo | |
Framework | |
Nighttime Haze Removal with Illumination Correction
Title | Nighttime Haze Removal with Illumination Correction |
Authors | Jing Zhang, Yang Cao, Zengfu Wang |
Abstract | Haze removal is important for computational photography and computer vision applications. However, most of the existing methods for dehazing are designed for daytime images, and cannot always work well in the nighttime. Different from the imaging conditions in the daytime, images captured in nighttime haze conditions may suffer from non-uniform illumination due to artificial light sources, which exhibit low brightness/contrast and color distortion. In this paper, we present a new nighttime hazy imaging model that takes into account both the non-uniform illumination from artificial light sources and the scattering and attenuation effects of haze. Accordingly, we propose an efficient dehazing algorithm for nighttime hazy images. The proposed algorithm includes three sequential steps. i) It enhances the overall brightness by performing a gamma correction step after estimating the illumination from the original image. ii) Then it achieves a color-balance result by performing a color correction step after estimating the color characteristics of the incident light. iii) Finally, it removes the haze effect by applying the dark channel prior and estimating the point-wise environmental light based on the previous illumination-balance result. Experimental results show that the proposed algorithm can achieve illumination-balance and haze-free results with good color rendition ability. |
Tasks | |
Published | 2016-06-05 |
URL | http://arxiv.org/abs/1606.01460v1 |
PDF | http://arxiv.org/pdf/1606.01460v1.pdf |
PWC | https://paperswithcode.com/paper/nighttime-haze-removal-with-illumination |
Repo | |
Framework | |
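Two of the building blocks named in the abstract above, gamma correction and the dark channel, can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract gives no formulas, so the global `gamma` value, the `patch` size, and both function names are hypothetical, and the paper's illumination estimation, color correction and point-wise environmental light steps are not reproduced. The dark channel follows its commonly used definition (minimum over color channels, then minimum over a local window).

```python
import numpy as np

def gamma_correct(image, gamma=0.5):
    """Brighten an image with values in [0, 1] using a power-law curve.

    gamma < 1 brightens; the default value is an assumption, not the paper's.
    """
    return np.clip(image, 0.0, 1.0) ** gamma

def dark_channel(image, patch=3):
    """Minimum over color channels, then minimum over a patch x patch window."""
    min_rgb = image.min(axis=2)                  # per-pixel minimum across channels
    pad = patch // 2
    padded = np.pad(min_rgb, pad, mode="edge")   # replicate borders
    h, w = min_rgb.shape
    out = np.empty_like(min_rgb)
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + patch, j:j + patch].min()
    return out
```

On a haze-free region the dark channel tends to be close to zero, which is what makes it usable as a prior; the explicit loops favor readability over speed.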
ReasoNet: Learning to Stop Reading in Machine Comprehension
Title | ReasoNet: Learning to Stop Reading in Machine Comprehension |
Authors | Yelong Shen, Po-Sen Huang, Jianfeng Gao, Weizhu Chen |
Abstract | Teaching a computer to read and answer general questions pertaining to a document is a challenging yet unsolved problem. In this paper, we describe a novel neural network architecture called the Reasoning Network (ReasoNet) for machine comprehension tasks. ReasoNets make use of multiple turns to effectively exploit and then reason over the relation among queries, documents, and answers. Different from previous approaches using a fixed number of turns during inference, ReasoNets introduce a termination state to relax this constraint on the reasoning depth. With the use of reinforcement learning, ReasoNets can dynamically determine whether to continue the comprehension process after digesting intermediate results, or to terminate reading when it concludes that existing information is adequate to produce an answer. ReasoNets have achieved exceptional performance in machine comprehension datasets, including unstructured CNN and Daily Mail datasets, the Stanford SQuAD dataset, and a structured Graph Reachability dataset. |
Tasks | Question Answering, Reading Comprehension |
Published | 2016-09-17 |
URL | http://arxiv.org/abs/1609.05284v3 |
PDF | http://arxiv.org/pdf/1609.05284v3.pdf |
PWC | https://paperswithcode.com/paper/reasonet-learning-to-stop-reading-in-machine |
Repo | |
Framework | |
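The idea of a termination state that ends reading after a variable number of turns can be illustrated with a toy loop. This is a minimal sketch, not the ReasoNet architecture: the gate weights `w_stop`, the attention-style update, and the function name are all hypothetical, nothing is learned, and the reinforcement learning used in the paper to train the stopping decision is not shown.

```python
import numpy as np

def read_until_stop(state, memory, w_stop, max_steps=5, rng=None):
    """Take attention "turns" over `memory` until a termination gate fires.

    All parameters are hypothetical stand-ins for learned components.
    Returns the final state and the number of gate checks performed.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    for step in range(1, max_steps + 1):
        # Termination gate: probability of stopping given the current state.
        p_stop = 1.0 / (1.0 + np.exp(-(w_stop @ state)))
        if rng.random() < p_stop:
            return state, step
        # Otherwise take another turn: attend over memory and update the state.
        scores = memory @ state
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        state = np.tanh(state + weights @ memory)
    return state, max_steps
```

With a strongly positive gate the loop stops on the first check; with a strongly negative gate it runs until `max_steps`, which mirrors the fixed-depth behavior the abstract contrasts against.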
Fine-scale Surface Normal Estimation using a Single NIR Image
Title | Fine-scale Surface Normal Estimation using a Single NIR Image |
Authors | Youngjin Yoon, Gyeongmin Choe, Namil Kim, Joon-Young Lee, In So Kweon |
Abstract | We present surface normal estimation using a single near infrared (NIR) image. We are focusing on fine-scale surface geometry captured with an uncalibrated light source. To tackle this ill-posed problem, we adopt a generative adversarial network which is effective in recovering a sharp output, which is also essential for fine-scale surface normal estimation. We incorporate an angular error and an integrability constraint into the objective function of the network to make estimated normals physically meaningful. We train and validate our network on a recent NIR dataset, and also evaluate the generality of our trained model by using new external datasets which are captured with a different camera in a different environment. |
Tasks | |
Published | 2016-03-24 |
URL | http://arxiv.org/abs/1603.07475v1 |
PDF | http://arxiv.org/pdf/1603.07475v1.pdf |
PWC | https://paperswithcode.com/paper/fine-scale-surface-normal-estimation-using-a |
Repo | |
Framework | |
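The angular error mentioned in the abstract above is commonly computed as the angle between predicted and reference normal vectors. The sketch below uses that common definition; the exact loss term used in the paper is not given in the abstract, so the function name and the `eps` guard are assumptions, and the integrability constraint is not covered.

```python
import numpy as np

def angular_error_deg(pred, target, eps=1e-8):
    """Angle in degrees between normal vectors stored along the last axis."""
    # Normalize both sets of vectors, guarding against zero-length inputs.
    pred_u = pred / np.maximum(np.linalg.norm(pred, axis=-1, keepdims=True), eps)
    target_u = target / np.maximum(np.linalg.norm(target, axis=-1, keepdims=True), eps)
    # Clip the cosine so rounding error cannot push arccos out of its domain.
    cos = np.clip(np.sum(pred_u * target_u, axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos))
```

Because the function works along the last axis, it accepts a single pair of vectors or a full `H x W x 3` normal map.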
A System for Probabilistic Linking of Thesauri and Classification Systems
Title | A System for Probabilistic Linking of Thesauri and Classification Systems |
Authors | Lisa Posch, Philipp Schaer, Arnim Bleier, Markus Strohmaier |
Abstract | This paper presents a system which creates and visualizes probabilistic semantic links between concepts in a thesaurus and classes in a classification system. For creating the links, we build on the Polylingual Labeled Topic Model (PLL-TM). PLL-TM identifies probable thesaurus descriptors for each class in the classification system by using information from the natural language text of documents, their assigned thesaurus descriptors and their designated classes. The links are then presented to users of the system in an interactive visualization, providing them with an automatically generated overview of the relations between the thesaurus and the classification system. |
Tasks | |
Published | 2016-03-21 |
URL | http://arxiv.org/abs/1603.06485v1 |
PDF | http://arxiv.org/pdf/1603.06485v1.pdf |
PWC | https://paperswithcode.com/paper/a-system-for-probabilistic-linking-of |
Repo | |
Framework | |
Resource Planning For Rescue Operations
Title | Resource Planning For Rescue Operations |
Authors | Mona Khaffaf, Arshia Khaffaf |
Abstract | After an earthquake, disaster sites pose a multitude of health and safety concerns. A rescue operation of people trapped in the ruins after an earthquake disaster requires a series of intelligent behaviors, including planning. For a successful rescue operation, given a limited number of available actions and regulations, the role of planning in rescue operations is crucial. Fortunately, recent developments in automated planning by the artificial intelligence community can help different organizations in this crucial task. Due to the number of rules and regulations, we believe that a rule based system for planning can be helpful for this specific planning problem. In this research work, we use logic rules to represent rescue and related regular regulations, together with a logic based planner to solve this complicated problem. Although this research is still in the prototyping and modeling stage, it clearly shows that rule based languages can be a good infrastructure for this computational task. The results of this research can be used by different organizations, such as the Iranian Red Crescent Society and the International Institute of Seismology and Earthquake Engineering (IISEE). |
Tasks | |
Published | 2016-07-14 |
URL | http://arxiv.org/abs/1607.03979v1 |
PDF | http://arxiv.org/pdf/1607.03979v1.pdf |
PWC | https://paperswithcode.com/paper/resource-planning-for-rescue-operations |
Repo | |
Framework | |
Transitive Hashing Network for Heterogeneous Multimedia Retrieval
Title | Transitive Hashing Network for Heterogeneous Multimedia Retrieval |
Authors | Zhangjie Cao, Mingsheng Long, Qiang Yang |
Abstract | Hashing has been widely applied to large-scale multimedia retrieval due to its storage and retrieval efficiency. Cross-modal hashing enables efficient retrieval from a database of one modality in response to a query of another modality. Existing work on cross-modal hashing assumes a heterogeneous relationship across modalities for hash function learning. In this paper, we relax the strong assumption by only requiring such a heterogeneous relationship in an auxiliary dataset different from the query/database domain. We craft a hybrid deep architecture to simultaneously learn the cross-modal correlation from the auxiliary dataset, and align the dataset distributions between the auxiliary dataset and the query/database domain, which generates transitive hash codes for heterogeneous multimedia retrieval. Extensive experiments show that the proposed approach yields state-of-the-art multimedia retrieval performance on public datasets, i.e. NUS-WIDE, ImageNet-YahooQA. |
Tasks | |
Published | 2016-08-15 |
URL | http://arxiv.org/abs/1608.04307v1 |
PDF | http://arxiv.org/pdf/1608.04307v1.pdf |
PWC | https://paperswithcode.com/paper/transitive-hashing-network-for-heterogeneous |
Repo | |
Framework | |
Grounded Lexicon Acquisition - Case Studies in Spatial Language
Title | Grounded Lexicon Acquisition - Case Studies in Spatial Language |
Authors | Michael Spranger |
Abstract | This paper discusses grounded acquisition experiments of increasing complexity. Humanoid robots acquire English spatial lexicons from robot tutors. We identify how various spatial language systems, such as projective, absolute and proximal can be learned. The proposed learning mechanisms do not rely on direct meaning transfer or direct access to world models of interlocutors. Finally, we show how multiple systems can be acquired at the same time. |
Tasks | |
Published | 2016-07-26 |
URL | http://arxiv.org/abs/1607.07630v1 |
PDF | http://arxiv.org/pdf/1607.07630v1.pdf |
PWC | https://paperswithcode.com/paper/grounded-lexicon-acquisition-case-studies-in |
Repo | |
Framework | |
One-Class Slab Support Vector Machine
Title | One-Class Slab Support Vector Machine |
Authors | Victor Fragoso, Walter Scheirer, Joao Hespanha, Matthew Turk |
Abstract | This work introduces the one-class slab SVM (OCSSVM), a one-class classifier that aims at improving the performance of the one-class SVM. The proposed strategy reduces the false positive rate and increases the accuracy of detecting instances from novel classes. To this end, it uses two parallel hyperplanes to learn the normal region of the decision scores of the target class. OCSSVM extends one-class SVM since it can scale and learn non-linear decision functions via kernel methods. The experiments on two publicly available datasets show that OCSSVM can consistently outperform the one-class SVM and perform comparably to or better than other state-of-the-art one-class classifiers. |
Tasks | One-class classifier |
Published | 2016-08-02 |
URL | http://arxiv.org/abs/1608.01026v1 |
PDF | http://arxiv.org/pdf/1608.01026v1.pdf |
PWC | https://paperswithcode.com/paper/one-class-slab-support-vector-machine |
Repo | |
Framework | |
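The two-hyperplane idea in the abstract above can be pictured with a linear decision rule: a point is accepted only if its score falls between a lower and an upper offset, whereas a standard one-class SVM accepts everything above a single offset. The sketch below is an illustration under assumptions: `w`, `rho_low` and `rho_high` are hypothetical, already-fitted parameters, the training problem and the kernelized form from the paper are not shown, and the function names are made up for this example.

```python
import numpy as np

def slab_decision(X, w, rho_low, rho_high):
    """Label rows of X as +1 if rho_low <= w.x <= rho_high, else -1."""
    scores = X @ w
    inside = (scores >= rho_low) & (scores <= rho_high)
    return np.where(inside, 1, -1)

def half_space_decision(X, w, rho):
    """Single-hyperplane rule for comparison: +1 if w.x >= rho, else -1."""
    return np.where(X @ w >= rho, 1, -1)
```

A point with a very large score passes the single-hyperplane rule but is rejected by the slab, which is one way to picture how a bounded region could reduce false positives.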
Generalized Mirror Descents in Congestion Games
Title | Generalized Mirror Descents in Congestion Games |
Authors | Po-An Chen, Chi-Jen Lu |
Abstract | Different types of dynamics have been studied in repeated game play, and one of them which has received much attention recently consists of those based on “no-regret” algorithms from the area of machine learning. It is known that dynamics based on generic no-regret algorithms may not converge to Nash equilibria in general, but to a larger set of outcomes, namely coarse correlated equilibria. Moreover, convergence results based on generic no-regret algorithms typically use a weaker notion of convergence: the convergence of the average plays instead of the actual plays. Some work has been done showing that when using a specific no-regret algorithm, the well-known multiplicative updates algorithm, convergence of actual plays to equilibria can be shown and better quality of outcomes in terms of the price of anarchy can be reached for atomic congestion games and load balancing games. Are there more cases of natural no-regret dynamics that perform well in suitable classes of games in terms of convergence and quality of outcomes that the dynamics converge to? We answer this question positively in the bulletin-board model by showing that when employing the mirror-descent algorithm, a well-known generic no-regret algorithm, the actual plays converge quickly to equilibria in nonatomic congestion games. Furthermore, the bandit model considers a probably more realistic and prevalent setting with only partial information, in which at each time step each player only knows the cost of her own currently played strategy, but not any costs of unplayed strategies. For the class of atomic congestion games, we propose a family of bandit algorithms based on the mirror-descent algorithms previously presented, and show that when each player individually adopts such a bandit algorithm, their joint (mixed) strategy profile quickly converges with implications. |
Tasks | |
Published | 2016-05-25 |
URL | http://arxiv.org/abs/1605.07774v2 |
PDF | http://arxiv.org/pdf/1605.07774v2.pdf |
PWC | https://paperswithcode.com/paper/generalized-mirror-descents-in-congestion |
Repo | |
Framework | |
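To make the convergence claim above concrete, here is a toy simulation of mirror descent with an entropy regularizer (which reduces to a multiplicative-weights update) on a nonatomic congestion game with two parallel links. Everything specific is an assumption chosen for illustration: the linear latencies `slope * flow`, the step size `eta`, the starting split and the function names are hypothetical, and the paper's exact algorithm family, step-size schedule and bandit variants are not reproduced.

```python
import numpy as np

def entropic_md_step(p, costs, eta):
    """One entropic mirror-descent step on the simplex (multiplicative weights)."""
    w = p * np.exp(-eta * costs)
    return w / w.sum()

def simulate_two_links(slopes=(1.0, 2.0), p0=(0.9, 0.1), eta=0.5, steps=200):
    """Route a unit of flow over two links whose latency is slope * flow."""
    slopes = np.asarray(slopes, dtype=float)
    p = np.asarray(p0, dtype=float)
    for _ in range(steps):
        costs = slopes * p            # latency currently observed on each link
        p = entropic_md_step(p, costs, eta)
    return p
```

With slopes 1 and 2 the equilibrium equalizes latencies, so the flow split should approach (2/3, 1/3); in this toy setting the actual iterates, not just their averages, settle there.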
Joint Dimensionality Reduction for Two Feature Vectors
Title | Joint Dimensionality Reduction for Two Feature Vectors |
Authors | Yanjun Li, Yoram Bresler |
Abstract | Many machine learning problems, especially multi-modal learning problems, have two sets of distinct features (e.g., image and text features in news story classification, or neuroimaging data and neurocognitive data in cognitive science research). This paper addresses the joint dimensionality reduction of two feature vectors in supervised learning problems. In particular, we assume a discriminative model where low-dimensional linear embeddings of the two feature vectors are sufficient statistics for predicting a dependent variable. We show that a simple algorithm involving singular value decomposition can accurately estimate the embeddings provided that certain sample complexities are satisfied, without specifying the nonlinear link function (regressor or classifier). The main results establish sample complexities under multiple settings. Sample complexities for different link functions only differ by constant factors. |
Tasks | Dimensionality Reduction |
Published | 2016-02-13 |
URL | http://arxiv.org/abs/1602.04398v3 |
PDF | http://arxiv.org/pdf/1602.04398v3.pdf |
PWC | https://paperswithcode.com/paper/joint-dimensionality-reduction-for-two |
Repo | |
Framework | |
Real-Time Facial Segmentation and Performance Capture from RGB Input
Title | Real-Time Facial Segmentation and Performance Capture from RGB Input |
Authors | Shunsuke Saito, Tianye Li, Hao Li |
Abstract | We introduce the concept of unconstrained real-time 3D facial performance capture through explicit semantic segmentation in the RGB input. To ensure robustness, cutting edge supervised learning approaches rely on large training datasets of face images captured in the wild. While impressive tracking quality has been demonstrated for faces that are largely visible, any occlusion due to hair, accessories, or hand-to-face gestures would result in significant visual artifacts and loss of tracking accuracy. The modeling of occlusions has been mostly avoided due to its immense space of appearance variability. To address this curse of high dimensionality, we perform tracking in unconstrained images assuming non-face regions can be fully masked out. Along with recent breakthroughs in deep learning, we demonstrate that pixel-level facial segmentation is possible in real-time by repurposing convolutional neural networks designed originally for general semantic segmentation. We develop an efficient architecture based on a two-stream deconvolution network with complementary characteristics, and introduce carefully designed training samples and data augmentation strategies for improved segmentation accuracy and robustness. We adopt a state-of-the-art regression-based facial tracking framework with segmented face images as training data, and demonstrate accurate and uninterrupted facial performance capture in the presence of extreme occlusion and even side views. Furthermore, the resulting segmentation can be directly used to composite partial 3D face models on the input images and enable seamless facial manipulation tasks, such as virtual make-up or face replacement. |
Tasks | Data Augmentation, Semantic Segmentation |
Published | 2016-04-10 |
URL | http://arxiv.org/abs/1604.02647v1 |
PDF | http://arxiv.org/pdf/1604.02647v1.pdf |
PWC | https://paperswithcode.com/paper/real-time-facial-segmentation-and-performance |
Repo | |
Framework | |
Urban MV and LV Distribution Grid Topology Estimation via Group Lasso
Title | Urban MV and LV Distribution Grid Topology Estimation via Group Lasso |
Authors | Yizheng Liao, Yang Weng, Guangyi Liu, Ram Rajagopal |
Abstract | The increasing penetration of distributed energy resources poses numerous reliability issues to the urban distribution grid. Topology estimation is a critical step to ensure the robustness of distribution grid operation. However, the bus connectivity and grid topology estimation are usually hard in distribution grids. For example, it is technically challenging and costly to monitor the bus connectivity in urban grids, e.g., underground lines. It is also inappropriate to use the radial topology assumption exclusively because the grids of metropolitan cities and regions with dense loads may contain many mesh structures. To resolve these drawbacks, we propose a data-driven topology estimation method for MV and LV distribution grids by only utilizing the historical smart meter measurements. Particularly, a probabilistic graphical model is utilized to capture the statistical dependencies amongst bus voltages. We prove that the bus connectivity and grid topology estimation problems, in radial and mesh structures, can be formulated as a linear regression with a least absolute shrinkage regularization on grouped variables (group lasso). Simulations show highly accurate results in eight MV and LV distribution networks of different sizes and 22 topology configurations using PG&E residential smart meter data. |
Tasks | |
Published | 2016-11-06 |
URL | http://arxiv.org/abs/1611.01845v2 |
PDF | http://arxiv.org/pdf/1611.01845v2.pdf |
PWC | https://paperswithcode.com/paper/urban-mv-and-lv-distribution-grid-topology |
Repo | |
Framework | |
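The group lasso penalty named in the abstract above adds a term of the form lambda times the sum of the Euclidean norms of coefficient groups, which drives whole groups to zero together. Its proximal operator, block soft-thresholding, shows this effect directly. The sketch below is a generic illustration, not the paper's estimator: the grouping, the value of `lam` and the function name are hypothetical, and how groups map to candidate bus connections is an assumption left to the reader.

```python
import numpy as np

def group_soft_threshold(beta, groups, lam):
    """Proximal operator of lam * sum_g ||beta_g||_2 (block soft-thresholding).

    `groups` is a list of index lists that partition the coefficients.
    """
    beta = np.asarray(beta, dtype=float)
    out = np.zeros_like(beta)
    for idx in groups:
        norm = np.linalg.norm(beta[idx])
        if norm > lam:
            # Shrink the whole group toward zero by a common factor.
            out[idx] = (1.0 - lam / norm) * beta[idx]
        # Groups with norm <= lam stay exactly zero.
    return out
```

For example, with groups `[[0, 1], [2, 3]]` and `lam = 1`, the vector `[3, 4, 0.1, 0.1]` keeps its first group (shrunk by a factor of 0.8) while the second group is zeroed out entirely.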
Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection
Title | Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection |
Authors | Sergey Levine, Peter Pastor, Alex Krizhevsky, Deirdre Quillen |
Abstract | We describe a learning-based approach to hand-eye coordination for robotic grasping from monocular images. To learn hand-eye coordination for grasping, we trained a large convolutional neural network to predict the probability that task-space motion of the gripper will result in successful grasps, using only monocular camera images and independently of camera calibration or the current robot pose. This requires the network to observe the spatial relationship between the gripper and objects in the scene, thus learning hand-eye coordination. We then use this network to servo the gripper in real time to achieve successful grasps. To train our network, we collected over 800,000 grasp attempts over the course of two months, using between 6 and 14 robotic manipulators at any given time, with differences in camera placement and hardware. Our experimental evaluation demonstrates that our method achieves effective real-time control, can successfully grasp novel objects, and corrects mistakes by continuous servoing. |
Tasks | Calibration, Robotic Grasping |
Published | 2016-03-07 |
URL | http://arxiv.org/abs/1603.02199v4 |
PDF | http://arxiv.org/pdf/1603.02199v4.pdf |
PWC | https://paperswithcode.com/paper/learning-hand-eye-coordination-for-robotic |
Repo | |
Framework | |