February 2, 2020

3061 words 15 mins read

Paper Group AWR 55


Probabilistic Residual Learning for Aleatoric Uncertainty in Image Restoration. GolfDB: A Video Database for Golf Swing Sequencing. Street Crossing Aid Using Light-weight CNNs for the Visually Impaired. Batch Policy Learning under Constraints. Green AI. Accurate and interpretable evaluation of surgical skills from kinematic data using fully convolu …

Probabilistic Residual Learning for Aleatoric Uncertainty in Image Restoration

Title Probabilistic Residual Learning for Aleatoric Uncertainty in Image Restoration
Authors Chen Zhang, Bangti Jin
Abstract Aleatoric uncertainty is an intrinsic property of ill-posed inverse and imaging problems. Its quantification is vital for assessing the reliability of relevant point estimates. In this paper, we propose an efficient framework for quantifying aleatoric uncertainty for deep residual learning and showcase its significant potential on image restoration. In the framework, we divide the conditional probability modeling for the residual variable into a deterministic homo-dimensional level, a stochastic low-dimensional level and a merging level. The low-dimensionality is especially suitable for sparse correlation between image pixels, enables efficient sampling for high dimensional problems and acts as a regularizer for the distribution. Preliminary numerical experiments show that the proposed method can give not only state-of-the-art point estimates of image restoration but also useful associated uncertainty information.
Tasks Image Restoration
Published 2019-08-01
URL https://arxiv.org/abs/1908.01010v2
PDF https://arxiv.org/pdf/1908.01010v2.pdf
PWC https://paperswithcode.com/paper/probabilistic-residual-learning-for-aleatoric
Repo https://github.com/chenzxyz/prob_res_learning
Framework tf
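
Below is a minimal PyTorch sketch of the three-level split described in this entry: a deterministic branch at the image's own dimension, a stochastic low-dimensional Gaussian latent, and a merging level that combines the two into a sampled residual. Layer widths, the reparameterized latent, and the broadcast-and-concatenate merge are illustrative assumptions, not the authors' exact architecture (their repository is in TensorFlow).

```python
import torch
import torch.nn as nn

class ProbabilisticResidual(nn.Module):
    """Illustrative split of the residual model into a deterministic
    homo-dimensional branch, a stochastic low-dimensional branch, and a
    merging level (layer sizes are made up for the sketch)."""
    def __init__(self, channels=1, latent_dim=8):
        super().__init__()
        # Deterministic branch: same spatial dimension as the input image.
        self.deterministic = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        # Stochastic branch: predicts a low-dimensional Gaussian latent.
        self.encode_latent = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 2 * latent_dim),  # mean and log-variance
        )
        # Merging level: combine features with a broadcast latent sample.
        self.merge = nn.Conv2d(32 + latent_dim, channels, 3, padding=1)

    def forward(self, degraded):
        feats = self.deterministic(degraded)
        mu, logvar = self.encode_latent(feats).chunk(2, dim=1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        z_map = z[:, :, None, None].expand(-1, -1, *feats.shape[-2:])
        residual = self.merge(torch.cat([feats, z_map], dim=1))
        return degraded + residual  # restored image = input + sampled residual
```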

GolfDB: A Video Database for Golf Swing Sequencing

Title GolfDB: A Video Database for Golf Swing Sequencing
Authors William McNally, Kanav Vats, Tyler Pinto, Chris Dulhanty, John McPhee, Alexander Wong
Abstract The golf swing is a complex movement requiring considerable full-body coordination to execute proficiently. As such, it is the subject of frequent scrutiny and extensive biomechanical analyses. In this paper, we introduce the notion of golf swing sequencing for detecting key events in the golf swing and facilitating golf swing analysis. To enable consistent evaluation of golf swing sequencing performance, we also introduce the benchmark database GolfDB, consisting of 1400 high-quality golf swing videos, each labeled with event frames, bounding box, player name and sex, club type, and view type. Furthermore, to act as a reference baseline for evaluating golf swing sequencing performance on GolfDB, we propose a lightweight deep neural network called SwingNet, which possesses a hybrid deep convolutional and recurrent neural network architecture. SwingNet correctly detects eight golf swing events at an average rate of 76.1%, and six out of eight events at a rate of 91.8%. In line with the proposed baseline SwingNet, we advocate the use of computationally efficient models in future research to promote in-the-field analysis via deployment on readily-available mobile devices.
Tasks
Published 2019-03-15
URL http://arxiv.org/abs/1903.06528v1
PDF http://arxiv.org/pdf/1903.06528v1.pdf
PWC https://paperswithcode.com/paper/golfdb-a-video-database-for-golf-swing
Repo https://github.com/wmcnally/GolfDB
Framework pytorch
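
A minimal sketch of the hybrid convolutional-recurrent design described above: a lightweight CNN encodes each frame, an LSTM models the swing over time, and a per-frame head scores the eight events plus a no-event class. The backbone choice, hidden size, and extra no-event class are assumptions for illustration, not the authors' exact configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

class SwingNetSketch(nn.Module):
    """Hybrid CNN + RNN baseline in the spirit of SwingNet: a lightweight
    CNN encodes each frame, an LSTM models the temporal sequence, and a
    per-frame classifier scores the swing events (plus a no-event class)."""
    def __init__(self, num_events=8, hidden=256):
        super().__init__()
        self.cnn = models.mobilenet_v2().features   # randomly initialized backbone
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.rnn = nn.LSTM(1280, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, num_events + 1)

    def forward(self, frames):              # frames: (B, T, 3, H, W)
        b, t = frames.shape[:2]
        x = self.cnn(frames.flatten(0, 1))  # per-frame features
        x = self.pool(x).flatten(1).view(b, t, -1)
        x, _ = self.rnn(x)
        return self.head(x)                 # (B, T, num_events + 1) frame logits

# Usage: logits = SwingNetSketch()(torch.randn(2, 16, 3, 160, 160))
```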

Street Crossing Aid Using Light-weight CNNs for the Visually Impaired

Title Street Crossing Aid Using Light-weight CNNs for the Visually Impaired
Authors Samuel Yu, Heon Lee, Jung Hoon Kim
Abstract In this paper, we address an issue that the visually impaired commonly face while crossing intersections and propose a solution that takes the form of a mobile application. The application utilizes a deep learning convolutional neural network model, LytNetV2, to output necessary information that the visually impaired may lack when without human companions or guide dogs. A prototype of the application runs on iOS devices of version 11 or above. It is designed for comprehensiveness, concision, accuracy, and computational efficiency through delivering the two most important pieces of information, pedestrian traffic light color and direction, required to cross the road in real time. Furthermore, it is specifically aimed to support those facing financial burden, as the solution takes the form of a free mobile application. Through the modification and utilization of key principles in MobileNetV3 such as depthwise separable convolutions and squeeze-excite layers, the deep neural network model achieves a classification accuracy of 96% and an average angle error of 6.15 degrees, while running at a frame rate of 16.34 frames per second. Additionally, the model is trained as an image classifier, allowing for a faster and more accurate model. The network is able to outperform other methods such as object detection and non-deep-learning algorithms in both accuracy and thoroughness. The information is delivered through both auditory signals and vibrations, and it has been tested on seven visually impaired users and has received above-satisfactory responses.
Tasks Object Detection
Published 2019-09-14
URL https://arxiv.org/abs/1909.09598v1
PDF https://arxiv.org/pdf/1909.09598v1.pdf
PWC https://paperswithcode.com/paper/street-crossing-aid-using-light-weight-cnns
Repo https://github.com/samuelyu2002/pedestrian-traffic-lights
Framework pytorch
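
The two MobileNetV3 ingredients named in the abstract, depthwise separable convolutions and squeeze-excite layers, can be sketched as follows; channel sizes, the reduction ratio, and the activation are illustrative choices rather than the LytNetV2 specification.

```python
import torch
import torch.nn as nn

class SqueezeExcite(nn.Module):
    """Channel re-weighting as in squeeze-and-excitation."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))       # squeeze: global average pool
        return x * w[:, :, None, None]        # excite: per-channel scaling

class DepthwiseSeparableSE(nn.Module):
    """Depthwise conv + pointwise conv + squeeze-excite, the building block
    the abstract borrows from MobileNetV3 (sizes here are illustrative)."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride, 1, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)
        self.se = SqueezeExcite(out_ch)
        self.act = nn.Hardswish()

    def forward(self, x):
        return self.se(self.act(self.pointwise(self.depthwise(x))))

# Usage: y = DepthwiseSeparableSE(16, 32, stride=2)(torch.randn(1, 16, 64, 64))
```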

Batch Policy Learning under Constraints

Title Batch Policy Learning under Constraints
Authors Hoang M. Le, Cameron Voloshin, Yisong Yue
Abstract When learning policies for real-world domains, two important questions arise: (i) how to efficiently use pre-collected off-policy, non-optimal behavior data; and (ii) how to mediate among different competing objectives and constraints. We thus study the problem of batch policy learning under multiple constraints, and offer a systematic solution. We first propose a flexible meta-algorithm that admits any batch reinforcement learning and online learning procedure as subroutines. We then present a specific algorithmic instantiation and provide performance guarantees for the main objective and all constraints. To certify constraint satisfaction, we propose a new and simple method for off-policy policy evaluation (OPE) and derive PAC-style bounds. Our algorithm achieves strong empirical results in different domains, including in a challenging problem of simulated car driving subject to multiple constraints such as lane keeping and smooth driving. We also show experimentally that our OPE method outperforms other popular OPE techniques on a standalone basis, especially in a high-dimensional setting.
Tasks
Published 2019-03-20
URL http://arxiv.org/abs/1903.08738v1
PDF http://arxiv.org/pdf/1903.08738v1.pdf
PWC https://paperswithcode.com/paper/batch-policy-learning-under-constraints
Repo https://github.com/clvoloshin/constrained_batch_policy_learning
Framework none
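
A hedged sketch of the meta-algorithm's structure as described in the abstract: the policy player best-responds to the current Lagrange multipliers using any batch RL solver, while the multiplier player takes online (projected gradient) steps on the off-policy constraint estimates. The subroutines and the step rule below are placeholders, not the paper's exact instantiation or guarantees.

```python
import numpy as np

def constrained_batch_policy_learning(
    best_response,       # callable: lambda_vec -> policy trained by any batch RL solver
    evaluate,            # callable: policy -> (objective_estimate, constraint_estimates) via OPE
    constraint_limits,   # array of thresholds, one per constraint
    iterations=50, lr=0.1,
):
    """Illustrative Lagrangian meta-loop: alternate between a best-response
    policy for the current multipliers and an online update of the multipliers
    driven by estimated constraint violation."""
    lam = np.zeros(len(constraint_limits))
    mixture = []                                # collect iterates for a mixed policy
    for _ in range(iterations):
        policy = best_response(lam)             # batch RL on reward minus lam-weighted costs
        _, constraints = evaluate(policy)       # off-policy estimates of the constraints
        violation = np.asarray(constraints) - np.asarray(constraint_limits)
        lam = np.maximum(lam + lr * violation, 0.0)   # raise multipliers where violated
        mixture.append(policy)
    return mixture, lam
```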

Green AI

Title Green AI
Authors Roy Schwartz, Jesse Dodge, Noah A. Smith, Oren Etzioni
Abstract The computations required for deep learning research have been doubling every few months, resulting in an estimated 300,000x increase from 2012 to 2018 [2]. These computations have a surprisingly large carbon footprint [38]. Ironically, deep learning was inspired by the human brain, which is remarkably energy efficient. Moreover, the financial cost of the computations can make it difficult for academics, students, and researchers, in particular those from emerging economies, to engage in deep learning research. This position paper advocates a practical solution by making efficiency an evaluation criterion for research alongside accuracy and related measures. In addition, we propose reporting the financial cost or “price tag” of developing, training, and running models to provide baselines for the investigation of increasingly efficient methods. Our goal is to make AI both greener and more inclusive—enabling any inspired undergraduate with a laptop to write high-quality research papers. Green AI is an emerging focus at the Allen Institute for AI.
Tasks
Published 2019-07-22
URL https://arxiv.org/abs/1907.10597v3
PDF https://arxiv.org/pdf/1907.10597v3.pdf
PWC https://paperswithcode.com/paper/green-ai
Repo https://github.com/sagabanana/-60daysofudacity
Framework pytorch

Accurate and interpretable evaluation of surgical skills from kinematic data using fully convolutional neural networks

Title Accurate and interpretable evaluation of surgical skills from kinematic data using fully convolutional neural networks
Authors Hassan Ismail Fawaz, Germain Forestier, Jonathan Weber, Lhassane Idoumghar, Pierre-Alain Muller
Abstract Purpose: Manual feedback from senior surgeons observing less experienced trainees is a laborious task that is very expensive, time-consuming and prone to subjectivity. With the number of surgical procedures increasing annually, there is an unprecedented need to provide an accurate, objective and automatic evaluation of trainees’ surgical skills in order to improve surgical practice. Methods: In this paper, we designed a convolutional neural network (CNN) to classify surgical skills by extracting latent patterns in the trainees’ motions performed during robotic surgery. The method is validated on the JIGSAWS dataset for two surgical skills evaluation tasks: classification and regression. Results: Our results show that deep neural networks constitute robust machine learning models that are able to reach new competitive state-of-the-art performance on the JIGSAWS dataset. While leveraging CNNs’ efficiency, we were able to minimize their black-box effect using the class activation map technique. Conclusions: This characteristic allowed our method to automatically pinpoint which parts of the surgery influenced the skill evaluation the most, thus allowing us to explain a surgical skill classification and provide surgeons with a novel personalized feedback technique. We believe this type of interpretable machine learning model could integrate within “Operation Room 2.0” and support novice surgeons in improving their skills to eventually become experts.
Tasks Interpretable Machine Learning, Surgical Skills Evaluation
Published 2019-08-20
URL https://arxiv.org/abs/1908.07319v1
PDF https://arxiv.org/pdf/1908.07319v1.pdf
PWC https://paperswithcode.com/paper/accurate-and-interpretable-evaluation-of
Repo https://github.com/hfawaz/ijcars19
Framework none
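
A minimal sketch of the ingredients the abstract relies on: a 1D fully convolutional network over the kinematic channels with a global-average-pooling head, from which a class activation map localizes which parts of the trial drove the predicted skill class. Channel counts and the three-class output are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class FCNWithCAM(nn.Module):
    """1D fully convolutional classifier over kinematic time series with a
    global-average-pooling head, so a class activation map (CAM) can localize
    which parts of the trial drove the predicted skill level."""
    def __init__(self, in_channels=76, num_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_channels, 64, 8, padding=4), nn.ReLU(),
            nn.Conv1d(64, 128, 5, padding=2), nn.ReLU(),
            nn.Conv1d(128, 128, 3, padding=1), nn.ReLU(),
        )
        self.classifier = nn.Linear(128, num_classes)

    def forward(self, x):                       # x: (B, channels, time)
        feats = self.features(x)                # (B, 128, time')
        logits = self.classifier(feats.mean(dim=2))
        # CAM: project per-timestep features through the class weights.
        cam = torch.einsum("bct,kc->bkt", feats, self.classifier.weight)
        return logits, cam                      # cam[b, k, t]: evidence for class k at time t
```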

Who Blames Whom in a Crisis? Detecting Blame Ties from News Articles Using Neural Networks

Title Who Blames Whom in a Crisis? Detecting Blame Ties from News Articles Using Neural Networks
Authors Shuailong Liang, Olivia Nicol, Yue Zhang
Abstract Blame games tend to follow major disruptions, be they financial crises, natural disasters or terrorist attacks. To study how the blame game evolves and shapes the dominant crisis narratives is of great significance, as sense-making processes can affect regulatory outcomes, social hierarchies, and cultural norms. However, it takes tremendous time and effort for social scientists to manually examine each relevant news article and extract the blame ties (A blames B). In this study, we define a new task, Blame Tie Extraction, and construct a new dataset related to the United States financial crisis (2007-2010) from The New York Times, The Wall Street Journal and USA Today. We build a Bi-directional Long Short-Term Memory (BiLSTM) network over the contexts in which the entities appear, and it learns to automatically extract such blame ties at the document level. Leveraging large unsupervised models such as GloVe and ELMo, our best model achieves an F1 score of 70% on the test set for blame tie extraction, making it a useful tool for social scientists to extract blame ties more efficiently.
Tasks
Published 2019-04-24
URL http://arxiv.org/abs/1904.10637v1
PDF http://arxiv.org/pdf/1904.10637v1.pdf
PWC https://paperswithcode.com/paper/who-blames-whom-in-a-crisis-detecting-blame
Repo https://github.com/Shuailong/BlamePipeline
Framework pytorch
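
A hedged sketch of the kind of BiLSTM context classifier the abstract describes: encode the context in which a (source, target) entity pair appears and predict whether a blame tie holds. The embedding source, dimensions, and pooling scheme are assumptions made for illustration, not the authors' model.

```python
import torch
import torch.nn as nn

class BlameTieClassifier(nn.Module):
    """Sketch of a BiLSTM context encoder for blame-tie extraction: encode the
    sentences in which a (source, target) entity pair appears and classify
    whether the source blames the target."""
    def __init__(self, vocab_size, embed_dim=300, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)   # e.g. initialized from GloVe
        self.encoder = nn.LSTM(embed_dim, hidden, batch_first=True, bidirectional=True)
        self.scorer = nn.Linear(2 * hidden, 2)             # blame vs. no-blame

    def forward(self, context_tokens):                     # (B, seq_len) token ids
        states, _ = self.encoder(self.embed(context_tokens))
        return self.scorer(states.max(dim=1).values)       # max-pool over the context
```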

Image Difficulty Curriculum for Generative Adversarial Networks (CuGAN)

Title Image Difficulty Curriculum for Generative Adversarial Networks (CuGAN)
Authors Petru Soviany, Claudiu Ardei, Radu Tudor Ionescu, Marius Leordeanu
Abstract Despite the significant advances in recent years, Generative Adversarial Networks (GANs) are still notoriously hard to train. In this paper, we propose three novel curriculum learning strategies for training GANs. All strategies are first based on ranking the training images by their difficulty scores, which are estimated by a state-of-the-art image difficulty predictor. Our first strategy is to divide images into gradually more difficult batches. Our second strategy introduces a novel curriculum loss function for the discriminator that takes into account the difficulty scores of the real images. Our third strategy is based on sampling from an evolving distribution, which favors the easier images during the initial training stages and gradually converges to a uniform distribution, in which samples are equally likely, regardless of difficulty. We compare our curriculum learning strategies with the classic training procedure on two tasks: image generation and image translation. Our experiments indicate that all strategies provide faster convergence and superior results. For example, our best curriculum learning strategy applied on spectrally normalized GANs (SNGANs) fooled human annotators into thinking that generated CIFAR-like images are real in 25.0% of the presented cases, while the SNGANs trained using the classic procedure fooled the annotators in only 18.4% of cases. Similarly, in image translation, the human annotators preferred the images produced by the Cycle-consistent GAN (CycleGAN) trained using curriculum learning in 40.5% of cases and those produced by CycleGAN based on classic training in only 19.8% of cases, with 39.7% of cases being labeled as ties.
Tasks Image Generation
Published 2019-10-20
URL https://arxiv.org/abs/1910.08967v2
PDF https://arxiv.org/pdf/1910.08967v2.pdf
PWC https://paperswithcode.com/paper/image-difficulty-curriculum-for-generative
Repo https://github.com/pittyacg/CurriculumSNGAN
Framework none
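
The third strategy, sampling from an evolving distribution that starts easy-biased and converges to uniform, can be illustrated with a small weighting function; the exact schedule below is an assumption, not the authors' formula.

```python
import numpy as np

def curriculum_sampling_weights(difficulty, progress, sharpness=5.0):
    """Early in training (progress ~ 0) easy images are sampled more often;
    as progress approaches 1 the distribution converges to uniform."""
    difficulty = np.asarray(difficulty, dtype=float)
    # Rank-normalize difficulty scores to [0, 1].
    ranks = difficulty.argsort().argsort() / max(len(difficulty) - 1, 1)
    # Penalize hard images by a factor that decays to zero with progress.
    logits = -sharpness * (1.0 - progress) * ranks
    weights = np.exp(logits)
    return weights / weights.sum()

# Usage: p = curriculum_sampling_weights(scores, progress=epoch / num_epochs)
# batch_idx = np.random.choice(len(scores), size=64, p=p)
```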

Interpretable Generative Neural Spatio-Temporal Point Processes

Title Interpretable Generative Neural Spatio-Temporal Point Processes
Authors Shixiang Zhu, Shuang Li, Yao Xie
Abstract We present a novel generative model for spatio-temporal correlated discrete event data. Despite the rapid development of one-dimensional point processes for temporal event data, the study of how to model spatial aspects of such discrete event data is scarce. Our proposed Neural Embedding Spatio-Temporal (NEST) point process is a probabilistic generative model, which captures complex spatial influence by carefully combining statistical models with flexible neural networks and spatial information embedding. NEST also enjoys low computational complexity, high interpretability, and strong expressive capacity for complex spatio-temporal dependency. We present two computationally efficient approaches based on maximum likelihood and imitation learning, the latter being robust to model mismatch. Experiments based on real data show the superior performance of our method relative to the state-of-the-art.
Tasks Imitation Learning, Point Processes
Published 2019-06-13
URL https://arxiv.org/abs/1906.05467v2
PDF https://arxiv.org/pdf/1906.05467v2.pdf
PWC https://paperswithcode.com/paper/reinforcement-learning-of-spatio-temporal
Repo https://github.com/meowoodie/Spatio-Temporal-Point-Process-with-Gaussian-Mixture-Diffusion-Kernel
Framework tf

COBRA: Data-Efficient Model-Based RL through Unsupervised Object Discovery and Curiosity-Driven Exploration

Title COBRA: Data-Efficient Model-Based RL through Unsupervised Object Discovery and Curiosity-Driven Exploration
Authors Nicholas Watters, Loic Matthey, Matko Bosnjak, Christopher P. Burgess, Alexander Lerchner
Abstract Data efficiency and robustness to task-irrelevant perturbations are long-standing challenges for deep reinforcement learning algorithms. Here we introduce a modular approach to addressing these challenges in a continuous control environment, without using hand-crafted or supervised information. Our Curious Object-Based seaRch Agent (COBRA) uses task-free intrinsically motivated exploration and unsupervised learning to build object-based models of its environment and action space. Subsequently, it can learn a variety of tasks through model-based search in very few steps and excel on structured hold-out tests of policy robustness.
Tasks Continuous Control
Published 2019-05-22
URL https://arxiv.org/abs/1905.09275v2
PDF https://arxiv.org/pdf/1905.09275v2.pdf
PWC https://paperswithcode.com/paper/cobra-data-efficient-model-based-rl-through
Repo https://github.com/deepmind/spriteworld
Framework none

VINE: Visualizing Statistical Interactions in Black Box Models

Title VINE: Visualizing Statistical Interactions in Black Box Models
Authors Matthew Britton
Abstract As machine learning becomes more pervasive, there is an urgent need for interpretable explanations of predictive models. Prior work has developed effective methods for visualizing global model behavior, as well as generating local (instance-specific) explanations. However, relatively little work has addressed regional explanations - how groups of similar instances behave in a complex model, and the related issue of visualizing statistical feature interactions. The lack of utilities available for these analytical needs hinders the development of models that are mission-critical, transparent, and align with social goals. We present VINE (Visual INteraction Effects), a novel algorithm to extract and visualize statistical interaction effects in black box models. We also present a novel evaluation metric for visualizations in the interpretable ML space.
Tasks
Published 2019-04-01
URL http://arxiv.org/abs/1904.00561v1
PDF http://arxiv.org/pdf/1904.00561v1.pdf
PWC https://paperswithcode.com/paper/vine-visualizing-statistical-interactions-in
Repo https://github.com/MattJBritton/VINE
Framework none

Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras

Title Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras
Authors Ariel Gordon, Hanhan Li, Rico Jonschkowski, Anelia Angelova
Abstract We present a novel method for simultaneous learning of depth, egomotion, object motion, and camera intrinsics from monocular videos, using only consistency across neighboring video frames as supervision signal. Similarly to prior work, our method learns by applying differentiable warping to frames and comparing the result to adjacent ones, but it provides several improvements: We address occlusions geometrically and differentiably, directly using the depth maps as predicted during training. We introduce randomized layer normalization, a novel powerful regularizer, and we account for object motion relative to the scene. To the best of our knowledge, our work is the first to learn the camera intrinsic parameters, including lens distortion, from video in an unsupervised manner, thereby allowing us to extract accurate depth and motion from arbitrary videos of unknown origin at scale. We evaluate our results on the Cityscapes, KITTI and EuRoC datasets, establishing new state of the art on depth prediction and odometry, and demonstrate qualitatively that depth prediction can be learned from a collection of YouTube videos.
Tasks Depth Estimation
Published 2019-04-10
URL http://arxiv.org/abs/1904.04998v1
PDF http://arxiv.org/pdf/1904.04998v1.pdf
PWC https://paperswithcode.com/paper/depth-from-videos-in-the-wild-unsupervised
Repo https://github.com/robot-love/depth_from_video_in_the_wild
Framework tf
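
One way to picture the headline idea, learning the camera intrinsics from the video itself, is to treat the pinhole parameters as trainable variables driven by the same photometric-consistency loss as depth and egomotion. The paper actually predicts per-video intrinsics with a network and also models lens distortion, so this is only a toy sketch under those simplifying assumptions.

```python
import torch
import torch.nn as nn

class LearnedIntrinsics(nn.Module):
    """Toy version of learning camera intrinsics from video: focal lengths and
    the principal point are free parameters, updated by gradients of the
    photometric-consistency loss that also trains depth and egomotion."""
    def __init__(self, width, height):
        super().__init__()
        self.focal = nn.Parameter(torch.tensor([float(width), float(width)]))
        self.center = nn.Parameter(torch.tensor([width / 2.0, height / 2.0]))

    def matrix(self):
        fx, fy = self.focal
        cx, cy = self.center
        zero, one = torch.zeros(()), torch.ones(())
        return torch.stack([
            torch.stack([fx, zero, cx]),
            torch.stack([zero, fy, cy]),
            torch.stack([zero, zero, one]),
        ])                                     # 3x3 intrinsic matrix K

# K is used to project depth into the neighboring frame; gradients from the
# photometric loss flow back into self.focal and self.center.
```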

Formal derivation of Mesh Neural Networks with their Forward-Only gradient Propagation

Title Formal derivation of Mesh Neural Networks with their Forward-Only gradient Propagation
Authors Federico A. Galatolo, Mario G. C. A. Cimino, Gigliola Vaglini
Abstract This paper proposes the Mesh Neural Network (MNN), a novel architecture which allows neurons to be connected in any topology, to efficiently route information. In MNNs, information is propagated between neurons through a state transition function. State and error gradients are then directly computed from state updates without backward computation. The MNN architecture and the error propagation schema are formalized and derived in tensor algebra. The proposed computational model can fully supply a gradient descent process, and is suitable for very large-scale NNs, due to its expressivity and training efficiency, with respect to NNs based on back-propagation and computational graphs.
Tasks
Published 2019-05-16
URL https://arxiv.org/abs/1905.06684v3
PDF https://arxiv.org/pdf/1905.06684v3.pdf
PWC https://paperswithcode.com/paper/formal-derivation-of-mesh-neural-networks
Repo https://github.com/galatolofederico/mesh-neural-networks
Framework none
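
The state-transition view of an MNN can be illustrated with a tiny numerical example: neuron states are updated through a weight matrix that encodes an arbitrary topology, with input neurons clamped to external values. The paper's forward-only gradient rule is not reproduced here; only the state update is shown.

```python
import numpy as np

def mnn_step(state, weights, inputs, input_idx, activation=np.tanh):
    """One state-transition step of a mesh-like network: every neuron's next
    state depends on the neurons it is connected to, while input neurons stay
    clamped to external values."""
    next_state = activation(weights @ state)    # arbitrary topology via the weight matrix
    next_state[input_idx] = inputs              # keep input neurons clamped
    return next_state

# Usage: a 5-neuron mesh, neurons 0-1 driven by the input.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.5, size=(5, 5))          # zero an entry to cut an edge
s = np.zeros(5)
for _ in range(3):                              # propagate information through the mesh
    s = mnn_step(s, w, inputs=np.array([1.0, -1.0]), input_idx=[0, 1])
```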

Probabilistic Formulation of the Take The Best Heuristic

Title Probabilistic Formulation of the Take The Best Heuristic
Authors Tomi Peltola, Jussi Jokinen, Samuel Kaski
Abstract The framework of cognitively bounded rationality treats problem solving as fundamentally rational, but emphasises that it is constrained by cognitive architecture and the task environment. This paper investigates a simple decision making heuristic, Take The Best (TTB), within that framework. We formulate TTB as a likelihood-based probabilistic model, where the decision strategy arises by probabilistic inference based on the training data and the model constraints. The strengths of the probabilistic formulation, in addition to providing a bounded rational account of the learning of the heuristic, include natural extensibility with additional cognitively plausible constraints and prior information, and the possibility to embed the heuristic as a subpart of a larger probabilistic model. We extend the model to learn cue discrimination thresholds for continuous-valued cues and experiment with using the model to account for biased preference feedback from a boundedly rational agent in a simulated interactive machine learning task.
Tasks Decision Making
Published 2019-11-01
URL https://arxiv.org/abs/1911.00572v1
PDF https://arxiv.org/pdf/1911.00572v1.pdf
PWC https://paperswithcode.com/paper/probabilistic-formulation-of-the-take-the
Repo https://github.com/to-mi/pttb
Framework none
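
For reference, the classic Take The Best rule that the paper recasts probabilistically looks like this: inspect cues in descending order of validity and decide on the first one that discriminates between the two options.

```python
def take_the_best(cue_values_a, cue_values_b, cue_validities):
    """Classic Take The Best decision rule: check cues in order of validity
    and decide on the first cue that discriminates; guess if none does."""
    order = sorted(range(len(cue_validities)),
                   key=lambda i: cue_validities[i], reverse=True)
    for i in order:
        if cue_values_a[i] != cue_values_b[i]:
            return "A" if cue_values_a[i] > cue_values_b[i] else "B"
    return "guess"

# Example: which of two cities is larger, given three binary cues
# (has an airport, is a capital, has a university) with decreasing validity.
print(take_the_best([1, 0, 1], [1, 1, 1], [0.9, 0.8, 0.7]))  # -> "B"
```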

Object-Contextual Representations for Semantic Segmentation

Title Object-Contextual Representations for Semantic Segmentation
Authors Yuhui Yuan, Xilin Chen, Jingdong Wang
Abstract In this paper, we address the semantic segmentation problem with a focus on the context aggregation strategy. Our motivation is that the label of a pixel is the category of the object that the pixel belongs to. We present a simple yet effective approach, object-contextual representations, characterizing a pixel by exploiting the representation of the corresponding object class. First, we learn object regions under the supervision of the ground-truth segmentation. Second, we compute the object region representation by aggregating the representations of the pixels lying in the object region. Last, we compute the relation between each pixel and each object region, and augment the representation of each pixel with the object-contextual representation, which is a weighted aggregation of all the object region representations according to their relations with the pixel. We empirically demonstrate that the proposed approach achieves competitive performance on various challenging semantic segmentation benchmarks: Cityscapes, ADE20K, LIP, PASCAL-Context, and COCO-Stuff.
Tasks Semantic Segmentation
Published 2019-09-24
URL https://arxiv.org/abs/1909.11065v2
PDF https://arxiv.org/pdf/1909.11065v2.pdf
PWC https://paperswithcode.com/paper/object-contextual-representations-for
Repo https://github.com/PkuRainBow/OCNet
Framework pytorch
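
The three-step aggregation in the abstract (soft object regions, region representations, pixel-region relations used to augment each pixel) can be sketched compactly; the learned transformation and relation heads of the actual OCR module are omitted here, so this is an illustration of the aggregation only.

```python
import torch

def object_contextual_representations(pixel_feats, region_logits):
    """Sketch of the aggregation: (1) soft object regions, (2) region
    representations as weighted sums of pixel features, (3) pixel-region
    relations used to mix region representations back into each pixel."""
    b, c, h, w = pixel_feats.shape
    pixels = pixel_feats.flatten(2)                        # (B, C, HW)
    regions = region_logits.flatten(2).softmax(dim=2)      # (B, K, HW) soft regions
    region_repr = torch.einsum("bkn,bcn->bkc", regions, pixels)      # (B, K, C)
    relation = torch.einsum("bcn,bkc->bkn", pixels, region_repr)     # pixel-region affinity
    relation = relation.softmax(dim=1)
    context = torch.einsum("bkn,bkc->bcn", relation, region_repr)    # (B, C, HW)
    return torch.cat([pixel_feats, context.view(b, c, h, w)], dim=1)  # augmented features

# Usage: out = object_contextual_representations(torch.randn(2, 64, 32, 32),
#                                                torch.randn(2, 19, 32, 32))
```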