Paper Group AWR 55
Probabilistic Residual Learning for Aleatoric Uncertainty in Image Restoration. GolfDB: A Video Database for Golf Swing Sequencing. Street Crossing Aid Using Light-weight CNNs for the Visually Impaired. Batch Policy Learning under Constraints. Green AI. Accurate and interpretable evaluation of surgical skills from kinematic data using fully convolutional neural networks. Who Blames Whom in a Crisis? Detecting Blame Ties from News Articles Using Neural Networks. Image Difficulty Curriculum for Generative Adversarial Networks (CuGAN). Interpretable Generative Neural Spatio-Temporal Point Processes. COBRA: Data-Efficient Model-Based RL through Unsupervised Object Discovery and Curiosity-Driven Exploration. VINE: Visualizing Statistical Interactions in Black Box Models. Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras. Formal derivation of Mesh Neural Networks with their Forward-Only gradient Propagation. Probabilistic Formulation of the Take The Best Heuristic. Object-Contextual Representations for Semantic Segmentation.
Probabilistic Residual Learning for Aleatoric Uncertainty in Image Restoration
Title | Probabilistic Residual Learning for Aleatoric Uncertainty in Image Restoration |
Authors | Chen Zhang, Bangti Jin |
Abstract | Aleatoric uncertainty is an intrinsic property of ill-posed inverse and imaging problems. Its quantification is vital for assessing the reliability of relevant point estimates. In this paper, we propose an efficient framework for quantifying aleatoric uncertainty for deep residual learning and showcase its significant potential on image restoration. In the framework, we divide the conditional probability modeling for the residual variable into a deterministic homo-dimensional level, a stochastic low-dimensional level and a merging level. The low dimensionality is especially suitable for the sparse correlation between image pixels, enables efficient sampling for high-dimensional problems, and acts as a regularizer for the distribution. Preliminary numerical experiments show that the proposed method can give not only state-of-the-art point estimates of image restoration but also useful associated uncertainty information. |
Tasks | Image Restoration |
Published | 2019-08-01 |
URL | https://arxiv.org/abs/1908.01010v2 |
https://arxiv.org/pdf/1908.01010v2.pdf | |
PWC | https://paperswithcode.com/paper/probabilistic-residual-learning-for-aleatoric |
Repo | https://github.com/chenzxyz/prob_res_learning |
Framework | tf |
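To make the three-level split concrete, here is a minimal PyTorch sketch (not the authors' TensorFlow implementation linked above): a deterministic full-resolution branch, a low-dimensional Gaussian latent, and a merging convolution. All layer sizes, the Gaussian form of the latent, and the sampling loop are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ProbResidualSketch(nn.Module):
    """Toy three-level residual model: deterministic + low-dim stochastic + merge."""
    def __init__(self, channels=1, latent_dim=8):
        super().__init__()
        # Deterministic, homo-dimensional level: residual features at full resolution.
        self.deterministic = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        # Stochastic, low-dimensional level: amortized Gaussian over a small latent code.
        self.encoder = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 2 * latent_dim)
        )
        # Merging level: combine deterministic features with the broadcast latent sample.
        self.merge = nn.Conv2d(32 + latent_dim, channels, 3, padding=1)

    def forward(self, x, n_samples=1):
        h = self.deterministic(x)
        mu, logvar = self.encoder(h).chunk(2, dim=1)
        restorations = []
        for _ in range(n_samples):
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()     # reparameterized sample
            zmap = z[:, :, None, None].expand(-1, -1, h.shape[2], h.shape[3])
            restorations.append(x + self.merge(torch.cat([h, zmap], dim=1)))  # input + residual
        return torch.stack(restorations, dim=0)

# Example: draw 10 restoration samples; their pixel-wise std is a simple uncertainty map.
samples = ProbResidualSketch()(torch.randn(2, 1, 32, 32), n_samples=10)
uncertainty = samples.std(dim=0)
```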
GolfDB: A Video Database for Golf Swing Sequencing
Title | GolfDB: A Video Database for Golf Swing Sequencing |
Authors | William McNally, Kanav Vats, Tyler Pinto, Chris Dulhanty, John McPhee, Alexander Wong |
Abstract | The golf swing is a complex movement requiring considerable full-body coordination to execute proficiently. As such, it is the subject of frequent scrutiny and extensive biomechanical analyses. In this paper, we introduce the notion of golf swing sequencing for detecting key events in the golf swing and facilitating golf swing analysis. To enable consistent evaluation of golf swing sequencing performance, we also introduce the benchmark database GolfDB, consisting of 1400 high-quality golf swing videos, each labeled with event frames, bounding box, player name and sex, club type, and view type. Furthermore, to act as a reference baseline for evaluating golf swing sequencing performance on GolfDB, we propose a lightweight deep neural network called SwingNet, which possesses a hybrid deep convolutional and recurrent neural network architecture. SwingNet correctly detects eight golf swing events at an average rate of 76.1%, and six out of eight events at a rate of 91.8%. In line with the proposed baseline SwingNet, we advocate the use of computationally efficient models in future research to promote in-the-field analysis via deployment on readily-available mobile devices. |
Tasks | |
Published | 2019-03-15 |
URL | http://arxiv.org/abs/1903.06528v1 |
http://arxiv.org/pdf/1903.06528v1.pdf | |
PWC | https://paperswithcode.com/paper/golfdb-a-video-database-for-golf-swing |
Repo | https://github.com/wmcnally/GolfDB |
Framework | pytorch |
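SwingNet itself pairs a MobileNetV2 backbone with a bidirectional LSTM (see the repo above). Below is a much smaller, hedged sketch of that CNN-plus-RNN pattern for per-frame event classification; the layer sizes and the nine-class output (eight swing events plus a no-event class) are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class SwingNetSketch(nn.Module):
    """Toy CNN + bidirectional LSTM for per-frame golf-swing event classification."""
    def __init__(self, n_classes=9, hidden=64):
        super().__init__()
        # Lightweight per-frame feature extractor (the paper uses a MobileNetV2 backbone).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.rnn = nn.LSTM(32, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, frames):                    # frames: (B, T, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1))    # (B*T, 32) per-frame features
        out, _ = self.rnn(feats.view(b, t, -1))   # temporal context across the swing
        return self.head(out)                     # per-frame event logits

logits = SwingNetSketch()(torch.randn(1, 16, 3, 64, 64))   # (1, 16, 9)
```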
Street Crossing Aid Using Light-weight CNNs for the Visually Impaired
Title | Street Crossing Aid Using Light-weight CNNs for the Visually Impaired |
Authors | Samuel Yu, Heon Lee, Jung Hoon Kim |
Abstract | In this paper, we address an issue that the visually impaired commonly face while crossing intersections and propose a solution that takes the form of a mobile application. The application utilizes a deep learning convolutional neural network model, LytNetV2, to output necessary information that the visually impaired may lack when without human companions or guide dogs. A prototype of the application runs on iOS devices running version 11 or above. It is designed for comprehensiveness, concision, accuracy, and computational efficiency, delivering in real time the two most important pieces of information required to cross the road: pedestrian traffic light color and crossing direction. Furthermore, it is specifically aimed at supporting those facing financial burdens, as the solution is distributed as a free mobile application. Through the modification and utilization of key principles in MobileNetV3 such as depthwise separable convolutions and squeeze-excite layers, the deep neural network model achieves a classification accuracy of 96% and an average angle error of 6.15 degrees, while running at a frame rate of 16.34 frames per second. Additionally, the model is trained as an image classifier, allowing for a faster and more accurate model. The network is able to outperform other methods such as object detection and non-deep learning algorithms in both accuracy and thoroughness. The information is delivered through both auditory signals and vibrations, and the system has been tested with seven visually impaired users and received above-satisfactory responses. |
Tasks | Object Detection |
Published | 2019-09-14 |
URL | https://arxiv.org/abs/1909.09598v1 |
https://arxiv.org/pdf/1909.09598v1.pdf | |
PWC | https://paperswithcode.com/paper/street-crossing-aid-using-light-weight-cnns |
Repo | https://github.com/samuelyu2002/pedestrian-traffic-lights |
Framework | pytorch |
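A hedged sketch of the two MobileNetV3 ingredients named in the abstract, depthwise separable convolutions and squeeze-excite layers, wired into a small two-head network (traffic-light class plus crossing direction). Channel sizes, the five-class light head, and the four-value direction head are illustrative assumptions, not LytNetV2's actual configuration.

```python
import torch
import torch.nn as nn

class SepConvSE(nn.Module):
    """Depthwise separable convolution followed by a squeeze-and-excite gate."""
    def __init__(self, c_in, c_out, reduction=4):
        super().__init__()
        self.depthwise = nn.Conv2d(c_in, c_in, 3, padding=1, groups=c_in)  # per-channel spatial filter
        self.pointwise = nn.Conv2d(c_in, c_out, 1)                         # 1x1 channel mixing
        self.se = nn.Sequential(                                           # squeeze-excite gate
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(c_out, c_out // reduction, 1), nn.ReLU(),
            nn.Conv2d(c_out // reduction, c_out, 1), nn.Sigmoid(),
        )
        self.act = nn.ReLU()

    def forward(self, x):
        y = self.act(self.pointwise(self.depthwise(x)))
        return y * self.se(y)

class CrossingAidSketch(nn.Module):
    """Toy LytNetV2-style network: shared trunk, classification + direction heads."""
    def __init__(self, n_light_classes=5):
        super().__init__()
        self.trunk = nn.Sequential(SepConvSE(3, 16), SepConvSE(16, 32),
                                   nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.light_head = nn.Linear(32, n_light_classes)   # pedestrian traffic light state
        self.dir_head = nn.Linear(32, 4)                    # endpoints describing the crossing direction

    def forward(self, x):
        h = self.trunk(x)
        return self.light_head(h), self.dir_head(h)

cls_logits, direction = CrossingAidSketch()(torch.randn(1, 3, 96, 96))
```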
Batch Policy Learning under Constraints
Title | Batch Policy Learning under Constraints |
Authors | Hoang M. Le, Cameron Voloshin, Yisong Yue |
Abstract | When learning policies for real-world domains, two important questions arise: (i) how to efficiently use pre-collected off-policy, non-optimal behavior data; and (ii) how to mediate among different competing objectives and constraints. We thus study the problem of batch policy learning under multiple constraints, and offer a systematic solution. We first propose a flexible meta-algorithm that admits any batch reinforcement learning and online learning procedure as subroutines. We then present a specific algorithmic instantiation and provide performance guarantees for the main objective and all constraints. To certify constraint satisfaction, we propose a new and simple method for off-policy policy evaluation (OPE) and derive PAC-style bounds. Our algorithm achieves strong empirical results in different domains, including in a challenging problem of simulated car driving subject to multiple constraints such as lane keeping and smooth driving. We also show experimentally that our OPE method outperforms other popular OPE techniques on a standalone basis, especially in a high-dimensional setting. |
Tasks | |
Published | 2019-03-20 |
URL | http://arxiv.org/abs/1903.08738v1 |
http://arxiv.org/pdf/1903.08738v1.pdf | |
PWC | https://paperswithcode.com/paper/batch-policy-learning-under-constraints |
Repo | https://github.com/clvoloshin/constrained_batch_policy_learning |
Framework | none |
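A hedged skeleton of the Lagrangian-style meta-loop the abstract describes: an online learner updates constraint multipliers while a batch-RL oracle best-responds, and off-policy evaluation certifies the constraint values. `batch_rl_oracle` and `ope_estimate` are stand-in callables the reader must supply, and the exponentiated-gradient update is one common instantiation, not necessarily the paper's exact choice.

```python
import numpy as np

def constrained_batch_policy_learning(batch_rl_oracle, ope_estimate, thresholds,
                                      n_iters=50, lam_bound=10.0, lr=0.1):
    """Meta-loop: mixed policy vs. Lagrange multipliers over m constraints.

    batch_rl_oracle(weights) -> policy minimizing cost + weights . constraint_costs
    ope_estimate(policy)     -> (main_cost, constraint_costs) estimated off-policy
    thresholds               -> array of m constraint thresholds tau_i
    """
    m = len(thresholds)
    lam = np.full(m + 1, lam_bound / (m + 1))   # extra coordinate keeps ||lam||_1 = lam_bound
    mixture, values = [], []
    for _ in range(n_iters):
        policy = batch_rl_oracle(lam[:m])                  # best response to current multipliers
        main, constraints = ope_estimate(policy)           # certify with off-policy evaluation
        mixture.append(policy)
        values.append((main, constraints))
        # Exponentiated-gradient ascent on the multipliers: violated constraints gain weight.
        grad = np.append(np.asarray(constraints) - np.asarray(thresholds), 0.0)
        lam *= np.exp(lr * grad)
        lam *= lam_bound / lam.sum()
    return mixture, values   # the returned mixture approximates the equilibrium policy
```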
Green AI
Title | Green AI |
Authors | Roy Schwartz, Jesse Dodge, Noah A. Smith, Oren Etzioni |
Abstract | The computations required for deep learning research have been doubling every few months, resulting in an estimated 300,000x increase from 2012 to 2018 [2]. These computations have a surprisingly large carbon footprint [38]. Ironically, deep learning was inspired by the human brain, which is remarkably energy efficient. Moreover, the financial cost of the computations can make it difficult for academics, students, and researchers, in particular those from emerging economies, to engage in deep learning research. This position paper advocates a practical solution by making efficiency an evaluation criterion for research alongside accuracy and related measures. In addition, we propose reporting the financial cost or “price tag” of developing, training, and running models to provide baselines for the investigation of increasingly efficient methods. Our goal is to make AI both greener and more inclusive—enabling any inspired undergraduate with a laptop to write high-quality research papers. Green AI is an emerging focus at the Allen Institute for AI. |
Tasks | |
Published | 2019-07-22 |
URL | https://arxiv.org/abs/1907.10597v3 |
https://arxiv.org/pdf/1907.10597v3.pdf | |
PWC | https://paperswithcode.com/paper/green-ai |
Repo | https://github.com/sagabanana/-60daysofudacity |
Framework | pytorch |
Accurate and interpretable evaluation of surgical skills from kinematic data using fully convolutional neural networks
Title | Accurate and interpretable evaluation of surgical skills from kinematic data using fully convolutional neural networks |
Authors | Hassan Ismail Fawaz, Germain Forestier, Jonathan Weber, Lhassane Idoumghar, Pierre-Alain Muller |
Abstract | Purpose: Manual feedback from senior surgeons observing less experienced trainees is a laborious task that is very expensive, time-consuming and prone to subjectivity. With the number of surgical procedures increasing annually, there is an unprecedented need to provide an accurate, objective and automatic evaluation of trainees’ surgical skills in order to improve surgical practice. Methods: In this paper, we designed a convolutional neural network (CNN) to classify surgical skills by extracting latent patterns in the trainees’ motions performed during robotic surgery. The method is validated on the JIGSAWS dataset for two surgical skills evaluation tasks: classification and regression. Results: Our results show that deep neural networks constitute robust machine learning models that are able to reach new competitive state-of-the-art performance on the JIGSAWS dataset. While leveraging the efficiency of CNNs, we were able to minimize their black-box effect using the class activation map technique. Conclusions: This characteristic allowed our method to automatically pinpoint which parts of the surgery influenced the skill evaluation the most, thus allowing us to explain a surgical skill classification and provide surgeons with a novel personalized feedback technique. We believe this type of interpretable machine learning model could integrate within “Operation Room 2.0” and support novice surgeons in improving their skills to eventually become experts. |
Tasks | Interpretable Machine Learning, Surgical Skills Evaluation |
Published | 2019-08-20 |
URL | https://arxiv.org/abs/1908.07319v1 |
https://arxiv.org/pdf/1908.07319v1.pdf | |
PWC | https://paperswithcode.com/paper/accurate-and-interpretable-evaluation-of |
Repo | https://github.com/hfawaz/ijcars19 |
Framework | none |
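A hedged sketch of the pipeline the abstract describes: a 1-D fully convolutional network over multichannel kinematic time series, global average pooling, a linear skill classifier, and a class activation map (CAM) that projects the classifier weights back onto time to show which parts of the trial drove the prediction. The channel count (76 kinematic variables), layer sizes, and three skill classes are assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class SkillFCN(nn.Module):
    """1-D FCN over kinematic time series with a CAM-friendly GAP + linear head."""
    def __init__(self, n_channels=76, n_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 64, 7, padding=3), nn.ReLU(),
            nn.Conv1d(64, 64, 5, padding=2), nn.ReLU(),
        )
        self.head = nn.Linear(64, n_classes)

    def forward(self, x):                       # x: (B, n_channels, T)
        fmap = self.features(x)                 # (B, 64, T)
        logits = self.head(fmap.mean(dim=2))    # global average pooling over time
        return logits, fmap

    def class_activation_map(self, fmap, class_idx):
        # CAM: weight each feature channel by the classifier weight of the chosen class.
        w = self.head.weight[class_idx]                    # (64,)
        return torch.einsum('c,bct->bt', w, fmap)          # (B, T) importance over time

model = SkillFCN()
logits, fmap = model(torch.randn(1, 76, 500))
cam = model.class_activation_map(fmap, logits.argmax(dim=1).item())
```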
Who Blames Whom in a Crisis? Detecting Blame Ties from News Articles Using Neural Networks
Title | Who Blames Whom in a Crisis? Detecting Blame Ties from News Articles Using Neural Networks |
Authors | Shuailong Liang, Olivia Nicol, Yue Zhang |
Abstract | Blame games tend to follow major disruptions, be they financial crises, natural disasters or terrorist attacks. Studying how the blame game evolves and shapes the dominant crisis narratives is of great significance, as sense-making processes can affect regulatory outcomes, social hierarchies, and cultural norms. However, it takes tremendous time and effort for social scientists to manually examine each relevant news article and extract the blame ties (A blames B). In this study, we define a new task, Blame Tie Extraction, and construct a new dataset related to the United States financial crisis (2007-2010) from The New York Times, The Wall Street Journal and USA Today. We build a Bi-directional Long Short-Term Memory (BiLSTM) network over the contexts in which the entities appear, and it learns to automatically extract such blame ties at the document level. Leveraging large unsupervised models such as GloVe and ELMo, our best model achieves an F1 score of 70% on the test set for blame tie extraction, making it a useful tool for social scientists to extract blame ties more efficiently. |
Tasks | |
Published | 2019-04-24 |
URL | http://arxiv.org/abs/1904.10637v1 |
http://arxiv.org/pdf/1904.10637v1.pdf | |
PWC | https://paperswithcode.com/paper/who-blames-whom-in-a-crisis-detecting-blame |
Repo | https://github.com/Shuailong/BlamePipeline |
Framework | pytorch |
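A hedged sketch of the document-level setup in the abstract: encode the contexts in which an entity pair co-occurs with a BiLSTM, pool them, and classify whether A blames B. Embedding sizes and the pooling choices are assumptions; the authors' actual pipeline (with GloVe/ELMo inputs) is in the repo above.

```python
import torch
import torch.nn as nn

class BlameTieSketch(nn.Module):
    """BiLSTM over entity-pair contexts, pooled to a document-level blame decision."""
    def __init__(self, vocab_size=30000, emb=100, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb)          # would be initialized from GloVe
        self.rnn = nn.LSTM(emb, hidden, batch_first=True, bidirectional=True)
        self.clf = nn.Linear(2 * hidden, 2)               # blames / does not blame

    def forward(self, contexts):                          # contexts: (n_contexts, seq_len) token ids
        h, _ = self.rnn(self.emb(contexts))               # (n_contexts, seq_len, 2*hidden)
        pooled = h.max(dim=1).values                      # max-pool each context
        doc = pooled.mean(dim=0)                          # aggregate contexts for the entity pair
        return self.clf(doc)                              # logits for the (A blames B) decision

logits = BlameTieSketch()(torch.randint(0, 30000, (5, 40)))
```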
Image Difficulty Curriculum for Generative Adversarial Networks (CuGAN)
Title | Image Difficulty Curriculum for Generative Adversarial Networks (CuGAN) |
Authors | Petru Soviany, Claudiu Ardei, Radu Tudor Ionescu, Marius Leordeanu |
Abstract | Despite the significant advances in recent years, Generative Adversarial Networks (GANs) are still notoriously hard to train. In this paper, we propose three novel curriculum learning strategies for training GANs. All strategies are first based on ranking the training images by their difficulty scores, which are estimated by a state-of-the-art image difficulty predictor. Our first strategy is to divide images into gradually more difficult batches. Our second strategy introduces a novel curriculum loss function for the discriminator that takes into account the difficulty scores of the real images. Our third strategy is based on sampling from an evolving distribution, which favors the easier images during the initial training stages and gradually converges to a uniform distribution, in which samples are equally likely, regardless of difficulty. We compare our curriculum learning strategies with the classic training procedure on two tasks: image generation and image translation. Our experiments indicate that all strategies provide faster convergence and superior results. For example, our best curriculum learning strategy applied on spectrally normalized GANs (SNGANs) fooled human annotators into thinking that generated CIFAR-like images are real in 25.0% of the presented cases, while the SNGANs trained using the classic procedure fooled the annotators in only 18.4% of cases. Similarly, in image translation, the human annotators preferred the images produced by the Cycle-consistent GAN (CycleGAN) trained using curriculum learning in 40.5% of cases and those produced by CycleGAN based on classic training in only 19.8% of cases, with the remaining 39.7% of cases labeled as ties. |
Tasks | Image Generation |
Published | 2019-10-20 |
URL | https://arxiv.org/abs/1910.08967v2 |
https://arxiv.org/pdf/1910.08967v2.pdf | |
PWC | https://paperswithcode.com/paper/image-difficulty-curriculum-for-generative |
Repo | https://github.com/pittyacg/CurriculumSNGAN |
Framework | none |
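A hedged sketch of the third strategy from the abstract: sample training images from a difficulty-aware distribution that favors easy images early and anneals to uniform. The exponential, temperature-based schedule used here is an assumption for illustration, not necessarily the authors' exact schedule.

```python
import numpy as np

def curriculum_sampling_probs(difficulty, step, total_steps, sharpness=5.0):
    """Sampling distribution over images: easy-biased early, uniform by the end.

    difficulty: array of per-image difficulty scores (higher = harder).
    """
    progress = min(step / total_steps, 1.0)
    temperature = sharpness * (1.0 - progress)          # decays to 0, i.e. a uniform distribution
    logits = -temperature * (difficulty - difficulty.mean())
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

rng = np.random.default_rng(0)
difficulty = rng.random(1000)
for step in (0, 5000, 10000):
    p = curriculum_sampling_probs(difficulty, step, total_steps=10000)
    batch_idx = rng.choice(len(difficulty), size=64, p=p)   # indices used to draw the real batch
```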
Interpretable Generative Neural Spatio-Temporal Point Processes
Title | Interpretable Generative Neural Spatio-Temporal Point Processes |
Authors | Shixiang Zhu, Shuang Li, Yao Xie |
Abstract | We present a novel generative model for spatio-temporal correlated discrete event data. Despite the rapid development of one-dimensional point processes for temporal event data, the study of how to model spatial aspects of such discrete event data is scarce. Our proposed Neural Embedding Spatio-Temporal (NEST) point process is a probabilistic generative model that captures complex spatial influence by carefully combining statistical models with flexible neural networks for spatial information embedding. NEST also enjoys computational efficiency, high interpretability, and strong expressive capacity for complex spatio-temporal dependencies. We present two computationally efficient approaches based on maximum likelihood and imitation learning, the latter being robust to model mismatch. Experiments based on real data show the superior performance of our method relative to the state-of-the-art. |
Tasks | Imitation Learning, Point Processes |
Published | 2019-06-13 |
URL | https://arxiv.org/abs/1906.05467v2 |
https://arxiv.org/pdf/1906.05467v2.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-of-spatio-temporal |
Repo | https://github.com/meowoodie/Spatio-Temporal-Point-Process-with-Gaussian-Mixture-Diffusion-Kernel |
Framework | tf |
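A hedged NumPy sketch of a spatio-temporal self-exciting intensity in the spirit of NEST: each past event contributes an exponentially decaying temporal kernel times a Gaussian spatial kernel, and the log-likelihood sums log-intensities at events minus an (approximate) compensator. In the paper the kernel parameters come from neural embeddings of location; here they are fixed scalars, which is purely an assumption for illustration.

```python
import numpy as np

def intensity(t, s, events, mu=0.1, alpha=0.5, beta=1.0, sigma=0.2):
    """lambda(t, s | history): background + sum of space-time kernels over past events.

    events: array of shape (n, 3) with columns (t_i, x_i, y_i), sorted by time.
    """
    lam = mu
    for ti, xi, yi in events:
        if ti >= t:
            break
        dt = t - ti
        dist2 = (s[0] - xi) ** 2 + (s[1] - yi) ** 2
        spatial = np.exp(-dist2 / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)
        lam += alpha * beta * np.exp(-beta * dt) * spatial
    return lam

def log_likelihood(events, T, mu=0.1, alpha=0.5, beta=1.0, sigma=0.2):
    """Sum of log-intensities at observed events minus the compensator over [0, T]."""
    ll = 0.0
    for i, (ti, xi, yi) in enumerate(events):
        ll += np.log(intensity(ti, (xi, yi), events[:i], mu, alpha, beta, sigma))
        # Each event's spatial kernel integrates to 1, so its compensator term is
        # alpha * (1 - exp(-beta * (T - t_i))).
        ll -= alpha * (1.0 - np.exp(-beta * (T - ti)))
    ll -= mu * T   # background term (per unit area; a simplifying assumption)
    return ll

events = np.array([[0.5, 0.1, 0.2], [1.2, 0.15, 0.25], [2.0, -0.3, 0.4]])
print(log_likelihood(events, T=3.0))
```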
COBRA: Data-Efficient Model-Based RL through Unsupervised Object Discovery and Curiosity-Driven Exploration
Title | COBRA: Data-Efficient Model-Based RL through Unsupervised Object Discovery and Curiosity-Driven Exploration |
Authors | Nicholas Watters, Loic Matthey, Matko Bosnjak, Christopher P. Burgess, Alexander Lerchner |
Abstract | Data efficiency and robustness to task-irrelevant perturbations are long-standing challenges for deep reinforcement learning algorithms. Here we introduce a modular approach to addressing these challenges in a continuous control environment, without using hand-crafted or supervised information. Our Curious Object-Based seaRch Agent (COBRA) uses task-free intrinsically motivated exploration and unsupervised learning to build object-based models of its environment and action space. Subsequently, it can learn a variety of tasks through model-based search in very few steps and excel on structured hold-out tests of policy robustness. |
Tasks | Continuous Control |
Published | 2019-05-22 |
URL | https://arxiv.org/abs/1905.09275v2 |
https://arxiv.org/pdf/1905.09275v2.pdf | |
PWC | https://paperswithcode.com/paper/cobra-data-efficient-model-based-rl-through |
Repo | https://github.com/deepmind/spriteworld |
Framework | none |
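COBRA's components (scene decomposition, a learned transition model, an exploration policy, and model-based search) are not released as a single implementation; the repo above is the Spriteworld environment. Below is a heavily hedged, generic sketch of only the last step, model-based search: score candidate actions with a learned transition model and reward predictor and act greedily. `transition_model`, `reward_model`, and `action_sampler` are stand-ins the reader would learn or define beforehand.

```python
import numpy as np

def model_based_search(state, transition_model, reward_model, action_sampler,
                       n_candidates=64, horizon=1, rng=None):
    """Short-horizon search: pick the candidate action whose imagined rollout under
    the learned model yields the highest predicted return."""
    if rng is None:
        rng = np.random.default_rng()
    best_action, best_return = None, -np.inf
    for _ in range(n_candidates):
        actions = [action_sampler(rng) for _ in range(horizon)]
        s, total = state, 0.0
        for a in actions:
            s = transition_model(s, a)          # imagined next (object-based) state
            total += reward_model(s)            # predicted reward of the imagined state
        if total > best_return:
            best_return, best_action = total, actions[0]
    return best_action
```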
VINE: Visualizing Statistical Interactions in Black Box Models
Title | VINE: Visualizing Statistical Interactions in Black Box Models |
Authors | Matthew Britton |
Abstract | As machine learning becomes more pervasive, there is an urgent need for interpretable explanations of predictive models. Prior work has developed effective methods for visualizing global model behavior, as well as generating local (instance-specific) explanations. However, relatively little work has addressed regional explanations - how groups of similar instances behave in a complex model, and the related issue of visualizing statistical feature interactions. The lack of utilities available for these analytical needs hinders the development of models that are mission-critical, transparent, and align with social goals. We present VINE (Visual INteraction Effects), a novel algorithm to extract and visualize statistical interaction effects in black box models. We also present a novel evaluation metric for visualizations in the interpretable ML space. |
Tasks | |
Published | 2019-04-01 |
URL | http://arxiv.org/abs/1904.00561v1 |
http://arxiv.org/pdf/1904.00561v1.pdf | |
PWC | https://paperswithcode.com/paper/vine-visualizing-statistical-interactions-in |
Repo | https://github.com/MattJBritton/VINE |
Framework | none |
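A hedged sketch of the core computation behind regional explanations of the kind VINE visualizes: compute individual conditional expectation (ICE) curves for one feature, then cluster the centered curves to surface groups of instances that respond differently, which is evidence of a feature interaction. The clustering choice (k-means) and the synthetic data are assumptions; VINE's actual algorithm and evaluation metric are described in the paper.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestRegressor

def ice_curves(model, X, feature, grid):
    """ICE: for every instance, sweep one feature over `grid` and record the prediction."""
    curves = np.empty((X.shape[0], len(grid)))
    for j, v in enumerate(grid):
        Xv = X.copy()
        Xv[:, feature] = v
        curves[:, j] = model.predict(Xv)
    return curves

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = X[:, 0] * (X[:, 1] > 0) + 0.1 * rng.normal(size=500)    # interaction between features 0 and 1
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

grid = np.linspace(-2, 2, 20)
curves = ice_curves(model, X, feature=0, grid=grid)
centered = curves - curves[:, [0]]                           # center each curve at its left endpoint
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(centered)
# The two curve clusters here track the sign of feature 1: a regional interaction effect.
```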
Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras
Title | Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras |
Authors | Ariel Gordon, Hanhan Li, Rico Jonschkowski, Anelia Angelova |
Abstract | We present a novel method for simultaneous learning of depth, egomotion, object motion, and camera intrinsics from monocular videos, using only consistency across neighboring video frames as supervision signal. Similarly to prior work, our method learns by applying differentiable warping to frames and comparing the result to adjacent ones, but it provides several improvements: We address occlusions geometrically and differentiably, directly using the depth maps as predicted during training. We introduce randomized layer normalization, a novel powerful regularizer, and we account for object motion relative to the scene. To the best of our knowledge, our work is the first to learn the camera intrinsic parameters, including lens distortion, from video in an unsupervised manner, thereby allowing us to extract accurate depth and motion from arbitrary videos of unknown origin at scale. We evaluate our results on the Cityscapes, KITTI and EuRoC datasets, establishing new state of the art on depth prediction and odometry, and demonstrate qualitatively that depth prediction can be learned from a collection of YouTube videos. |
Tasks | Depth Estimation |
Published | 2019-04-10 |
URL | http://arxiv.org/abs/1904.04998v1 |
http://arxiv.org/pdf/1904.04998v1.pdf | |
PWC | https://paperswithcode.com/paper/depth-from-videos-in-the-wild-unsupervised |
Repo | https://github.com/robot-love/depth_from_video_in_the_wild |
Framework | tf |
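A hedged sketch of the view-synthesis supervision at the heart of the method: use predicted depth, camera intrinsics, and relative pose to reproject one frame onto another and compare photometrically. The geometric occlusion handling, learned lens distortion, and randomized layer normalization from the paper are omitted; shapes and the simple L1 loss are assumptions.

```python
import torch
import torch.nn.functional as F

def reprojection_loss(src, tgt, depth_tgt, K, R, t):
    """Warp `src` into the target view using target depth and relative pose (R, t),
    then take an L1 photometric loss against `tgt`."""
    b, _, h, w = tgt.shape
    ys, xs = torch.meshgrid(torch.arange(h, dtype=torch.float32),
                            torch.arange(w, dtype=torch.float32), indexing='ij')
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).reshape(3, -1)   # homogeneous pixels
    # Back-project target pixels to 3-D, move them into the source frame, project again.
    cam = torch.linalg.inv(K) @ pix                                  # rays in the target camera
    pts = depth_tgt.reshape(b, 1, -1) * cam.unsqueeze(0)             # (b, 3, h*w) 3-D points
    proj = K @ (R @ pts + t.reshape(b, 3, 1))                        # into source pixel coordinates
    u = proj[:, 0] / proj[:, 2].clamp(min=1e-6)
    v = proj[:, 1] / proj[:, 2].clamp(min=1e-6)
    grid = torch.stack([2 * u / (w - 1) - 1, 2 * v / (h - 1) - 1], dim=-1).reshape(b, h, w, 2)
    warped = F.grid_sample(src, grid, align_corners=True)            # src resampled into target view
    return (warped - tgt).abs().mean()

K = torch.eye(3); K[0, 0] = K[1, 1] = 100.0; K[0, 2] = 32.0; K[1, 2] = 32.0
loss = reprojection_loss(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64),
                         torch.rand(1, 1, 64, 64) + 1.0, K,
                         torch.eye(3).unsqueeze(0), torch.zeros(1, 3))
```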
Formal derivation of Mesh Neural Networks with their Forward-Only gradient Propagation
Title | Formal derivation of Mesh Neural Networks with their Forward-Only gradient Propagation |
Authors | Federico A. Galatolo, Mario G. C. A. Cimino, Gigliola Vaglini |
Abstract | This paper proposes the Mesh Neural Network (MNN), a novel architecture which allows neurons to be connected in any topology in order to efficiently route information. In MNNs, information is propagated between neurons through a state transition function. State and error gradients are then directly computed from state updates without backward computation. The MNN architecture and the error propagation schema are formalized and derived in tensor algebra. The proposed computational model can fully supply a gradient descent process and is suitable for very large scale NNs, owing to its expressivity and training efficiency with respect to NNs based on back-propagation and computational graphs. |
Tasks | |
Published | 2019-05-16 |
URL | https://arxiv.org/abs/1905.06684v3 |
https://arxiv.org/pdf/1905.06684v3.pdf | |
PWC | https://paperswithcode.com/paper/formal-derivation-of-mesh-neural-networks |
Repo | https://github.com/galatolofederico/mesh-neural-networks |
Framework | none |
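A hedged sketch of the forward dynamics only: an MNN keeps one state vector over all neurons (inputs, hidden, and outputs alike) and repeatedly applies a state-transition function through an arbitrary weighted adjacency matrix. The activation, clamping scheme, and sparsity pattern are assumptions, and the forward-only gradient computation derived in the paper is not reproduced here.

```python
import numpy as np

def mnn_forward(adjacency, x_in, n_in, n_out, steps=5):
    """Run a mesh neural network forward: state[t+1] = act(W @ state[t]),
    with input neurons clamped to x_in and outputs read from the last neurons.

    adjacency: (n, n) weight matrix over all neurons; any topology is allowed.
    """
    state = np.zeros(adjacency.shape[0])
    for _ in range(steps):
        state[:n_in] = x_in                       # keep input neurons clamped
        state = np.tanh(adjacency @ state)        # state-transition function
    return state[-n_out:]                         # read the output neurons

rng = np.random.default_rng(0)
W = rng.normal(scale=0.3, size=(10, 10)) * (rng.random((10, 10)) < 0.4)  # sparse, arbitrary topology
print(mnn_forward(W, x_in=np.array([1.0, -0.5]), n_in=2, n_out=1))
```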
Probabilistic Formulation of the Take The Best Heuristic
Title | Probabilistic Formulation of the Take The Best Heuristic |
Authors | Tomi Peltola, Jussi Jokinen, Samuel Kaski |
Abstract | The framework of cognitively bounded rationality treats problem solving as fundamentally rational, but emphasises that it is constrained by cognitive architecture and the task environment. This paper investigates a simple decision making heuristic, Take The Best (TTB), within that framework. We formulate TTB as a likelihood-based probabilistic model, where the decision strategy arises by probabilistic inference based on the training data and the model constraints. The strengths of the probabilistic formulation, in addition to providing a bounded rational account of the learning of the heuristic, include natural extensibility with additional cognitively plausible constraints and prior information, and the possibility to embed the heuristic as a subpart of a larger probabilistic model. We extend the model to learn cue discrimination thresholds for continuous-valued cues and experiment with using the model to account for biased preference feedback from a boundedly rational agent in a simulated interactive machine learning task. |
Tasks | Decision Making |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00572v1 |
https://arxiv.org/pdf/1911.00572v1.pdf | |
PWC | https://paperswithcode.com/paper/probabilistic-formulation-of-the-take-the |
Repo | https://github.com/to-mi/pttb |
Framework | none |
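For reference, a hedged sketch of the classic, deterministic Take The Best heuristic that the paper recasts probabilistically: order cues by validity and decide with the first cue that discriminates between the two options. The probabilistic, likelihood-based formulation itself is in the repo above; the cue names and validities below are made up for illustration.

```python
def take_the_best(option_a, option_b, cue_validities):
    """Return 'A', 'B', or 'guess' using the first discriminating cue.

    option_a, option_b: dicts mapping cue name -> binary cue value (1 = present).
    cue_validities:     dict mapping cue name -> validity in (0.5, 1.0].
    """
    for cue in sorted(cue_validities, key=cue_validities.get, reverse=True):
        a, b = option_a.get(cue, 0), option_b.get(cue, 0)
        if a != b:                       # this cue discriminates: take the best, ignore the rest
            return 'A' if a > b else 'B'
    return 'guess'                       # no cue discriminates

validities = {'capital': 0.9, 'airport': 0.8, 'team': 0.7}
print(take_the_best({'capital': 1, 'team': 0}, {'capital': 0, 'airport': 1}, validities))  # -> 'A'
```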
Object-Contextual Representations for Semantic Segmentation
Title | Object-Contextual Representations for Semantic Segmentation |
Authors | Yuhui Yuan, Xilin Chen, Jingdong Wang |
Abstract | In this paper, we address the semantic segmentation problem with a focus on the context aggregation strategy. Our motivation is that the label of a pixel is the category of the object that the pixel belongs to. We present a simple yet effective approach, object-contextual representations, characterizing a pixel by exploiting the representation of the corresponding object class. First, we learn object regions under the supervision of the ground-truth segmentation. Second, we compute the object region representation by aggregating the representations of the pixels lying in the object region. Last, we compute the relation between each pixel and each object region, and augment the representation of each pixel with the object-contextual representation, which is a weighted aggregation of all the object region representations according to their relations with the pixel. We empirically demonstrate that the proposed approach achieves competitive performance on various challenging semantic segmentation benchmarks: Cityscapes, ADE20K, LIP, PASCAL-Context, and COCO-Stuff. |
Tasks | Semantic Segmentation |
Published | 2019-09-24 |
URL | https://arxiv.org/abs/1909.11065v2 |
https://arxiv.org/pdf/1909.11065v2.pdf | |
PWC | https://paperswithcode.com/paper/object-contextual-representations-for |
Repo | https://github.com/PkuRainBow/OCNet |
Framework | pytorch |
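A hedged sketch of the aggregation described in the abstract: soft object-region representations are computed as mask-weighted sums of pixel features, each pixel attends to those region representations, and the resulting object-contextual vector is concatenated back onto the pixel feature. OCR's projection layers and exact dimensions are omitted, and the scaled-softmax relation is an assumption for illustration.

```python
import torch
import torch.nn.functional as F

def object_contextual_representations(pixel_feats, region_logits):
    """pixel_feats: (B, C, H, W) backbone features.
    region_logits: (B, K, H, W) coarse per-pixel scores for K object regions."""
    b, c, h, w = pixel_feats.shape
    feats = pixel_feats.flatten(2)                               # (B, C, HW)
    masks = F.softmax(region_logits.flatten(2), dim=-1)          # soft assignment of pixels to regions
    # 1) Object region representations: mask-weighted aggregation of pixel features.
    regions = torch.einsum('bkn,bcn->bkc', masks, feats)         # (B, K, C)
    # 2) Pixel-region relation: similarity between each pixel and each region representation.
    relation = F.softmax(torch.einsum('bcn,bkc->bnk', feats, regions) / c ** 0.5, dim=-1)
    # 3) Object-contextual representation: relation-weighted sum of region representations.
    context = torch.einsum('bnk,bkc->bcn', relation, regions)    # (B, C, HW)
    return torch.cat([feats, context], dim=1).view(b, 2 * c, h, w)  # fed to the final classifier

out = object_contextual_representations(torch.randn(2, 64, 32, 32), torch.randn(2, 19, 32, 32))
```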