Paper Group ANR 31
Coarse-Refinement Dilemma: On Generalization Bounds for Data Clustering
Title | Coarse-Refinement Dilemma: On Generalization Bounds for Data Clustering |
Authors | Yule Vaz, Rodrigo Fernandes de Mello, Carlos Henrique Grossi |
Abstract | The Data Clustering (DC) problem is of central importance to Machine Learning (ML), given its usefulness in representing structural similarities of data from input spaces. Unlike Supervised Machine Learning (SML), which relies on the theoretical frameworks of Statistical Learning Theory (SLT) and Algorithm Stability (AS), DC has scarce literature on general-purpose learning guarantees, which hinders conclusive remarks on how those algorithms should be designed as well as on the validity of their results. In this context, this manuscript introduces a new concept, based on multidimensional persistent homology, to analyze the conditions under which a clustering model is capable of generalizing data. As a first step, we propose a more general definition of the DC problem by relying on Topological Spaces, instead of metric ones as typically approached in the literature. From that, we show that the DC problem presents a dilemma analogous to the Bias-Variance one, which is here referred to as the Coarse-Refinement (CR) dilemma. CR is intended to clarify the contrast between: (i) highly-refined partitions and clustering instability (overfitting); and (ii) over-coarse partitions and the lack of representativeness (underfitting); consequently, the CR dilemma suggests the need for a relaxation of Kleinberg’s richness axiom. Experimental results were used to illustrate that multidimensional persistent homology supports the measurement of divergences among DC models, leading to a consistency criterion. |
Tasks | |
Published | 2019-11-13 |
URL | https://arxiv.org/abs/1911.05806v1 |
https://arxiv.org/pdf/1911.05806v1.pdf | |
PWC | https://paperswithcode.com/paper/coarse-refinement-dilemma-on-generalization |
Repo | |
Framework | |
Fine-Grained Spoiler Detection from Large-Scale Review Corpora
Title | Fine-Grained Spoiler Detection from Large-Scale Review Corpora |
Authors | Mengting Wan, Rishabh Misra, Ndapa Nakashole, Julian McAuley |
Abstract | This paper presents computational approaches for automatically detecting critical plot twists in reviews of media products. First, we created a large-scale book review dataset that includes fine-grained spoiler annotations at the sentence-level, as well as book and (anonymized) user information. Second, we carefully analyzed this dataset, and found that: spoiler language tends to be book-specific; spoiler distributions vary greatly across books and review authors; and spoiler sentences tend to jointly appear in the latter part of reviews. Third, inspired by these findings, we developed an end-to-end neural network architecture to detect spoiler sentences in review corpora. Quantitative and qualitative results demonstrate that the proposed method substantially outperforms existing baselines. |
Tasks | |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1905.13416v1 |
https://arxiv.org/pdf/1905.13416v1.pdf | |
PWC | https://paperswithcode.com/paper/fine-grained-spoiler-detection-from-large |
Repo | |
Framework | |
Almost Sure Asymptotic Freeness of Neural Network Jacobian with Orthogonal Weights
Title | Almost Sure Asymptotic Freeness of Neural Network Jacobian with Orthogonal Weights |
Authors | Tomohiro Hayase |
Abstract | A well-conditioned Jacobian spectrum plays a vital role in preventing exploding or vanishing gradients and in speeding up the learning of deep neural networks. Free probability theory helps us to understand and handle the Jacobian spectrum. We rigorously show almost sure asymptotic freeness of layer-wise Jacobians of deep neural networks in the wide limit. In particular, we treat the case in which weights are initialized as Haar distributed orthogonal matrices. |
Tasks | |
Published | 2019-08-11 |
URL | https://arxiv.org/abs/1908.03901v4 |
https://arxiv.org/pdf/1908.03901v4.pdf | |
PWC | https://paperswithcode.com/paper/almost-surely-asymptotic-freeness-for |
Repo | |
Framework | |
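As a small illustration of the initialization named in the abstract above, the following sketch samples a Haar distributed orthogonal matrix using only the Python standard library. It relies on the standard fact that Gram-Schmidt applied to i.i.d. standard Gaussian vectors yields a Haar distributed orthogonal matrix. The function names `haar_orthogonal` and `gram` are hypothetical and not taken from the paper, and the matrix size of 6 is an arbitrary choice for the demonstration.

```python
import math
import random


def haar_orthogonal(n, seed=0):
    """Return n orthonormal rows (list of lists) drawn from the Haar measure on O(n).

    Gram-Schmidt on i.i.d. standard Gaussian vectors is equivalent to a QR
    decomposition with a positive diagonal in R, which gives a Haar distributed Q.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Remove the components along the rows accepted so far (modified Gram-Schmidt).
        for u in rows:
            dot = sum(a * b for a, b in zip(v, u))
            v = [a - dot * b for a, b in zip(v, u)]
        norm = math.sqrt(sum(a * a for a in v))
        rows.append([a / norm for a in v])
    return rows


def gram(rows):
    """Return Q Q^T, which should be the identity for an orthogonal Q."""
    return [[sum(a * b for a, b in zip(r, s)) for s in rows] for r in rows]


n = 6  # arbitrary small size for the demonstration
Q = haar_orthogonal(n)
G = gram(Q)
max_err = max(
    abs(G[i][j] - (1.0 if i == j else 0.0)) for i in range(n) for j in range(n)
)
print(f"max deviation of Q Q^T from the identity: {max_err:.2e}")
```

All singular values of such a matrix equal 1, which is one reason orthogonal weights are attractive when a well-conditioned Jacobian spectrum is the goal.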
Multi-Spectral Visual Odometry without Explicit Stereo Matching
Title | Multi-Spectral Visual Odometry without Explicit Stereo Matching |
Authors | Weichen Dai, Yu Zhang, Donglei Sun, Naira Hovakimyan, Ping Li |
Abstract | Multi-spectral sensors consisting of a standard (visible-light) camera and a long-wave infrared camera can simultaneously provide both visible and thermal images. Since thermal images are independent from environmental illumination, they can help to overcome certain limitations of standard cameras under complicated illumination conditions. However, due to the difference in the information source of the two types of cameras, their images usually share very low texture similarity. Hence, traditional texture-based feature matching methods cannot be directly applied to obtain stereo correspondences. To tackle this problem, a multi-spectral visual odometry method without explicit stereo matching is proposed in this paper. Bundle adjustment of multi-view stereo is performed on the visible and the thermal images using direct image alignment. Scale drift can be avoided by additional temporal observations of map points with the fixed-baseline stereo. Experimental results indicate that the proposed method can provide accurate visual odometry results with recovered metric scale. Moreover, the proposed method can also provide a metric 3D reconstruction in semi-dense density with multi-spectral information, which is not available from existing multi-spectral methods. |
Tasks | 3D Reconstruction, Stereo Matching, Stereo Matching Hand, Visual Odometry |
Published | 2019-08-23 |
URL | https://arxiv.org/abs/1908.08814v1 |
https://arxiv.org/pdf/1908.08814v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-spectral-visual-odometry-without |
Repo | |
Framework | |
If MaxEnt RL is the Answer, What is the Question?
Title | If MaxEnt RL is the Answer, What is the Question? |
Authors | Benjamin Eysenbach, Sergey Levine |
Abstract | Experimentally, it has been observed that humans and animals often make decisions that do not maximize their expected utility, but rather choose outcomes randomly, with probability proportional to expected utility. Probability matching, as this strategy is called, is equivalent to maximum entropy reinforcement learning (MaxEnt RL). However, MaxEnt RL does not optimize expected utility. In this paper, we formally show that MaxEnt RL does optimally solve certain classes of control problems with variability in the reward function. In particular, we show (1) that MaxEnt RL can be used to solve a certain class of POMDPs, and (2) that MaxEnt RL is equivalent to a two-player game where an adversary chooses the reward function. These results suggest a deeper connection between MaxEnt RL, robust control, and POMDPs, and provide insight for the types of problems for which we might expect MaxEnt RL to produce effective solutions. Specifically, our results suggest that domains with uncertainty in the task goal may be especially well-suited for MaxEnt RL methods. |
Tasks | |
Published | 2019-10-04 |
URL | https://arxiv.org/abs/1910.01913v1 |
https://arxiv.org/pdf/1910.01913v1.pdf | |
PWC | https://paperswithcode.com/paper/if-maxent-rl-is-the-answer-what-is-the |
Repo | |
Framework | |
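The probability matching strategy described in the abstract above can be sketched in a few lines. This is only an illustration of the decision rule, not of the paper's formal results: the utility values and the function names `probability_matching` and `utility_maximizing` are assumptions made for the example.

```python
import random
from collections import Counter


def probability_matching(utilities, rng):
    """Pick an index with probability proportional to its non-negative expected utility."""
    return rng.choices(range(len(utilities)), weights=utilities, k=1)[0]


def utility_maximizing(utilities):
    """Pick the index with the highest expected utility."""
    return max(range(len(utilities)), key=utilities.__getitem__)


utilities = [1.0, 3.0, 6.0]  # hypothetical expected utilities of three outcomes
rng = random.Random(0)
num_trials = 10_000

counts = Counter(probability_matching(utilities, rng) for _ in range(num_trials))
freqs = [counts[i] / num_trials for i in range(len(utilities))]

print("utility maximizer always picks outcome", utility_maximizing(utilities))
print("probability matching frequencies:", [round(f, 3) for f in freqs])
```

The maximizer commits to a single outcome, whereas probability matching spreads its choices roughly in the ratio 1:3:6 of the utilities.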
Kinematic Synthesis of Parallel Manipulator via Neural Network Approach
Title | Kinematic Synthesis of Parallel Manipulator via Neural Network Approach |
Authors | J. Ghasemi, R. Moradinezhad, M. A. Hosseini |
Abstract | In this research, Artificial Neural Networks (ANNs) have been used as a powerful tool to solve the inverse kinematic equations of a parallel robot. For this purpose, we have developed the kinematic equations of a Tricept parallel kinematic mechanism with two rotational and one translational degrees of freedom (DoF). Using the analytical method, the inverse kinematic equations are solved for a specific trajectory and used as inputs for the applied ANNs. The results of both applied networks (Multi-Layer Perceptron and Radial Basis Function) satisfied the required performance in solving complex inverse kinematics with proper accuracy and speed. |
Tasks | |
Published | 2019-04-08 |
URL | http://arxiv.org/abs/1904.04668v1 |
http://arxiv.org/pdf/1904.04668v1.pdf | |
PWC | https://paperswithcode.com/paper/kinematic-synthesis-of-parallel-manipulator |
Repo | |
Framework | |
Sequential Adversarial Learning for Self-Supervised Deep Visual Odometry
Title | Sequential Adversarial Learning for Self-Supervised Deep Visual Odometry |
Authors | Shunkai Li, Fei Xue, Xin Wang, Zike Yan, Hongbin Zha |
Abstract | We propose a self-supervised learning framework for visual odometry (VO) that incorporates the correlation of consecutive frames and takes advantage of adversarial learning. Previous methods tackle self-supervised VO as a local structure from motion (SfM) problem that recovers depth from a single image and relative poses from image pairs by minimizing the photometric loss between warped and captured images. As single-view depth estimation is an ill-posed problem, and photometric loss is incapable of discriminating distortion artifacts of warped images, the estimated depth is vague and the pose is inaccurate. In contrast to previous methods, our framework learns a compact representation of frame-to-frame correlation, which is updated by incorporating sequential information. The updated representation is used for depth estimation. In addition, we tackle VO as a self-supervised image generation task and take advantage of Generative Adversarial Networks (GANs). The generator learns to estimate depth and pose to generate a warped target image. The discriminator evaluates the quality of the generated image with high-level structural perception, which overcomes the problem of pixel-wise loss in previous methods. Experiments on the KITTI and Cityscapes datasets show that our method obtains more accurate depth with details preserved, and that the predicted pose significantly outperforms state-of-the-art self-supervised methods. |
Tasks | Depth Estimation, Image Generation, Visual Odometry |
Published | 2019-08-23 |
URL | https://arxiv.org/abs/1908.08704v1 |
https://arxiv.org/pdf/1908.08704v1.pdf | |
PWC | https://paperswithcode.com/paper/sequential-adversarial-learning-for-self |
Repo | |
Framework | |
Evolutionary Trigger Set Generation for DNN Black-Box Watermarking
Title | Evolutionary Trigger Set Generation for DNN Black-Box Watermarking |
Authors | Jia Guo, Miodrag Potkonjak |
Abstract | The commercialization of deep learning creates a compelling need for intellectual property (IP) protection. Deep neural network (DNN) watermarking has been proposed as a promising tool to help model owners prove ownership and fight piracy. A popular approach to watermarking is to train a DNN to recognize images with certain trigger patterns. In this paper, we propose a novel evolutionary algorithm-based method to generate and optimize trigger patterns. Our method brings a significant reduction in false positive rates, leading to compelling proof of ownership. At the same time, it maintains the robustness of the watermark against attacks. We compare our method with the prior art and demonstrate its effectiveness on popular models and datasets. |
Tasks | |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04411v1 |
https://arxiv.org/pdf/1906.04411v1.pdf | |
PWC | https://paperswithcode.com/paper/evolutionary-trigger-set-generation-for-dnn |
Repo | |
Framework | |
Slice-based Learning: A Programming Model for Residual Learning in Critical Data Slices
Title | Slice-based Learning: A Programming Model for Residual Learning in Critical Data Slices |
Authors | Vincent S. Chen, Sen Wu, Zhenzhen Weng, Alexander Ratner, Christopher Ré |
Abstract | In real-world machine learning applications, data subsets correspond to especially critical outcomes: vulnerable cyclist detections are safety-critical in an autonomous driving task, and “question” sentences might be important to a dialogue agent’s language understanding for product purposes. While machine learning models can achieve high quality performance on coarse-grained metrics like F1-score and overall accuracy, they may underperform on critical subsets—we define these as slices, the key abstraction in our approach. To address slice-level performance, practitioners often train separate “expert” models on slice subsets or use multi-task hard parameter sharing. We propose Slice-based Learning, a new programming model in which the slicing function (SF), a programming interface, specifies critical data subsets for which the model should commit additional capacity. Any model can leverage SFs to learn slice expert representations, which are combined with an attention mechanism to make slice-aware predictions. We show that our approach maintains a parameter-efficient representation while improving over baselines by up to 19.0 F1 on slices and 4.6 F1 overall on datasets spanning language understanding (e.g. SuperGLUE), computer vision, and production-scale industrial systems. |
Tasks | Autonomous Driving |
Published | 2019-09-13 |
URL | https://arxiv.org/abs/1909.06349v2 |
https://arxiv.org/pdf/1909.06349v2.pdf | |
PWC | https://paperswithcode.com/paper/slice-based-learning-a-programming-model-for |
Repo | |
Framework | |
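To make the slicing function (SF) idea from the abstract above concrete, here is a minimal sketch in which SFs are plain Python predicates over examples, used only to report per-slice accuracy next to the overall score. It does not implement the slice expert representations or the attention mechanism described in the abstract. The SF names, the toy examples, and the predictions are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Example = Dict[str, str]
SlicingFunction = Callable[[Example], bool]


def sf_question(example: Example) -> bool:
    """Hypothetical SF: flag sentences that look like questions."""
    return example["text"].strip().endswith("?")


def sf_short(example: Example) -> bool:
    """Hypothetical SF: flag very short utterances (fewer than 4 tokens)."""
    return len(example["text"].split()) < 4


def accuracy(examples: Sequence[Example], predictions: Sequence[str]) -> float:
    """Fraction of examples whose predicted label matches the gold label."""
    correct = sum(ex["label"] == pred for ex, pred in zip(examples, predictions))
    return correct / len(examples)


def slice_report(
    examples: Sequence[Example],
    predictions: Sequence[str],
    sfs: List[SlicingFunction],
) -> Dict[str, float]:
    """Accuracy restricted to the examples selected by each slicing function."""
    report = {}
    for sf in sfs:
        members = [(ex, pred) for ex, pred in zip(examples, predictions) if sf(ex)]
        if members:  # skip slices that select no example
            exs, preds = zip(*members)
            report[sf.__name__] = accuracy(exs, preds)
    return report


# Toy data and predictions, invented for this illustration.
examples = [
    {"text": "Where is my order?", "label": "question"},
    {"text": "Cancel it", "label": "command"},
    {"text": "Can you help me?", "label": "question"},
    {"text": "Thanks a lot for the quick reply", "label": "statement"},
]
predictions = ["question", "command", "statement", "statement"]

overall = accuracy(examples, predictions)
report = slice_report(examples, predictions, [sf_question, sf_short])
print("overall accuracy:", overall)
print("per-slice accuracy:", report)
```

In this toy case the overall accuracy looks acceptable while the question slice lags behind, which is the kind of gap between coarse-grained metrics and critical subsets that the abstract describes.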
Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks
Title | Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks |
Authors | Sanjeev Arora, Simon S. Du, Wei Hu, Zhiyuan Li, Ruosong Wang |
Abstract | Recent works have cast some light on the mystery of why deep nets fit any data and generalize despite being very overparametrized. This paper analyzes training and generalization for a simple 2-layer ReLU net with random initialization, and provides the following improvements over recent works: (i) Using a tighter characterization of training speed than recent papers, an explanation for why training a neural net with random labels leads to slower training, as originally observed in [Zhang et al. ICLR’17]. (ii) Generalization bound independent of network size, using a data-dependent complexity measure. Our measure distinguishes clearly between random labels and true labels on MNIST and CIFAR, as shown by experiments. Moreover, recent papers require sample complexity to increase (slowly) with the size, while our sample complexity is completely independent of the network size. (iii) Learnability of a broad class of smooth functions by 2-layer ReLU nets trained via gradient descent. The key idea is to track dynamics of training and generalization via properties of a related kernel. |
Tasks | |
Published | 2019-01-24 |
URL | https://arxiv.org/abs/1901.08584v2 |
https://arxiv.org/pdf/1901.08584v2.pdf | |
PWC | https://paperswithcode.com/paper/fine-grained-analysis-of-optimization-and |
Repo | |
Framework | |
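For reference, the data-dependent complexity measure mentioned in item (ii) of the abstract above can be written out as below. The formula is recalled from the paper itself rather than from this abstract, so it should be read as a hedged restatement. It assumes $n$ training inputs $\mathbf{x}_i$ of unit norm with label vector $\mathbf{y} = (y_1, \dots, y_n)$, and $\mathbf{H}^{\infty}$ denotes the Gram matrix of the kernel related to the 2-layer ReLU net.

```latex
\[
  \mathbf{H}^{\infty}_{ij}
    = \mathbb{E}_{\mathbf{w}\sim\mathcal{N}(\mathbf{0},\mathbf{I})}
      \bigl[\mathbf{x}_i^{\top}\mathbf{x}_j\,
      \mathbb{1}\{\mathbf{w}^{\top}\mathbf{x}_i \ge 0,\ \mathbf{w}^{\top}\mathbf{x}_j \ge 0\}\bigr]
    = \frac{\mathbf{x}_i^{\top}\mathbf{x}_j\,\bigl(\pi - \arccos(\mathbf{x}_i^{\top}\mathbf{x}_j)\bigr)}{2\pi},
  \qquad
  \text{complexity}
    = \sqrt{\frac{2\,\mathbf{y}^{\top}(\mathbf{H}^{\infty})^{-1}\mathbf{y}}{n}} .
\]
```

Neither expression involves the number of hidden units, which matches the abstract's claim of a bound independent of network size.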
Artificial Intelligence and the Future of Psychiatry: Qualitative Findings from a Global Physician Survey
Title | Artificial Intelligence and the Future of Psychiatry: Qualitative Findings from a Global Physician Survey |
Authors | Charlotte Blease, Cosima Locher, Marisa Leon-Carlyle, P. Murali Doraiswamy |
Abstract | The potential for machine learning to disrupt the medical profession is the subject of ongoing debate within biomedical informatics. This study aimed to explore psychiatrists’ opinions about the potential impact of innovations in artificial intelligence and machine learning on psychiatric practice. In Spring 2019, we conducted a web-based survey of 791 psychiatrists from 22 countries worldwide. The survey measured opinions about the likelihood that future technology would fully replace physicians in performing ten key psychiatric tasks. This study involved qualitative descriptive analysis of written responses to three open-ended questions in the survey. Comments were classified into four major categories in relation to the impact of future technology on patient-psychiatrist interactions, the quality of patient medical care, the profession of psychiatry, and health systems. Overwhelmingly, psychiatrists were skeptical that technology could fully replace human empathy. Many predicted that ‘man and machine’ would increasingly collaborate in undertaking clinical decisions, with mixed opinions about the benefits and harms of such an arrangement. Participants were optimistic that technology might improve efficiencies and access to care, and reduce costs. Ethical and regulatory considerations received limited attention. This study presents timely information on psychiatrists’ views about the scope of artificial intelligence and machine learning in psychiatric practice. Psychiatrists expressed divergent views about the value and impact of future technology, with worrying omissions about practice guidelines and ethical and regulatory issues. |
Tasks | |
Published | 2019-10-22 |
URL | https://arxiv.org/abs/1910.09956v1 |
https://arxiv.org/pdf/1910.09956v1.pdf | |
PWC | https://paperswithcode.com/paper/artificial-intelligence-and-the-future-of-1 |
Repo | |
Framework | |
Single Image Blind Deblurring Using Multi-Scale Latent Structure Prior
Title | Single Image Blind Deblurring Using Multi-Scale Latent Structure Prior |
Authors | Yuanchao Bai, Huizhu Jia, Ming Jiang, Xianming Liu, Xiaodong Xie, Wen Gao |
Abstract | Blind image deblurring is a challenging problem in computer vision, which aims to restore both the blur kernel and the latent sharp image from only a blurry observation. Inspired by the prevalent self-example prior in image super-resolution, in this paper, we observe that a coarse enough image down-sampled from a blurry observation is approximately a low-resolution version of the latent sharp image. We prove this phenomenon theoretically and define the coarse enough image as a latent structure prior of the unknown sharp image. Starting from this prior, we propose to restore sharp images from the coarsest scale to the finest scale on a blurry image pyramid, and progressively update the prior image using the newly restored sharp image. These coarse-to-fine priors are referred to as Multi-Scale Latent Structures (MSLS). Leveraging the MSLS prior, our algorithm comprises two phases: 1) we first preliminarily restore sharp images in the coarse scales; 2) we then apply a refinement process in the finest scale to obtain the final deblurred image. In each scale, to achieve lower computational complexity, we alternately perform a sharp image reconstruction with fast local self-example matching, an accelerated kernel estimation with error compensation, and a fast non-blind image deblurring, instead of computing any computationally expensive non-convex priors. We further extend the proposed algorithm to solve the more challenging non-uniform blind image deblurring problem. Extensive experiments demonstrate that our algorithm achieves competitive results against the state-of-the-art methods with much faster running speed. |
Tasks | Blind Image Deblurring, Deblurring, Image Reconstruction, Image Super-Resolution, Single-Image Blind Deblurring, Super-Resolution |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04442v1 |
https://arxiv.org/pdf/1906.04442v1.pdf | |
PWC | https://paperswithcode.com/paper/single-image-blind-deblurring-using-multi |
Repo | |
Framework | |
MinneApple: A Benchmark Dataset for Apple Detection and Segmentation
Title | MinneApple: A Benchmark Dataset for Apple Detection and Segmentation |
Authors | Nicolai Häni, Pravakar Roy, Volkan Isler |
Abstract | In this work, we present a new dataset to advance the state-of-the-art in fruit detection, segmentation, and counting in orchard environments. While there has been significant recent interest in solving these problems, the lack of a unified dataset has made it difficult to compare results. We hope to enable direct comparisons by providing a large variety of high-resolution images acquired in orchards, together with human annotations of the fruit on trees. The fruits are labeled using polygonal masks for each object instance to aid in precise object detection, localization, and segmentation. Additionally, we provide data for patch-based counting of clustered fruits. Our dataset contains over 41,000 annotated object instances in 1000 images. We present a detailed overview of the dataset together with baseline performance analysis for bounding box detection, segmentation, and fruit counting as well as representative results for yield estimation. We make this dataset publicly available and host a CodaLab challenge to encourage comparison of results on a common dataset. To download the data and learn more about MinneApple please see the project website: http://rsn.cs.umn.edu/index.php/MinneApple. Up to date information is available online. |
Tasks | Object Detection |
Published | 2019-09-13 |
URL | https://arxiv.org/abs/1909.06441v2 |
https://arxiv.org/pdf/1909.06441v2.pdf | |
PWC | https://paperswithcode.com/paper/minneapple-a-benchmark-dataset-for-apple |
Repo | |
Framework | |
Some Aspects of Geometric Computer Vision for Analysing Dynamical Scenes focusing Automotive Applications
Title | Some Aspects of Geometric Computer Vision for Analysing Dynamical Scenes focusing Automotive Applications |
Authors | Volker Willert, Martin Buczko |
Abstract | This draft summarizes some basics of geometric computer vision needed to implement efficient computer vision algorithms for applications that use measurements from at least one digital camera mounted on a moving platform, with a special focus on automotive applications that process image streams taken from cameras mounted on a car. Our intention is twofold: On the one hand, we would like to introduce, in a compact way, well-known basic geometric relations that can also be found in textbooks on geometric computer vision like [1, 2]. On the other hand, we would like to share some experience about subtleties that should be taken into account in order to set up quite simple but robust and fast vision algorithms that are able to run in real time. We added a collection of literature that we found to be relevant when implementing basic algorithms like optical flow, visual odometry and structure from motion. The reader should get some feeling for how the estimates of these algorithms are interrelated, which parts of the algorithms are critical in terms of robustness, and what kind of additional assumptions can be useful to constrain the solution space of the underlying, usually non-convex, optimization problems. |
Tasks | Optical Flow Estimation, Visual Odometry |
Published | 2019-08-19 |
URL | https://arxiv.org/abs/1908.06726v1 |
https://arxiv.org/pdf/1908.06726v1.pdf | |
PWC | https://paperswithcode.com/paper/some-aspects-of-geometric-computer-vision-for |
Repo | |
Framework | |
Relationship-Aware Spatial Perception Fusion for Realistic Scene Layout Generation
Title | Relationship-Aware Spatial Perception Fusion for Realistic Scene Layout Generation |
Authors | Hongdong Zheng, Yalong Bai, Wei Zhang, Tao Mei |
Abstract | The significant progress on Generative Adversarial Networks (GANs) has made it possible to generate surprisingly realistic images of a single object based on natural language descriptions. However, controlled generation of images for multiple entities with explicit interactions is still difficult to achieve, because scene layout generation suffers heavily from the diversity of object scaling and spatial locations. In this paper, we propose a novel framework for generating realistic image layouts from textual scene graphs. In our framework, a spatial constraint module is designed to fit reasonable scaling and spatial layout of object pairs by considering the relationship between them. Moreover, a contextual fusion module is introduced for fusing pair-wise spatial information in terms of object dependency in the scene graph. By using these two modules, our proposed framework tends to generate more commonsense layouts, which is helpful for realistic image generation. Experimental results, including quantitative results, qualitative results and user studies on two different scene graph datasets, demonstrate our proposed framework’s ability to generate complex and logical layouts with multiple objects from scene graphs. |
Tasks | Image Generation |
Published | 2019-09-02 |
URL | https://arxiv.org/abs/1909.00640v2 |
https://arxiv.org/pdf/1909.00640v2.pdf | |
PWC | https://paperswithcode.com/paper/relationship-aware-spatial-perception-fusion |
Repo | |
Framework | |