Paper Group ANR 106
Online Continuous DR-Submodular Maximization with Long-Term Budget Constraints
Title | Online Continuous DR-Submodular Maximization with Long-Term Budget Constraints |
Authors | Omid Sadeghi, Maryam Fazel |
Abstract | In this paper, we study a class of online optimization problems with long-term budget constraints where the objective functions are not necessarily concave (nor convex) but they instead satisfy the Diminishing Returns (DR) property. Specifically, a sequence of monotone DR-submodular objective functions $\{f_t(x)\}_{t=1}^T$ and monotone linear budget functions $\{\langle p_t,x \rangle\}_{t=1}^T$ arrive over time and assuming a total targeted budget $B_T$, the goal is to choose points $x_t$ at each time $t\in\{1,\dots,T\}$, without knowing $f_t$ and $p_t$ on that step, to achieve sub-linear regret bound while the total budget violation $\sum_{t=1}^T \langle p_t,x_t \rangle -B_T$ is sub-linear as well. Prior work has shown that achieving sub-linear regret is impossible if the budget functions are chosen adversarially. Therefore, we modify the notion of regret by comparing the agent against a $(1-\frac{1}{e})$-approximation to the best fixed decision in hindsight which satisfies the budget constraint proportionally over any window of length $W$. We propose the Online Saddle Point Hybrid Gradient (OSPHG) algorithm to solve this class of online problems. For $W=T$, we recover the aforementioned impossibility result. However, when $W=o(T)$, we show that it is possible to obtain sub-linear bounds for both the $(1-\frac{1}{e})$-regret and the total budget violation. |
Tasks | |
Published | 2019-06-30 |
URL | https://arxiv.org/abs/1907.00316v1 |
https://arxiv.org/pdf/1907.00316v1.pdf | |
PWC | https://paperswithcode.com/paper/online-continuous-dr-submodular-maximization |
Repo | |
Framework | |
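The two quantities named in the abstract above, the total budget violation $\sum_{t=1}^T \langle p_t,x_t \rangle - B_T$ and the windowed budget condition on the comparator, can be sketched in a few lines of Python. This is an illustrative sketch only, not the OSPHG algorithm. The helper names are hypothetical, and reading "proportionally over any window of length $W$" as a per-window allowance of $(W/T)\,B_T$ is an assumption, not a definition taken from the paper.

```python
from typing import Sequence


def dot(p: Sequence[float], x: Sequence[float]) -> float:
    """Inner product <p, x> of two equal-length vectors."""
    return sum(pi * xi for pi, xi in zip(p, x))


def total_budget_violation(
    prices: Sequence[Sequence[float]],
    decisions: Sequence[Sequence[float]],
    budget: float,
) -> float:
    """Return sum_t <p_t, x_t> - B_T; a positive value means overspending."""
    return sum(dot(p, x) for p, x in zip(prices, decisions)) - budget


def satisfies_windowed_budget(
    prices: Sequence[Sequence[float]],
    x: Sequence[float],
    budget: float,
    window: int,
) -> bool:
    """Check a fixed decision x against every window of `window` rounds.

    Assumed reading (not from the paper): each window of W consecutive
    rounds may spend at most (W / T) * B_T.
    """
    horizon = len(prices)
    allowance = budget * window / horizon
    spend = [dot(p, x) for p in prices]
    return all(
        sum(spend[start:start + window]) <= allowance + 1e-12
        for start in range(horizon - window + 1)
    )
```

With `window` equal to the horizon, the check reduces to the single long-term constraint; smaller windows make the comparator more restricted, which matches the role of $W$ described in the abstract.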
Generating Adversarial Perturbation with Root Mean Square Gradient
Title | Generating Adversarial Perturbation with Root Mean Square Gradient |
Authors | Yatie Xiao, Chi-Man Pun, Jizhe Zhou |
Abstract | We focus our attention on the problem of generating adversarial perturbations based on the gradient in image classification domain |
Tasks | Image Classification |
Published | 2019-01-13 |
URL | https://arxiv.org/abs/1901.03706v5 |
https://arxiv.org/pdf/1901.03706v5.pdf | |
PWC | https://paperswithcode.com/paper/generating-adversarial-perturbation-with-root |
Repo | |
Framework | |
Streamlined Dense Video Captioning
Title | Streamlined Dense Video Captioning |
Authors | Jonghwan Mun, Linjie Yang, Zhou Ren, Ning Xu, Bohyung Han |
Abstract | Dense video captioning is an extremely challenging task since accurate and coherent description of events in a video requires holistic understanding of video contents as well as contextual reasoning of individual events. Most existing approaches handle this problem by first detecting event proposals from a video and then captioning on a subset of the proposals. As a result, the generated sentences are prone to be redundant or inconsistent since they fail to consider temporal dependency between events. To tackle this challenge, we propose a novel dense video captioning framework, which models temporal dependency across events in a video explicitly and leverages visual and linguistic context from prior events for coherent storytelling. This objective is achieved by 1) integrating an event sequence generation network to select a sequence of event proposals adaptively, and 2) feeding the sequence of event proposals to our sequential video captioning network, which is trained by reinforcement learning with two-level rewards at both event and episode levels for better context modeling. The proposed technique achieves outstanding performances on ActivityNet Captions dataset in most metrics. |
Tasks | Dense Video Captioning, Video Captioning |
Published | 2019-04-08 |
URL | http://arxiv.org/abs/1904.03870v1 |
http://arxiv.org/pdf/1904.03870v1.pdf | |
PWC | https://paperswithcode.com/paper/streamlined-dense-video-captioning |
Repo | |
Framework | |
Towards Reliable, Automated General Movement Assessment for Perinatal Stroke Screening in Infants Using Wearable Accelerometers
Title | Towards Reliable, Automated General Movement Assessment for Perinatal Stroke Screening in Infants Using Wearable Accelerometers |
Authors | Yan Gao, Yang Long, Yu Guan, Anna Basu, Jessica Baggaley, Thomas Ploetz |
Abstract | Perinatal stroke (PS) is a serious condition that, if undetected and thus untreated, often leads to life-long disability, in particular Cerebral Palsy (CP). In clinical settings, Prechtl’s General Movement Assessment (GMA) can be used to classify infant movements using a Gestalt approach, identifying infants at high risk of developing PS. Training and maintenance of assessment skills are essential and expensive for the correct use of GMA, yet many practitioners lack these skills, preventing larger-scale screening and leading to significant risks of missing opportunities for early detection and intervention for affected infants. We present an automated approach to GMA, based on body-worn accelerometers and a novel sensor data analysis method, Discriminative Pattern Discovery (DPD), that is designed to cope with scenarios where only coarse annotations of data are available for model training. We demonstrate the effectiveness of our approach in a study with 34 newborns (21 typically developing infants and 13 PS infants with abnormal movements). Our method is able to correctly recognise the trials with abnormal movements with at least the accuracy that is required by newly trained human annotators (75%), which is encouraging towards our ultimate goal of an automated PS screening system that can be used population-wide. |
Tasks | |
Published | 2019-02-21 |
URL | http://arxiv.org/abs/1902.08068v1 |
http://arxiv.org/pdf/1902.08068v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-reliable-automated-general-movement |
Repo | |
Framework | |
Analyzing and Interpreting Neural Networks for NLP: A Report on the First BlackboxNLP Workshop
Title | Analyzing and Interpreting Neural Networks for NLP: A Report on the First BlackboxNLP Workshop |
Authors | Afra Alishahi, Grzegorz Chrupała, Tal Linzen |
Abstract | The EMNLP 2018 workshop BlackboxNLP was dedicated to resources and techniques specifically developed for analyzing and understanding the inner-workings and representations acquired by neural models of language. Approaches included: systematic manipulation of input to neural networks and investigating the impact on their performance, testing whether interpretable knowledge can be decoded from intermediate representations acquired by neural networks, proposing modifications to neural network architectures to make their knowledge state or generated output more explainable, and examining the performance of networks on simplified or formal languages. Here we review a number of representative studies in each category. |
Tasks | |
Published | 2019-04-05 |
URL | http://arxiv.org/abs/1904.04063v1 |
http://arxiv.org/pdf/1904.04063v1.pdf | |
PWC | https://paperswithcode.com/paper/analyzing-and-interpreting-neural-networks |
Repo | |
Framework | |
NetTailor: Tuning the Architecture, Not Just the Weights
Title | NetTailor: Tuning the Architecture, Not Just the Weights |
Authors | Pedro Morgado, Nuno Vasconcelos |
Abstract | Real-world applications of object recognition often require the solution of multiple tasks in a single platform. Under the standard paradigm of network fine-tuning, an entirely new CNN is learned per task, and the final network size is independent of task complexity. This is wasteful, since simple tasks require smaller networks than more complex tasks, and limits the number of tasks that can be solved simultaneously. To address these problems, we propose a transfer learning procedure, denoted NetTailor, in which layers of a pre-trained CNN are used as universal blocks that can be combined with small task-specific layers to generate new networks. Besides minimizing classification error, the new network is trained to mimic the internal activations of a strong unconstrained CNN, and minimize its complexity by the combination of 1) a soft-attention mechanism over blocks and 2) complexity regularization constraints. In this way, NetTailor can adapt the network architecture, not just its weights, to the target task. Experiments show that networks adapted to simple tasks, such as character or traffic sign recognition, become significantly smaller than those adapted to hard tasks, such as fine-grained recognition. More importantly, due to the modular nature of the procedure, this reduction in network complexity is achieved without compromise of either parameter sharing across tasks, or classification accuracy. |
Tasks | Object Recognition, Traffic Sign Recognition, Transfer Learning |
Published | 2019-06-29 |
URL | https://arxiv.org/abs/1907.00274v1 |
https://arxiv.org/pdf/1907.00274v1.pdf | |
PWC | https://paperswithcode.com/paper/nettailor-tuning-the-architecture-not-just-1 |
Repo | |
Framework | |
Integrating Knowledge and Reasoning in Image Understanding
Title | Integrating Knowledge and Reasoning in Image Understanding |
Authors | Somak Aditya, Yezhou Yang, Chitta Baral |
Abstract | Deep learning based data-driven approaches have been successfully applied in various image understanding applications ranging from object recognition, semantic segmentation to visual question answering. However, the lack of knowledge integration as well as higher-level reasoning capabilities with the methods still pose a hindrance. In this work, we present a brief survey of a few representative reasoning mechanisms, knowledge integration methods and their corresponding image understanding applications developed by various groups of researchers, approaching the problem from a variety of angles. Furthermore, we discuss upon key efforts on integrating external knowledge with neural networks. Taking cues from these efforts, we conclude by discussing potential pathways to improve reasoning capabilities. |
Tasks | Object Recognition, Question Answering, Semantic Segmentation, Visual Question Answering |
Published | 2019-06-24 |
URL | https://arxiv.org/abs/1906.09954v1 |
https://arxiv.org/pdf/1906.09954v1.pdf | |
PWC | https://paperswithcode.com/paper/integrating-knowledge-and-reasoning-in-image |
Repo | |
Framework | |
A shallow residual neural network to predict the visual cortex response
Title | A shallow residual neural network to predict the visual cortex response |
Authors | Anne-Ruth José Meijer, Arnoud Visser |
Abstract | Understanding how the visual cortex of the human brain really works is still an open problem for science today. A better understanding of natural intelligence could also benefit object-recognition algorithms based on convolutional neural networks. In this paper we demonstrate the asset of using a shallow residual neural network for this task. The benefit of this approach is that earlier stages of the network can be accurately trained, which allows us to add more layers at the earlier stage. With this additional layer the prediction of the visual brain activity improves from $10.4\%$ (block 1) to $15.53\%$ (last fully connected layer). By training the network for more than 10 epochs this improvement can become even larger. |
Tasks | Object Recognition |
Published | 2019-06-27 |
URL | https://arxiv.org/abs/1906.11578v1 |
https://arxiv.org/pdf/1906.11578v1.pdf | |
PWC | https://paperswithcode.com/paper/a-shallow-residual-neural-network-to-predict |
Repo | |
Framework | |
AlignFlow: Cycle Consistent Learning from Multiple Domains via Normalizing Flows
Title | AlignFlow: Cycle Consistent Learning from Multiple Domains via Normalizing Flows |
Authors | Aditya Grover, Christopher Chute, Rui Shu, Zhangjie Cao, Stefano Ermon |
Abstract | Given datasets from multiple domains, a key challenge is to efficiently exploit these data sources for modeling a target domain. Variants of this problem have been studied in many contexts, such as cross-domain translation and domain adaptation. We propose AlignFlow, a generative modeling framework that models each domain via a normalizing flow. The use of normalizing flows allows for a) flexibility in specifying learning objectives via adversarial training, maximum likelihood estimation, or a hybrid of the two methods; and b) learning and exact inference of a shared representation in the latent space of the generative model. We derive a uniform set of conditions under which AlignFlow is marginally-consistent for the different learning objectives. Furthermore, we show that AlignFlow guarantees exact cycle consistency in mapping datapoints from a source domain to target and back to the source domain. Empirically, AlignFlow outperforms relevant baselines on image-to-image translation and unsupervised domain adaptation and can be used to simultaneously interpolate across the various domains using the learned representation. |
Tasks | Density Estimation, Domain Adaptation, Image-to-Image Translation, Unsupervised Domain Adaptation |
Published | 2019-05-30 |
URL | https://arxiv.org/abs/1905.12892v2 |
https://arxiv.org/pdf/1905.12892v2.pdf | |
PWC | https://paperswithcode.com/paper/alignflow-cycle-consistent-learning-from |
Repo | |
Framework | |
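The exact cycle consistency claimed in the abstract above follows from invertibility: if each domain is mapped to a shared latent space by an invertible function, translating from source to target and back recovers the input. The sketch below illustrates this with a toy elementwise affine map per domain. The `AffineFlow` class and `translate` helper are hypothetical stand-ins for learned normalizing flows and are not AlignFlow's architecture or code.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AffineFlow:
    """Toy invertible map between one domain and a shared latent space.

    Hypothetical stand-in for a learned normalizing flow.
    """

    scale: float  # must be non-zero so the map is invertible
    shift: float

    def to_latent(self, x: List[float]) -> List[float]:
        """Inverse direction: data point -> latent code."""
        return [(v - self.shift) / self.scale for v in x]

    def from_latent(self, z: List[float]) -> List[float]:
        """Forward direction: latent code -> data point."""
        return [v * self.scale + self.shift for v in z]


def translate(x: List[float], source: AffineFlow, target: AffineFlow) -> List[float]:
    """Map x from the source domain to the target domain via the shared latent space."""
    return target.from_latent(source.to_latent(x))
```

Because `translate(translate(x, a, b), b, a)` composes each map with its own inverse, the round trip returns `x` up to floating-point error, with no cycle-consistency loss term needed in this toy setting.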
Reducing the dilution: An analysis of the information sensitiveness of capsule network with a practical improvement method
Title | Reducing the dilution: An analysis of the information sensitiveness of capsule network with a practical improvement method |
Authors | Zonglin Yang, Xinggang Wang |
Abstract | Capsule network has shown various advantages over convolutional neural network (CNN). It keeps more precise spatial information than CNN, uses equivariance instead of invariance during inference, and has high potential to be a new effective tool for visual tasks. However, the current capsule networks have incompatible performance with CNN when facing datasets with background and complex target objects and lack a universal and efficient regularization method. We identify a main reason for the incompatible performance as the conflict between the information sensitiveness of capsule network and the unreasonably higher activation value distribution of capsules in the primary capsule layer. Correspondingly, we propose a practical improvement method that restrains the activation value of capsules in the primary capsule layer to suppress non-informative capsules and highlight discriminative capsules. In the experiments, the method has achieved better performances on various mainstream datasets. In addition, the proposed improvement method can be seen as a suitable, simple and efficient regularization method that can be generally used in capsule network. |
Tasks | |
Published | 2019-03-25 |
URL | https://arxiv.org/abs/1903.10588v3 |
https://arxiv.org/pdf/1903.10588v3.pdf | |
PWC | https://paperswithcode.com/paper/reducing-the-dilution-analysis-of-the |
Repo | |
Framework | |
Long-Duration Fully Autonomous Operation of Rotorcraft Unmanned Aerial Systems for Remote-Sensing Data Acquisition
Title | Long-Duration Fully Autonomous Operation of Rotorcraft Unmanned Aerial Systems for Remote-Sensing Data Acquisition |
Authors | Danylo Malyuta, Christian Brommer, Daniel Hentzen, Thomas Stastny, Roland Siegwart, Roland Brockers |
Abstract | Recent applications of unmanned aerial systems (UAS) to precision agriculture have shown increased ease and efficiency in data collection at precise remote locations. However, further enhancement of the field requires operation over long periods of time, e.g. days or weeks. This has so far been impractical due to the limited flight times of such platforms and the requirement of humans in the loop for operation. To overcome these limitations, we propose a fully autonomous rotorcraft UAS that is capable of performing repeated flights for long-term observation missions without any human intervention. We address two key technologies that are critical for such a system: full platform autonomy to enable mission execution independently from human operators and the ability of vision-based precision landing on a recharging station for automated energy replenishment. High-level autonomous decision making is implemented as a hierarchy of master and slave state machines. Vision-based precision landing is enabled by estimating the landing pad’s pose using a bundle of AprilTag fiducials configured for detection from a wide range of altitudes. We provide an extensive evaluation of the landing pad pose estimation accuracy as a function of the bundle’s geometry. The functionality of the complete system is demonstrated through two indoor experiments with a duration of 11 and 10.6 hours, and one outdoor experiment with a duration of 4 hours. The UAS executed 16, 48 and 22 flights respectively during these experiments. In the outdoor experiment, the ratio between flying to collect data and charging was 1 to 10, which is similar to past work in this domain. All flights were fully autonomous with no human in the loop. To the best of our knowledge, this is the first research publication about the long-term outdoor operation of a quadrotor system with no human interaction. |
Tasks | Decision Making, Pose Estimation |
Published | 2019-08-18 |
URL | https://arxiv.org/abs/1908.06381v1 |
https://arxiv.org/pdf/1908.06381v1.pdf | |
PWC | https://paperswithcode.com/paper/long-duration-fully-autonomous-operation-of |
Repo | |
Framework | |
Generalizing Energy-based Generative ConvNets from Particle Evolution Perspective
Title | Generalizing Energy-based Generative ConvNets from Particle Evolution Perspective |
Authors | Yang Wu, Xu Cai, Pengxu Wei, Guanbin Li, Liang Lin |
Abstract | Compared with Generative Adversarial Networks (GAN), Energy-Based generative Models (EBMs) possess two appealing properties: i) they can be directly optimized without requiring an auxiliary network during the learning and synthesizing; ii) they can better approximate underlying distribution of the observed data by explicitly learning potential functions. This paper studies a branch of EBMs, i.e., energy-based Generative ConvNets (GCNs), which minimize their energy function defined by a bottom-up ConvNet. From the perspective of particle physics, we solve the problem of unstable energy dissipation that might damage the quality of the synthesized samples during the maximum likelihood learning. Specifically, we firstly establish a connection between classical FRAME model [1] and dynamic physics process and generalize the GCN in discrete flow with a certain metric measure from particle perspective. To address KL-vanishing issue, we then reformulate GCN from the KL discrete flow with KL divergence measure to a Jordan-Kinderlehrer-Otto (JKO) discrete flow with Wasserstein distance metric and derive a Wasserstein GCN (wGCN). Based on these theoretical studies on GCN, we finally derive a Generalized GCN (GGCN) to further improve the model generalization and learning capability. GGCN introduces a hidden space mapping strategy by employing a normal distribution for the reference distribution to address the learning bias issue. Due to MCMC sampling in GCNs, it still suffers from a serious time-consuming issue when sampling steps increase; thus a trainable non-linear upsampling function and an amortized learning are proposed to improve the learning efficiency. Our proposed GGCN is trained in a symmetrical learning manner. Our method surpasses the existing models in both model stability and the quality of generated samples on several widely-used face and natural image datasets. |
Tasks | |
Published | 2019-10-31 |
URL | https://arxiv.org/abs/1910.14216v5 |
https://arxiv.org/pdf/1910.14216v5.pdf | |
PWC | https://paperswithcode.com/paper/generalizing-energy-based-generative-convnets |
Repo | |
Framework | |
Analyzing the Variance of Policy Gradient Estimators for the Linear-Quadratic Regulator
Title | Analyzing the Variance of Policy Gradient Estimators for the Linear-Quadratic Regulator |
Authors | James A. Preiss, Sébastien M. R. Arnold, Chen-Yu Wei, Marius Kloft |
Abstract | We study the variance of the REINFORCE policy gradient estimator in environments with continuous state and action spaces, linear dynamics, quadratic cost, and Gaussian noise. These simple environments allow us to derive bounds on the estimator variance in terms of the environment and noise parameters. We compare the predictions of our bounds to the empirical variance in simulation experiments. |
Tasks | |
Published | 2019-10-02 |
URL | https://arxiv.org/abs/1910.01249v1 |
https://arxiv.org/pdf/1910.01249v1.pdf | |
PWC | https://paperswithcode.com/paper/analyzing-the-variance-of-policy-gradient |
Repo | |
Framework | |
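To make the setting in the abstract above concrete, the following sketch estimates the empirical mean and variance of a REINFORCE gradient sample on a scalar linear-quadratic problem. It is a minimal illustration under stated assumptions, not the paper's experimental setup: the scalar dynamics $x_{t+1} = a x_t + b u_t + w_t$, the Gaussian policy $u_t \sim \mathcal{N}(k x_t, \sigma^2)$, all default parameter values, and the function names are hypothetical choices made here.

```python
import random
import statistics


def reinforce_gradient_sample(k, rng, a=0.9, b=0.5, q=1.0, r=0.1,
                              policy_std=0.5, noise_std=0.1,
                              horizon=10, x0=1.0):
    """One REINFORCE estimate of d(expected total cost)/dk.

    Assumed setup: scalar state, policy u ~ N(k * x, policy_std^2),
    stage cost q * x^2 + r * u^2, additive Gaussian process noise.
    """
    x = x0
    total_cost = 0.0
    score = 0.0  # running sum of d/dk log pi(u_t | x_t)
    for _ in range(horizon):
        u = k * x + policy_std * rng.gauss(0.0, 1.0)
        # Score function of a Gaussian policy with mean k * x.
        score += (u - k * x) * x / policy_std ** 2
        total_cost += q * x * x + r * u * u
        x = a * x + b * u + noise_std * rng.gauss(0.0, 1.0)
    return score * total_cost


def empirical_mean_and_variance(k, num_samples=2000, seed=0):
    """Sample the estimator repeatedly and summarize its spread."""
    rng = random.Random(seed)
    samples = [reinforce_gradient_sample(k, rng) for _ in range(num_samples)]
    return statistics.fmean(samples), statistics.variance(samples)
```

Calling `empirical_mean_and_variance(-0.5)` and varying `policy_std`, `noise_std`, or `horizon` gives a rough feel for how the estimator's variance depends on the environment and noise parameters, which is the relationship the paper bounds analytically.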
Challenges in Designing Datasets and Validation for Autonomous Driving
Title | Challenges in Designing Datasets and Validation for Autonomous Driving |
Authors | Michal Uricar, David Hurych, Pavel Krizek, Senthil Yogamani |
Abstract | Autonomous driving has received a lot of attention in the last decade and will remain a hot topic at least until the first successful certification of a car with Level 5 autonomy. There are many public datasets in the academic community. However, they are far away from what a robust industrial production system needs. There is a large gap between academic and industrial settings, and getting from a research prototype built on public datasets to a deployable solution is a long and challenging task. In this paper, we focus on bad practices that often occur in autonomous driving from an industrial deployment perspective. Data design deserves at least the same amount of attention as the model design. There is very little attention paid to these issues in the scientific community, and we hope this paper encourages better formalization of dataset design. More specifically, we focus on the dataset design and validation scheme for autonomous driving, where we would like to highlight the common problems, wrong assumptions, and steps towards avoiding them, as well as some open problems. |
Tasks | Autonomous Driving |
Published | 2019-01-26 |
URL | http://arxiv.org/abs/1901.09270v1 |
http://arxiv.org/pdf/1901.09270v1.pdf | |
PWC | https://paperswithcode.com/paper/challenges-in-designing-datasets-and |
Repo | |
Framework | |
WiCV 2019: The Sixth Women In Computer Vision Workshop
Title | WiCV 2019: The Sixth Women In Computer Vision Workshop |
Authors | Irene Amerini, Elena Balashova, Sayna Ebrahimi, Kathryn Leonard, Arsha Nagrani, Amaia Salvador |
Abstract | In this paper we present the Women in Computer Vision Workshop - WiCV 2019, organized in conjunction with CVPR 2019. This event is meant for increasing the visibility and inclusion of women researchers in the computer vision field. Computer vision and machine learning have made incredible progress over the past years, but the number of female researchers is still low both in academia and in industry. WiCV is organized especially for the following reasons: to raise visibility of female researchers, to increase collaborations between them, and to provide mentorship to female junior researchers in the field. In this paper, we present a report of trends over the past years, along with a summary of statistics regarding presenters, attendees, and sponsorship for the current workshop. |
Tasks | |
Published | 2019-09-23 |
URL | https://arxiv.org/abs/1909.10225v1 |
https://arxiv.org/pdf/1909.10225v1.pdf | |
PWC | https://paperswithcode.com/paper/190910225 |
Repo | |
Framework | |