Paper Group ANR 960
Interpretable Intuitive Physics Model
Title | Interpretable Intuitive Physics Model |
Authors | Tian Ye, Xiaolong Wang, James Davidson, Abhinav Gupta |
Abstract | Humans have a remarkable ability to use physical commonsense and predict the effect of collisions. But do they understand the underlying factors? Can they predict if the underlying factors have changed? Interestingly, in most cases humans can predict the effects of similar collisions with different conditions such as changes in mass, friction, etc. It is postulated this is primarily because we learn to model physics with meaningful latent variables. This does not imply we can estimate the precise values of these meaningful variables (estimate exact values of mass or friction). Inspired by this observation, we propose an interpretable intuitive physics model where specific dimensions in the bottleneck layers correspond to different physical properties. In order to demonstrate that our system models these underlying physical properties, we train our model on collisions of different shapes (cube, cone, cylinder, spheres etc.) and test on collisions of unseen combinations of shapes. Furthermore, we demonstrate our model generalizes well even when similar scenes are simulated with different underlying properties. |
Tasks | |
Published | 2018-08-29 |
URL | http://arxiv.org/abs/1808.10002v1 |
http://arxiv.org/pdf/1808.10002v1.pdf | |
PWC | https://paperswithcode.com/paper/interpretable-intuitive-physics-model |
Repo | |
Framework | |
Real-Time Stereo Vision for Road Surface 3-D Reconstruction
Title | Real-Time Stereo Vision for Road Surface 3-D Reconstruction |
Authors | Rui Fan, Yanan Liu, Xingrui Yang, Mohammud Junaid Bocus, Naim Dahnoun, Scott Tancock |
Abstract | Stereo vision techniques have been widely used in civil engineering to acquire 3-D road data. The two important factors of stereo vision are accuracy and speed. However, it is very challenging to achieve both of them simultaneously and therefore the main aim of developing a stereo vision system is to improve the trade-off between these two factors. In this paper, we present a real-time stereo vision system used for road surface 3-D reconstruction. The proposed system is developed from our previously published 3-D reconstruction algorithm where the perspective view of the target image is first transformed into the reference view, which not only increases the disparity accuracy but also improves the processing speed. Then, the correlation cost between each pair of blocks is computed and stored in two 3-D cost volumes. To adaptively aggregate the matching costs from neighbourhood systems, bilateral filtering is performed on the cost volumes. This greatly reduces the ambiguities during stereo matching and further improves the precision of the estimated disparities. Finally, the subpixel resolution is achieved by conducting a parabola interpolation and the subpixel disparity map is used to reconstruct the 3-D road surface. The proposed algorithm is implemented on an NVIDIA GTX 1080 GPU for the real-time purpose. The experimental results illustrate that the reconstruction accuracy is around 3 mm. |
Tasks | Stereo Matching, Stereo Matching Hand |
Published | 2018-07-18 |
URL | http://arxiv.org/abs/1807.07433v2 |
http://arxiv.org/pdf/1807.07433v2.pdf | |
PWC | https://paperswithcode.com/paper/real-time-stereo-vision-for-road-surface-3-d |
Repo | |
Framework | |
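The parabola interpolation step mentioned in the abstract above can be illustrated with a short sketch. This is not the authors' GPU implementation; it only shows the standard three-point parabola fit, assuming a hypothetical list `costs` of matching costs indexed by integer disparity and an index `d` of the best integer disparity.

```python
def subpixel_disparity(costs, d):
    """Refine the integer disparity d by fitting a parabola through the
    costs at d-1, d, d+1 and returning the position of its vertex.

    `costs` and `d` are illustrative names, not taken from the paper.
    """
    if d <= 0 or d >= len(costs) - 1:
        return float(d)  # a neighbour is missing; keep the integer value
    c_prev, c_best, c_next = costs[d - 1], costs[d], costs[d + 1]
    denom = c_prev - 2.0 * c_best + c_next
    if denom == 0:
        return float(d)  # flat neighbourhood; the parabola is degenerate
    # Vertex of the parabola through (-1, c_prev), (0, c_best), (1, c_next)
    return d + 0.5 * (c_prev - c_next) / denom


# Usage: costs sampled from a parabola whose extremum lies at 2.3
costs = [(x - 2.3) ** 2 for x in range(5)]
print(subpixel_disparity(costs, 2))
```

The same vertex formula applies whether the best cost is a minimum (a dissimilarity measure) or a maximum (a correlation score), since it only locates the extremum of the fitted parabola.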
Robustness to fundamental uncertainty in AGI alignment
Title | Robustness to fundamental uncertainty in AGI alignment |
Authors | G Gordon Worley III |
Abstract | The AGI alignment problem has a bimodal distribution of outcomes with most outcomes clustering around the poles of total success and existential, catastrophic failure. Consequently, attempts to solve AGI alignment should, all else equal, prefer false negatives (ignoring research programs that would have been successful) to false positives (pursuing research programs that will unexpectedly fail). Thus, we propose adopting a policy of responding to points of philosophical and practical uncertainty associated with the alignment problem by limiting and choosing necessary assumptions to reduce the risk of false positives. Herein we explore in detail two relevant points of uncertainty that AGI alignment research hinges on—meta-ethical uncertainty and uncertainty about mental phenomena—and show how to reduce false positives in response to them. |
Tasks | |
Published | 2018-07-25 |
URL | https://arxiv.org/abs/1807.09836v2 |
https://arxiv.org/pdf/1807.09836v2.pdf | |
PWC | https://paperswithcode.com/paper/robustness-to-fundamental-uncertainty-in-agi |
Repo | |
Framework | |
Practical Deep Stereo (PDS): Toward applications-friendly deep stereo matching
Title | Practical Deep Stereo (PDS): Toward applications-friendly deep stereo matching |
Authors | Stepan Tulyakov, Anton Ivanov, Francois Fleuret |
Abstract | End-to-end deep-learning networks recently demonstrated extremely good performance for stereo matching. However, existing networks are difficult to use for practical applications since (1) they are memory-hungry and unable to process even modest-size images, and (2) they have to be trained for a given disparity range. The Practical Deep Stereo (PDS) network that we propose addresses both issues: First, its architecture relies on novel bottleneck modules that drastically reduce the memory footprint in inference, and additional design choices allow it to handle larger images during training. This results in a model that leverages large image context to resolve matching ambiguities. Second, a novel sub-pixel cross-entropy loss combined with a MAP estimator makes this network less sensitive to ambiguous matches, and applicable to any disparity range without re-training. We compare PDS to state-of-the-art methods published over the recent months, and demonstrate its superior performance on FlyingThings3D and KITTI sets. |
Tasks | Stereo Matching, Stereo Matching Hand |
Published | 2018-06-05 |
URL | http://arxiv.org/abs/1806.01677v1 |
http://arxiv.org/pdf/1806.01677v1.pdf | |
PWC | https://paperswithcode.com/paper/practical-deep-stereo-pds-toward-applications |
Repo | |
Framework | |
Monocular Depth Estimation with Augmented Ordinal Depth Relationships
Title | Monocular Depth Estimation with Augmented Ordinal Depth Relationships |
Authors | Yuanzhouhan Cao, Tianqi Zhao, Ke Xian, Chunhua Shen, Zhiguo Cao, Shugong Xu |
Abstract | Most existing algorithms for depth estimation from single monocular images need large quantities of metric groundtruth depths for supervised learning. We show that relative depth can be an informative cue for metric depth estimation and can be easily obtained from vast stereo videos. Acquiring metric depths from stereo videos is sometimes impracticable due to the absence of camera parameters. In this paper, we propose to improve the performance of metric depth estimation with relative depths collected from stereo movie videos using existing stereo matching algorithm. We introduce a new “Relative Depth in Stereo” (RDIS) dataset densely labelled with relative depths. We first pretrain a ResNet model on our RDIS dataset. Then we finetune the model on RGB-D datasets with metric ground-truth depths. During our finetuning, we formulate depth estimation as a classification task. This re-formulation scheme enables us to obtain the confidence of a depth prediction in the form of probability distribution. With this confidence, we propose an information gain loss to make use of the predictions that are close to ground-truth. We evaluate our approach on both indoor and outdoor benchmark RGB-D datasets and achieve state-of-the-art performance. |
Tasks | Depth Estimation, Monocular Depth Estimation, Stereo Matching, Stereo Matching Hand |
Published | 2018-06-02 |
URL | https://arxiv.org/abs/1806.00585v2 |
https://arxiv.org/pdf/1806.00585v2.pdf | |
PWC | https://paperswithcode.com/paper/monocular-depth-estimation-with-augmented |
Repo | |
Framework | |
Learning to Caption Images through a Lifetime by Asking Questions
Title | Learning to Caption Images through a Lifetime by Asking Questions |
Authors | Kevin Shen, Amlan Kar, Sanja Fidler |
Abstract | In order to bring artificial agents into our lives, we will need to go beyond supervised learning on closed datasets to having the ability to continuously expand knowledge. Inspired by a student learning in a classroom, we present an agent that can continuously learn by posing natural language questions to humans. Our agent is composed of three interacting modules, one that performs captioning, another that generates questions and a decision maker that learns when to ask questions by implicitly reasoning about the uncertainty of the agent and expertise of the teacher. As compared to current active learning methods which query images for full captions, our agent is able to ask pointed questions to improve the generated captions. The agent trains on the improved captions, expanding its knowledge. We show that our approach achieves better performance using less human supervision than the baselines on the challenging MSCOCO dataset. |
Tasks | Active Learning, Image Captioning |
Published | 2018-12-01 |
URL | http://arxiv.org/abs/1812.00235v3 |
http://arxiv.org/pdf/1812.00235v3.pdf | |
PWC | https://paperswithcode.com/paper/lifelong-learning-for-image-captioning-by |
Repo | |
Framework | |
Exploiting local and global performance of candidate systems for aggregation of summarization techniques
Title | Exploiting local and global performance of candidate systems for aggregation of summarization techniques |
Authors | Parth Mehta, Prasenjit Majumder |
Abstract | With an ever growing number of extractive summarization techniques being proposed, there is less clarity than ever about how good each system is compared to the rest. Several studies highlight the variance in performance of these systems with change in datasets or even across documents within the same corpus. An effective way to counter this variance and to make the systems more robust could be to use inputs from multiple systems when generating a summary. In the present work, we define a novel way of creating such an ensemble by exploiting similarity between the content of candidate summaries to estimate their reliability. We define GlobalRank which captures the performance of a candidate system on an overall corpus and LocalRank which estimates its performance on a given document cluster. We then use these two scores to assign a weight to each individual system, which is then used to generate the new aggregate ranking. Experiments on the DUC 2003 and DUC 2004 datasets show a significant improvement in terms of ROUGE score over existing state-of-the-art techniques. |
Tasks | |
Published | 2018-09-07 |
URL | http://arxiv.org/abs/1809.02343v1 |
http://arxiv.org/pdf/1809.02343v1.pdf | |
PWC | https://paperswithcode.com/paper/exploiting-local-and-global-performance-of |
Repo | |
Framework | |
What do character-level models learn about morphology? The case of dependency parsing
Title | What do character-level models learn about morphology? The case of dependency parsing |
Authors | Clara Vania, Andreas Grivas, Adam Lopez |
Abstract | When parsing morphologically-rich languages with neural models, it is beneficial to model input at the character level, and it has been claimed that this is because character-level models learn morphology. We test these claims by comparing character-level models to an oracle with access to explicit morphological analysis on twelve languages with varying morphological typologies. Our results highlight many strengths of character-level models, but also show that they are poor at disambiguating some words, particularly in the face of case syncretism. We then demonstrate that explicitly modeling morphological case improves our best model, showing that character-level models can benefit from targeted forms of explicit morphological modeling. |
Tasks | Dependency Parsing, Morphological Analysis |
Published | 2018-08-28 |
URL | http://arxiv.org/abs/1808.09180v1 |
http://arxiv.org/pdf/1808.09180v1.pdf | |
PWC | https://paperswithcode.com/paper/what-do-character-level-models-learn-about |
Repo | |
Framework | |
Towards Dependable Deep Convolutional Neural Networks (CNNs) with Out-distribution Learning
Title | Towards Dependable Deep Convolutional Neural Networks (CNNs) with Out-distribution Learning |
Authors | Mahdieh Abbasi, Arezoo Rajabi, Christian Gagné, Rakesh B. Bobba |
Abstract | Detection and rejection of adversarial examples in security sensitive and safety-critical systems using deep CNNs is essential. In this paper, we propose an approach to augment CNNs with out-distribution learning in order to reduce misclassification rate by rejecting adversarial examples. We empirically show that our augmented CNNs can either reject or classify correctly most adversarial examples generated using well-known methods ( >95% for MNIST and >75% for CIFAR-10 on average). Furthermore, we achieve this without requiring to train using any specific type of adversarial examples and without sacrificing the accuracy of models on clean samples significantly (< 4%). |
Tasks | |
Published | 2018-04-24 |
URL | http://arxiv.org/abs/1804.08794v2 |
http://arxiv.org/pdf/1804.08794v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-dependable-deep-convolutional-neural |
Repo | |
Framework | |
Smart Inverter Grid Probing for Learning Loads: Part I - Identifiability Analysis
Title | Smart Inverter Grid Probing for Learning Loads: Part I - Identifiability Analysis |
Authors | Siddharth Bhela, Vassilis Kekatos, Sriharsha Veeramachaneni |
Abstract | Distribution grids currently lack comprehensive real-time metering. Nevertheless, grid operators require precise knowledge of loads and renewable generation to accomplish any feeder optimization task. At the same time, new grid technologies, such as solar photovoltaics and energy storage units, are interfaced via inverters with advanced sensing and actuation capabilities. In this context, this two-part work puts forth the idea of engaging power electronics to probe an electric grid and record its voltage response at actuated and metered buses, to infer non-metered loads. Probing can be accomplished by commanding inverters to momentarily perturb their power injections. Multiple probing actions can be induced within a few tens of seconds. In Part I, load inference via grid probing is formulated as an implicit nonlinear system identification task, which is shown to be topologically observable under certain conditions. The conditions can be readily checked upon solving a max-flow problem on a bipartite graph derived from the feeder topology and the placement of actuated and non-metered buses. The analysis holds for single- and multi-phase grids, radial or meshed, and applies to phasor or magnitude-only voltage data. The topological observability of distribution systems using smart meter or phasor data is cast and analyzed as a special case. |
Tasks | |
Published | 2018-06-22 |
URL | http://arxiv.org/abs/1806.08834v2 |
http://arxiv.org/pdf/1806.08834v2.pdf | |
PWC | https://paperswithcode.com/paper/smart-inverter-grid-probing-for-learning |
Repo | |
Framework | |
Stochastic Variational Optimization
Title | Stochastic Variational Optimization |
Authors | Thomas Bird, Julius Kunze, David Barber |
Abstract | Variational Optimization forms a differentiable upper bound on an objective. We show that approaches such as Natural Evolution Strategies and Gaussian Perturbation, are special cases of Variational Optimization in which the expectations are approximated by Gaussian sampling. These approaches are of particular interest because they are parallelizable. We calculate the approximate bias and variance of the corresponding gradient estimators and demonstrate that using antithetic sampling or a baseline is crucial to mitigate their problems. We contrast these methods with an alternative parallelizable method, namely Directional Derivatives. We conclude that, for differentiable objectives, using Directional Derivatives is preferable to using Variational Optimization to perform parallel Stochastic Gradient Descent. |
Tasks | |
Published | 2018-09-13 |
URL | http://arxiv.org/abs/1809.04855v1 |
http://arxiv.org/pdf/1809.04855v1.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-variational-optimization |
Repo | |
Framework | |
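The abstract above singles out antithetic sampling as a way to tame the Gaussian-sampling gradient estimators. The following is a minimal sketch of that idea, not the authors' code: it estimates the gradient of the smoothed objective E[f(mu + sigma * eps)] with respect to mu using mirrored perturbation pairs. The function name, its parameters, and the quadratic test objective are all illustrative assumptions.

```python
import random


def antithetic_gradient(f, mu, sigma=0.1, num_pairs=100, rng=None):
    """Estimate the gradient of E[f(mu + sigma * eps)], eps ~ N(0, I),
    with respect to mu, using antithetic (mirrored) perturbation pairs."""
    if rng is None:
        rng = random.Random(0)
    grad = [0.0] * len(mu)
    for _ in range(num_pairs):
        eps = [rng.gauss(0.0, 1.0) for _ in mu]
        # Evaluate the objective at the perturbation and at its mirror image
        plus = f([m + sigma * e for m, e in zip(mu, eps)])
        minus = f([m - sigma * e for m, e in zip(mu, eps)])
        scale = (plus - minus) / (2.0 * sigma * num_pairs)
        for i, e in enumerate(eps):
            grad[i] += scale * e
    return grad


# Usage: for f(x) = sum(x_i^2) the true gradient at mu is 2 * mu
def quadratic(x):
    return sum(v * v for v in x)


print(antithetic_gradient(quadratic, [1.0, -2.0], num_pairs=20000))
```

Because each pair evaluates `f` at both `mu + sigma * eps` and `mu - sigma * eps`, any constant offset in `f` cancels exactly in the difference, which is one reason the mirrored estimator tends to have lower variance than the single-sample form. Every pair is independent of the others, so the loop can be distributed across workers.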
Using Machine Learning Safely in Automotive Software: An Assessment and Adaption of Software Process Requirements in ISO 26262
Title | Using Machine Learning Safely in Automotive Software: An Assessment and Adaption of Software Process Requirements in ISO 26262 |
Authors | Rick Salay, Krzysztof Czarnecki |
Abstract | The use of machine learning (ML) is on the rise in many sectors of software development, and automotive software development is no different. In particular, Advanced Driver Assistance Systems (ADAS) and Automated Driving Systems (ADS) are two areas where ML plays a significant role. In automotive development, safety is a critical objective, and the emergence of standards such as ISO 26262 has helped focus industry practices to address safety in a systematic and consistent way. Unfortunately, these standards were not designed to accommodate technologies such as ML or the type of functionality that is provided by an ADS and this has created a conflict between the need to innovate and the need to improve safety. In this report, we take steps to address this conflict by doing a detailed assessment and adaption of ISO 26262 for ML, specifically in the context of supervised learning. First we analyze the key factors that are the source of the conflict. Then we assess each software development process requirement (Part 6 of ISO 26262) for applicability to ML. Where there are gaps, we propose new requirements to address the gaps. Finally we discuss the application of this adapted and extended variant of Part 6 to ML development scenarios. |
Tasks | |
Published | 2018-08-05 |
URL | http://arxiv.org/abs/1808.01614v1 |
http://arxiv.org/pdf/1808.01614v1.pdf | |
PWC | https://paperswithcode.com/paper/using-machine-learning-safely-in-automotive |
Repo | |
Framework | |
Do Less, Get More: Streaming Submodular Maximization with Subsampling
Title | Do Less, Get More: Streaming Submodular Maximization with Subsampling |
Authors | Moran Feldman, Amin Karbasi, Ehsan Kazemi |
Abstract | In this paper, we develop the first one-pass streaming algorithm for submodular maximization that does not evaluate the entire stream even once. By carefully subsampling each element of the data stream, our algorithm enjoys the tightest approximation guarantees in various settings while having the smallest memory footprint and requiring the lowest number of function evaluations. More specifically, for a monotone submodular function and a $p$-matchoid constraint, our randomized algorithm achieves a $4p$ approximation ratio (in expectation) with $O(k)$ memory and $O(km/p)$ queries per element ($k$ is the size of the largest feasible solution and $m$ is the number of matroids used to define the constraint). For the non-monotone case, our approximation ratio increases only slightly to $4p+2-o(1)$. To the best of our knowledge, our algorithm is the first that combines the benefits of streaming and subsampling in a novel way in order to truly scale submodular maximization to massive machine learning problems. To showcase its practicality, we empirically evaluated the performance of our algorithm on a video summarization application and observed that it outperforms the state-of-the-art algorithm by up to fifty fold, while maintaining practically the same utility. |
Tasks | Video Summarization |
Published | 2018-02-20 |
URL | http://arxiv.org/abs/1802.07098v1 |
http://arxiv.org/pdf/1802.07098v1.pdf | |
PWC | https://paperswithcode.com/paper/do-less-get-more-streaming-submodular |
Repo | |
Framework | |
DARKMENTION: A Deployed System to Predict Enterprise-Targeted External Cyberattacks
Title | DARKMENTION: A Deployed System to Predict Enterprise-Targeted External Cyberattacks |
Authors | Mohammed Almukaynizi, Ericsson Marin, Eric Nunes, Paulo Shakarian, Gerardo I. Simari, Dipsy Kapoor, Timothy Siedlecki |
Abstract | Recent incidents of data breaches call for organizations to proactively identify cyber attacks on their systems. Darkweb/Deepweb (D2web) forums and marketplaces provide environments where hackers anonymously discuss existing vulnerabilities and commercialize malicious software to exploit those vulnerabilities. These platforms offer security practitioners a threat intelligence environment that allows them to mine for patterns related to organization-targeted cyber attacks. In this paper, we describe a system (called DARKMENTION) that learns association rules correlating indicators of attacks from D2web to real-world cyber incidents. Using the learned rules, DARKMENTION generates and submits warnings to a Security Operations Center (SOC) prior to attacks. Our goal was to design a system that automatically generates enterprise-targeted warnings that are timely, actionable, accurate, and transparent. We show that DARKMENTION meets our goal. In particular, we show that it outperforms baseline systems that attempt to generate warnings of cyber attacks related to two enterprises with an average increase in F1 score of about 45% and 57%. Additionally, DARKMENTION was deployed as part of a larger system that is built under a contract with the IARPA Cyber-attack Automated Unconventional Sensor Environment (CAUSE) program. It is actively producing warnings that precede attacks by an average of 3 days. |
Tasks | |
Published | 2018-10-30 |
URL | http://arxiv.org/abs/1810.12492v1 |
http://arxiv.org/pdf/1810.12492v1.pdf | |
PWC | https://paperswithcode.com/paper/darkmention-a-deployed-system-to-predict |
Repo | |
Framework | |
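The abstract above describes learning association rules that link D2web indicators to attacks. As a rough, generic illustration of how such a rule can be scored (not DARKMENTION's actual rule-learning procedure, which the abstract does not specify), the sketch below computes the classic support and confidence of a rule `indicator -> attack`. The record layout and indicator names are hypothetical.

```python
def rule_stats(records, indicator):
    """Return (support, confidence) for the rule `indicator -> attack`.

    records: list of (set_of_indicators, attack_observed) pairs, one per
    time window. This layout is an assumption made for illustration.
    """
    total = len(records)
    # Outcomes of all windows in which the indicator was present
    outcomes = [attacked for indicators, attacked in records if indicator in indicators]
    both = sum(1 for attacked in outcomes if attacked)
    support = both / total if total else 0.0
    confidence = both / len(outcomes) if outcomes else 0.0
    return support, confidence


# Usage with made-up windows and indicator names
records = [
    ({"exploit_mention"}, True),
    ({"exploit_mention"}, False),
    ({"vendor_mention"}, True),
    ({"exploit_mention", "vendor_mention"}, True),
]
print(rule_stats(records, "exploit_mention"))
```

A rule with high confidence but very low support fires rarely, so both numbers are usually thresholded together when deciding which rules should generate warnings.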
Multi-Task Learning for Domain-General Spoken Disfluency Detection in Dialogue Systems
Title | Multi-Task Learning for Domain-General Spoken Disfluency Detection in Dialogue Systems |
Authors | Igor Shalyminov, Arash Eshghi, Oliver Lemon |
Abstract | Spontaneous spoken dialogue is often disfluent, containing pauses, hesitations, self-corrections and false starts. Processing such phenomena is essential in understanding a speaker’s intended meaning and controlling the flow of the conversation. Furthermore, this processing needs to be word-by-word incremental to allow further downstream processing to begin as early as possible in order to handle real spontaneous human conversational behaviour. In addition, from a developer’s point of view, it is highly desirable to be able to develop systems which can be trained from ‘clean’ examples while also able to generalise to the very diverse disfluent variations on the same data -- thereby enhancing both data-efficiency and robustness. In this paper, we present a multi-task LSTM-based model for incremental detection of disfluency structure, which can be hooked up to any component for incremental interpretation (e.g. an incremental semantic parser), or else simply used to ‘clean up’ the current utterance as it is being produced. We train the system on the Switchboard Dialogue Acts (SWDA) corpus and present its accuracy on this dataset. Our model outperforms prior neural network-based incremental approaches by about 10 percentage points on SWDA while employing a simpler architecture. To test the model’s generalisation potential, we evaluate the same model on the bAbI+ dataset, without any additional training. bAbI+ is a dataset of synthesised goal-oriented dialogues where we control the distribution of disfluencies and their types. This shows that our approach has good generalisation potential, and sheds more light on which types of disfluency might be amenable to domain-general processing. |
Tasks | Multi-Task Learning |
Published | 2018-10-08 |
URL | http://arxiv.org/abs/1810.03352v1 |
http://arxiv.org/pdf/1810.03352v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-task-learning-for-domain-general-spoken |
Repo | |
Framework | |