Paper Group ANR 504
Unsupervised Discovery of Parts, Structure, and Dynamics. Stratum: A Serverless Framework for Lifecycle Management of Machine Learning based Data Analytics Tasks. A Multi-Domain Feature Learning Method for Visual Place Recognition. Sampling Humans for Optimizing Preferences in Coloring Artwork. INTERACTION Dataset: An INTERnational, Adversarial and …
Unsupervised Discovery of Parts, Structure, and Dynamics
Title | Unsupervised Discovery of Parts, Structure, and Dynamics |
Authors | Zhenjia Xu, Zhijian Liu, Chen Sun, Kevin Murphy, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu |
Abstract | Humans easily recognize object parts and their hierarchical structure by watching how they move; they can then predict how each part moves in the future. In this paper, we propose a novel formulation that simultaneously learns a hierarchical, disentangled object representation and a dynamics model for object parts from unlabeled videos. Our Parts, Structure, and Dynamics (PSD) model learns to, first, recognize the object parts via a layered image representation; second, predict hierarchy via a structural descriptor that composes low-level concepts into a hierarchical structure; and third, model the system dynamics by predicting the future. Experiments on multiple real and synthetic datasets demonstrate that our PSD model works well on all three tasks: segmenting object parts, building their hierarchical structure, and capturing their motion distributions. |
Tasks | |
Published | 2019-03-12 |
URL | http://arxiv.org/abs/1903.05136v1 |
http://arxiv.org/pdf/1903.05136v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-discovery-of-parts-structure-and |
Repo | |
Framework | |
Stratum: A Serverless Framework for Lifecycle Management of Machine Learning based Data Analytics Tasks
Title | Stratum: A Serverless Framework for Lifecycle Management of Machine Learning based Data Analytics Tasks |
Authors | Anirban Bhattacharjee, Yogesh Barve, Shweta Khare, Shunxing Bao, Aniruddha Gokhale, Thomas Damiano |
Abstract | With the proliferation of machine learning (ML) libraries and frameworks, and the programming languages that they use, along with operations of data loading, transformation, preparation and mining, ML model development is becoming a daunting task. Furthermore, with a plethora of cloud-based ML model development platforms, heterogeneity in hardware, increased focus on exploiting edge computing resources for low-latency prediction serving and often a lack of a complete understanding of resources required to execute ML workflows efficiently, ML model deployment demands expertise for managing the lifecycle of ML workflows efficiently and with minimal cost. To address these challenges, we propose Stratum, an end-to-end serverless data analytics platform. Stratum can deploy, schedule and dynamically manage data ingestion tools, live streaming apps, batch analytics tools, ML-as-a-service (for inference jobs), and visualization tools across the cloud-fog-edge spectrum. This paper describes the Stratum architecture highlighting the problems it resolves. |
Tasks | |
Published | 2019-04-03 |
URL | http://arxiv.org/abs/1904.01727v1 |
http://arxiv.org/pdf/1904.01727v1.pdf | |
PWC | https://paperswithcode.com/paper/stratum-a-serverless-framework-for-lifecycle |
Repo | |
Framework | |
A Multi-Domain Feature Learning Method for Visual Place Recognition
Title | A Multi-Domain Feature Learning Method for Visual Place Recognition |
Authors | Peng Yin, Lingyun Xu, Xueqian Li, Chen Yin, Yingli Li, Rangaprasad Arun Srivatsan, Lu Li, Jianmin Ji, Yuqing He |
Abstract | Visual Place Recognition (VPR) is an important component in both computer vision and robotics applications, thanks to its ability to determine whether a place has been visited and where specifically. A major challenge in VPR is to handle changes of environmental conditions including weather, season and illumination. Most VPR methods try to improve the place recognition performance by ignoring the environmental factors, leading to decreased accuracy when environmental conditions change significantly, such as day versus night. To this end, we propose an end-to-end conditional visual place recognition method. Specifically, we introduce the multi-domain feature learning method (MDFL) to capture multiple attribute-descriptions for a given place, and then use a feature detaching module to separate the environmental condition-related features from those that are not. The only label required within this feature learning pipeline is the environmental condition. Evaluation of the proposed method is conducted on the multi-season NORDLAND dataset, and the multi-weather GTAV dataset. Experimental results show that our method improves the feature robustness against variant environmental conditions. |
Tasks | Visual Place Recognition |
Published | 2019-02-26 |
URL | http://arxiv.org/abs/1902.10058v1 |
http://arxiv.org/pdf/1902.10058v1.pdf | |
PWC | https://paperswithcode.com/paper/a-multi-domain-feature-learning-method-for |
Repo | |
Framework | |
Sampling Humans for Optimizing Preferences in Coloring Artwork
Title | Sampling Humans for Optimizing Preferences in Coloring Artwork |
Authors | Michael McCourt, Ian Dewancker |
Abstract | Many circumstances of practical importance have performance or success metrics which exist implicitly—in the eye of the beholder, so to speak. Tuning aspects of such problems requires working without defined metrics and only considering pairwise comparisons or rankings. In this paper, we review an existing Bayesian optimization strategy for determining most-preferred outcomes, and identify an adaptation to allow it to handle ties. We then discuss some of the issues we have encountered when humans use this optimization strategy to optimize coloring a piece of abstract artwork. We hope that, by participating in this workshop, we can learn how other researchers encounter difficulties unique to working with humans in the loop. |
Tasks | |
Published | 2019-06-10 |
URL | https://arxiv.org/abs/1906.03813v1 |
https://arxiv.org/pdf/1906.03813v1.pdf | |
PWC | https://paperswithcode.com/paper/sampling-humans-for-optimizing-preferences-in |
Repo | |
Framework | |
INTERACTION Dataset: An INTERnational, Adversarial and Cooperative moTION Dataset in Interactive Driving Scenarios with Semantic Maps
Title | INTERACTION Dataset: An INTERnational, Adversarial and Cooperative moTION Dataset in Interactive Driving Scenarios with Semantic Maps |
Authors | Wei Zhan, Liting Sun, Di Wang, Haojie Shi, Aubrey Clausse, Maximilian Naumann, Julius Kummerle, Hendrik Konigshof, Christoph Stiller, Arnaud de La Fortelle, Masayoshi Tomizuka |
Abstract | Behavior-related research areas such as motion prediction/planning, representation/imitation learning, behavior modeling/generation, and algorithm testing, require support from high-quality motion datasets containing interactive driving scenarios with different driving cultures. In this paper, we present an INTERnational, Adversarial and Cooperative moTION dataset (INTERACTION dataset) in interactive driving scenarios with semantic maps. Five features of the dataset are highlighted. 1) The interactive driving scenarios are diverse, including urban/highway/ramp merging and lane changes, roundabouts with yield/stop signs, signalized intersections, intersections with one/two/all-way stops, etc. 2) Motion data from different countries and different continents are collected so that driving preferences and styles in different cultures are naturally included. 3) The driving behavior is highly interactive and complex with adversarial and cooperative motions of various traffic participants. Highly complex behavior such as negotiations, aggressive/irrational decisions and traffic rule violations are densely contained in the dataset, while regular behavior can also be found from cautious car-following, stop, left/right/U-turn to rational lane-change and cycling and pedestrian crossing, etc. 4) The levels of criticality span a wide range, from regular safe operations to dangerous, near-collision maneuvers. Real collisions, although relatively slight, are also included. 5) Maps with complete semantic information are provided with physical layers, reference lines, lanelet connections and traffic rules. The data is recorded from drones and traffic cameras. Statistics of the dataset in terms of number of entities and interaction density are also provided, along with some utilization examples in a variety of behavior-related research areas. The dataset can be downloaded via https://interaction-dataset.com. |
Tasks | Imitation Learning, Motion Prediction |
Published | 2019-09-30 |
URL | https://arxiv.org/abs/1910.03088v1 |
https://arxiv.org/pdf/1910.03088v1.pdf | |
PWC | https://paperswithcode.com/paper/interaction-dataset-an-international |
Repo | |
Framework | |
Attention Filtering for Multi-person Spatiotemporal Action Detection on Deep Two-Stream CNN Architectures
Title | Attention Filtering for Multi-person Spatiotemporal Action Detection on Deep Two-Stream CNN Architectures |
Authors | João Antunes, Pedro Abreu, Alexandre Bernardino, Asim Smailagic, Daniel Siewiorek |
Abstract | Action detection and recognition tasks have been the target of much focus in the computer vision community due to their many applications, namely, security, robotics and recommendation systems. Recently, datasets like AVA provide multi-person, multi-label, spatiotemporal action detection and recognition challenges. Being unable to discern which portions of the input to use for classification is a limitation of two-stream CNN approaches when the vision task involves several people with several labels. We address this limitation and improve the state-of-the-art performance of two-stream CNNs. In this paper we present four contributions: our fovea attention filtering that highlights targets for classification without discarding background; a generalized binary loss function designed for the AVA dataset; miniAVA, a partition of AVA that maintains temporal continuity and class distribution with only one tenth of the dataset size; and ablation studies on alternative attention filters. Our method, using fovea attention filtering and our generalized binary loss, achieves a relative video mAP improvement of 20% over the two-stream baseline in AVA, and is competitive with the state of the art on UCF101-24. We also show a relative video mAP improvement of 12.6% when using our generalized binary loss over the standard sum-of-sigmoids. |
Tasks | Action Detection, Recommendation Systems |
Published | 2019-07-21 |
URL | https://arxiv.org/abs/1907.12919v1 |
https://arxiv.org/pdf/1907.12919v1.pdf | |
PWC | https://paperswithcode.com/paper/attention-filtering-for-multi-person |
Repo | |
Framework | |
Object-Driven Multi-Layer Scene Decomposition From a Single Image
Title | Object-Driven Multi-Layer Scene Decomposition From a Single Image |
Authors | Helisa Dhamo, Nassir Navab, Federico Tombari |
Abstract | We present a method that tackles the challenge of predicting color and depth behind the visible content of an image. Our approach aims at building up a Layered Depth Image (LDI) from a single RGB input, which is an efficient representation that arranges the scene in layers, including originally occluded regions. Unlike previous work, we enable an adaptive scheme for the number of layers and incorporate semantic encoding for better hallucination of partly occluded objects. Additionally, our approach is object-driven, which especially boosts the accuracy for the occluded intermediate objects. The framework consists of two steps. First, we individually complete each object in terms of color and depth, while estimating the scene layout. Second, we rebuild the scene based on the regressed layers and enforce the recomposed image to resemble the structure of the original input. The learned representation enables various applications, such as 3D photography and diminished reality, all from a single RGB image. |
Tasks | |
Published | 2019-08-26 |
URL | https://arxiv.org/abs/1908.09521v1 |
https://arxiv.org/pdf/1908.09521v1.pdf | |
PWC | https://paperswithcode.com/paper/object-driven-multi-layer-scene-decomposition |
Repo | |
Framework | |
Distributed Policy Learning Based Random Access for Diversified QoS Requirements
Title | Distributed Policy Learning Based Random Access for Diversified QoS Requirements |
Authors | Zhiyuan Jiang, Sheng Zhou, Zhisheng Niu |
Abstract | Future wireless access networks need to support diversified quality of service (QoS) metrics required by various types of Internet-of-Things (IoT) devices, e.g., age of information (AoI) for status generating sources and ultra low latency for safety information in vehicular networks. In this paper, a novel inner-state driven random access (ISDA) framework is proposed based on distributed policy learning, in particular a cross-entropy method. Conventional random access schemes, e.g., $p$-CSMA, assume state-less terminals and thus assign equal priorities to all. In ISDA, the inner-states of terminals are described by a time-varying state vector, and the transmission probabilities of terminals in the contention period are determined by their respective inner-states. Neural networks are leveraged to approximate the function mappings from inner-states to transmission probabilities, and an iterative approach is adopted to improve these mappings in a distributed manner. Experimental results show that ISDA can improve the QoS of heterogeneous terminals simultaneously compared to conventional CSMA schemes. |
Tasks | |
Published | 2019-03-06 |
URL | http://arxiv.org/abs/1903.02242v1 |
http://arxiv.org/pdf/1903.02242v1.pdf | |
PWC | https://paperswithcode.com/paper/distributed-policy-learning-based-random |
Repo | |
Framework | |
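
The contrast drawn in the abstract above, between equal transmission probabilities and state-dependent ones, can be illustrated with a small calculation. The following is only a minimal sketch under a simplifying assumption that is not taken from the paper: a single slotted contention period in which a transmission succeeds exactly when one terminal transmits. The function names and the example probabilities are hypothetical, and no neural network or cross-entropy training is modeled here.

```python
from math import prod


def per_terminal_success(probs):
    """Chance that each terminal transmits alone in one contention slot.

    Assumed model (not from the paper): terminal i succeeds when it
    transmits with probability probs[i] and every other terminal stays silent.
    """
    return [
        p_i * prod(1.0 - p_j for j, p_j in enumerate(probs) if j != i)
        for i, p_i in enumerate(probs)
    ]


def slot_success(probs):
    """Chance that the slot carries exactly one transmission."""
    return sum(per_terminal_success(probs))


n = 4
# Stateless scheme: every terminal uses the same probability p = 1/n.
equal = [1.0 / n] * n
# Hypothetical state-dependent scheme: terminal 0 is treated as more urgent.
skewed = [0.7, 0.1, 0.1, 0.1]

print(slot_success(equal), per_terminal_success(equal))
print(slot_success(skewed), per_terminal_success(skewed))
```

With equal probabilities every terminal gets the same share of successful slots, whereas the skewed vector shifts most of that share to the first terminal. This is the kind of prioritization a mapping from inner state to transmission probability could express.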
Biomedical relation extraction with pre-trained language representations and minimal task-specific architecture
Title | Biomedical relation extraction with pre-trained language representations and minimal task-specific architecture |
Authors | Ashok Thillaisundaram, Theodosia Togia |
Abstract | This paper presents our participation in the AGAC Track from the 2019 BioNLP Open Shared Tasks. We provide a solution for Task 3, which aims to extract “gene - function change - disease” triples, where “gene” and “disease” are mentions of particular genes and diseases respectively and “function change” is one of four pre-defined relationship types. Our system extends BERT (Devlin et al., 2018), a state-of-the-art language model, which learns contextual language representations from a large unlabelled corpus and whose parameters can be fine-tuned to solve specific tasks with minimal additional architecture. We encode the pair of mentions and their textual context as two consecutive sequences in BERT, separated by a special symbol. We then use a single linear layer to classify their relationship into five classes (four pre-defined, as well as ‘no relation’). Despite considerable class imbalance, our system significantly outperforms a random baseline while relying on an extremely simple setup with no specially engineered features. |
Tasks | Language Modelling, Relation Extraction |
Published | 2019-09-26 |
URL | https://arxiv.org/abs/1909.12411v1 |
https://arxiv.org/pdf/1909.12411v1.pdf | |
PWC | https://paperswithcode.com/paper/biomedical-relation-extraction-with-pre |
Repo | |
Framework | |
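
The input encoding and classification head described in the abstract above can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' system: whitespace splitting stands in for BERT's WordPiece tokenizer, a random vector stands in for the pooled BERT output, the weights are untrained, and the label names and the example sentence are placeholders. Only the two-sequence layout with `[CLS]` and `[SEP]` markers and the five-way linear classifier follow the description.

```python
import math
import random

# Placeholder names: four pre-defined relation types plus "no relation".
LABELS = ["TYPE_1", "TYPE_2", "TYPE_3", "TYPE_4", "NO_RELATION"]


def pack_pair(mentions, context):
    """Lay out the mention pair and its context as two consecutive sequences."""
    first = [tok for mention in mentions for tok in mention.split()]
    second = context.split()
    tokens = ["[CLS]"] + first + ["[SEP]"] + second + ["[SEP]"]
    # Segment 0 covers [CLS], the mentions and the first [SEP]; segment 1 the rest.
    segment_ids = [0] * (len(first) + 2) + [1] * (len(second) + 1)
    return tokens, segment_ids


def linear_classify(features, weights, bias):
    """Single linear layer followed by a softmax over the classes."""
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up example input, for illustration only.
tokens, segment_ids = pack_pair(
    ("BRCA1", "breast cancer"),
    "Mutations in BRCA1 increase the risk of breast cancer .",
)

rng = random.Random(0)
hidden = 8  # toy size; real BERT encoders use much larger hidden vectors
pooled = [rng.uniform(-1, 1) for _ in range(hidden)]  # stand-in for the encoder output
weights = [[rng.uniform(-1, 1) for _ in range(hidden)] for _ in LABELS]
bias = [0.0] * len(LABELS)

probs = linear_classify(pooled, weights, bias)
print(tokens)
print(dict(zip(LABELS, probs)))
```

In a real setup the pooled vector would come from a fine-tuned encoder and the weights would be learned jointly with it; the sketch only shows how little task-specific structure sits on top.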
Automated Classification of Helium Ingress in Irradiated X-750
Title | Automated Classification of Helium Ingress in Irradiated X-750 |
Authors | Chris Anderson, Jacob Klein, Heygaan Rajakumar, Colin Judge, Laurent K Beland |
Abstract | Imaging nanoscale features using transmission electron microscopy is key to predicting and assessing the mechanical behavior of structural materials in nuclear reactors. Analyzing these micrographs is often a tedious and time-consuming manual process, making this analysis a prime candidate for automation. A region-based convolutional neural network is proposed, which can identify helium bubbles in neutron-irradiated Inconel X-750 reactor spacer springs. We demonstrate that this neural network produces analyses of similar accuracy and reproducibility to those produced by humans. Further, we show that this method is four orders of magnitude faster than manual analysis, allowing for the generation of significant quantities of data. The proposed method can be used with micrographs of different Fresnel contrasts and resolutions and shows promise in application across multiple defect types. |
Tasks | |
Published | 2019-12-09 |
URL | https://arxiv.org/abs/1912.04252v1 |
https://arxiv.org/pdf/1912.04252v1.pdf | |
PWC | https://paperswithcode.com/paper/automated-classification-of-helium-ingress-in |
Repo | |
Framework | |
Design of Real-time Semantic Segmentation Decoder for Automated Driving
Title | Design of Real-time Semantic Segmentation Decoder for Automated Driving |
Authors | Arindam Das, Saranya Kandan, Senthil Yogamani, Pavel Krizek |
Abstract | Semantic segmentation remains a computationally intensive algorithm for embedded deployment even with the rapid growth of computation power. Thus efficient network design is a critical aspect, especially for applications like automated driving that require real-time performance. Recently, there has been a lot of research on designing efficient encoders that are mostly task agnostic. Unlike image classification and bounding box object detection tasks, decoders are computationally expensive as well for the semantic segmentation task. In this work, we focus on efficient design of the segmentation decoder and assume that an efficient encoder is already designed to provide shared features for a multi-task learning system. We design a novel efficient non-bottleneck layer and a family of decoders which fit into a small run-time budget using VGG10 as an efficient encoder. We demonstrate on our dataset that experimentation with various design choices led to an improvement of 10% over the baseline performance. |
Tasks | Image Classification, Multi-Task Learning, Object Detection, Real-Time Semantic Segmentation, Semantic Segmentation |
Published | 2019-01-19 |
URL | http://arxiv.org/abs/1901.06580v1 |
http://arxiv.org/pdf/1901.06580v1.pdf | |
PWC | https://paperswithcode.com/paper/design-of-real-time-semantic-segmentation |
Repo | |
Framework | |
optimalFlow: Optimal-transport approach to flow cytometry gating and population matching
Title | optimalFlow: Optimal-transport approach to flow cytometry gating and population matching |
Authors | Eustasio del Barrio, Hristo Inouzhe, Jean-Michel Loubes, Carlos Matrán, Agustín Mayo-Íscar |
Abstract | Data used in Flow Cytometry present pronounced variability due to biological and technical reasons. Biological variability is a well known phenomenon produced by measurements on different individuals, with different characteristics such as age, sex, etc. The use of different settings for measurement, the variation of the conditions during experiments or the different types of flow cytometers are some of the technical sources of variability. This high variability makes it difficult to use supervised machine learning for identification of cell populations. We propose optimalFlowTemplates, based on a similarity distance and Wasserstein barycenters, which clusters cytometries and produces prototype cytometries for the different groups. We show that supervised learning restricted to the new groups performs better than the same techniques applied to the whole collection. We also present optimalFlowClassification, which uses a database of gated cytometries and optimalFlowTemplates to assign cell types to a new cytometry. We show that this procedure can outperform state-of-the-art techniques in the proposed datasets. Our code and data are freely available as R packages at https://github.com/HristoInouzhe/optimalFlow and https://github.com/HristoInouzhe/optimalFlowData. |
Tasks | |
Published | 2019-07-18 |
URL | https://arxiv.org/abs/1907.08006v1 |
https://arxiv.org/pdf/1907.08006v1.pdf | |
PWC | https://paperswithcode.com/paper/optimalflow-optimal-transport-approach-to |
Repo | |
Framework | |
Anticipation and next action forecasting in video: an end-to-end model with memory
Title | Anticipation and next action forecasting in video: an end-to-end model with memory |
Authors | Fiora Pirri, Lorenzo Mauro, Edoardo Alati, Valsamis Ntouskos, Mahdieh Izadpanahkakhk, Elham Omrani |
Abstract | Action anticipation and forecasting in videos do not require a hat-trick, as long as there are signs in the context to foresee how actions are going to be deployed. Capturing these signs is hard because the context includes the past. We propose an end-to-end network for action anticipation and forecasting with memory, to both anticipate the current action and foresee the next one. Experiments on action sequence datasets show excellent results indicating that training on histories with a dynamic memory can significantly improve forecasting performance. |
Tasks | |
Published | 2019-01-11 |
URL | http://arxiv.org/abs/1901.03728v1 |
http://arxiv.org/pdf/1901.03728v1.pdf | |
PWC | https://paperswithcode.com/paper/anticipation-and-next-action-forecasting-in |
Repo | |
Framework | |
Improving Conditioning in Context-Aware Sequence to Sequence Models
Title | Improving Conditioning in Context-Aware Sequence to Sequence Models |
Authors | Xinyi Wang, Jason Weston, Michael Auli, Yacine Jernite |
Abstract | Neural sequence to sequence models are well established for applications which can be cast as mapping a single input sequence into a single output sequence. In this work, we focus on cases where generation is conditioned on both a short query and a long context, such as abstractive question answering or document-level translation. We modify the standard sequence-to-sequence approach to make better use of both the query and the context by expanding the conditioning mechanism to intertwine query and context attention. We also introduce a simple and efficient data augmentation method for the proposed model. Experiments on three different tasks show that both changes lead to consistent improvements. |
Tasks | Data Augmentation, Question Answering |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09728v1 |
https://arxiv.org/pdf/1911.09728v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-conditioning-in-context-aware |
Repo | |
Framework | |
Query Expansion for Cross-Language Question Re-Ranking
Title | Query Expansion for Cross-Language Question Re-Ranking |
Authors | Muhammad Mahbubur Rahman, Sorami Hisamoto, Kevin Duh |
Abstract | Community question-answering (CQA) platforms have become very popular forums for asking and answering questions daily. While these forums are rich repositories of community knowledge, they present challenges for finding relevant answers and similar questions, due to the open-ended nature of informal discussions. Further, if the platform allows questions and answers in multiple languages, we are faced with the additional challenge of matching cross-lingual information. In this work, we focus on the cross-language question re-ranking shared task, which aims to find existing questions that may be written in different languages. Our contribution is an exploration of query expansion techniques for this problem. We investigate expansions based on word embeddings, DBpedia concept linking, and hypernyms, and show that they outperform existing state-of-the-art methods. |
Tasks | Community Question Answering, Question Answering, Word Embeddings |
Published | 2019-04-16 |
URL | http://arxiv.org/abs/1904.07982v1 |
http://arxiv.org/pdf/1904.07982v1.pdf | |
PWC | https://paperswithcode.com/paper/query-expansion-for-cross-language-question |
Repo | |
Framework | |