January 29, 2020

Paper Group ANR 504

Unsupervised Discovery of Parts, Structure, and Dynamics


Title	Unsupervised Discovery of Parts, Structure, and Dynamics
Authors	Zhenjia Xu, Zhijian Liu, Chen Sun, Kevin Murphy, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu
Abstract	Humans easily recognize object parts and their hierarchical structure by watching how they move; they can then predict how each part moves in the future. In this paper, we propose a novel formulation that simultaneously learns a hierarchical, disentangled object representation and a dynamics model for object parts from unlabeled videos. Our Parts, Structure, and Dynamics (PSD) model learns to, first, recognize the object parts via a layered image representation; second, predict hierarchy via a structural descriptor that composes low-level concepts into a hierarchical structure; and third, model the system dynamics by predicting the future. Experiments on multiple real and synthetic datasets demonstrate that our PSD model works well on all three tasks: segmenting object parts, building their hierarchical structure, and capturing their motion distributions.
Tasks
Published	2019-03-12
URL	http://arxiv.org/abs/1903.05136v1
PDF	http://arxiv.org/pdf/1903.05136v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-discovery-of-parts-structure-and
Repo
Framework

Stratum: A Serverless Framework for Lifecycle Management of Machine Learning based Data Analytics Tasks


Title	Stratum: A Serverless Framework for Lifecycle Management of Machine Learning based Data Analytics Tasks
Authors	Anirban Bhattacharjee, Yogesh Barve, Shweta Khare, Shunxing Bao, Aniruddha Gokhale, Thomas Damiano
Abstract	With the proliferation of machine learning (ML) libraries and frameworks, and the programming languages that they use, along with operations of data loading, transformation, preparation and mining, ML model development is becoming a daunting task. Furthermore, with a plethora of cloud-based ML model development platforms, heterogeneity in hardware, increased focus on exploiting edge computing resources for low-latency prediction serving and often a lack of a complete understanding of resources required to execute ML workflows efficiently, ML model deployment demands expertise for managing the lifecycle of ML workflows efficiently and with minimal cost. To address these challenges, we propose Stratum, an end-to-end serverless data analytics platform. Stratum can deploy, schedule and dynamically manage data ingestion tools, live streaming apps, batch analytics tools, ML-as-a-service (for inference jobs), and visualization tools across the cloud-fog-edge spectrum. This paper describes the Stratum architecture, highlighting the problems it resolves.
Tasks
Published	2019-04-03
URL	http://arxiv.org/abs/1904.01727v1
PDF	http://arxiv.org/pdf/1904.01727v1.pdf
PWC	https://paperswithcode.com/paper/stratum-a-serverless-framework-for-lifecycle
Repo
Framework

A Multi-Domain Feature Learning Method for Visual Place Recognition


Title	A Multi-Domain Feature Learning Method for Visual Place Recognition
Authors	Peng Yin, Lingyun Xu, Xueqian Li, Chen Yin, Yingli Li, Rangaprasad Arun Srivatsan, Lu Li, Jianmin Ji, Yuqing He
Abstract	Visual Place Recognition (VPR) is an important component in both computer vision and robotics applications, thanks to its ability to determine whether a place has been visited and where specifically. A major challenge in VPR is to handle changes of environmental conditions including weather, season and illumination. Most VPR methods try to improve the place recognition performance by ignoring the environmental factors, leading to decreased accuracy when environmental conditions change significantly, such as day versus night. To this end, we propose an end-to-end conditional visual place recognition method. Specifically, we introduce the multi-domain feature learning method (MDFL) to capture multiple attribute-descriptions for a given place, and then use a feature detaching module to separate the environmental condition-related features from those that are not. The only label required within this feature learning pipeline is the environmental condition. Evaluation of the proposed method is conducted on the multi-season NORDLAND dataset and the multi-weather GTAV dataset. Experimental results show that our method improves the feature robustness against variant environmental conditions.
Tasks	Visual Place Recognition
Published	2019-02-26
URL	http://arxiv.org/abs/1902.10058v1
PDF	http://arxiv.org/pdf/1902.10058v1.pdf
PWC	https://paperswithcode.com/paper/a-multi-domain-feature-learning-method-for
Repo
Framework

Sampling Humans for Optimizing Preferences in Coloring Artwork


Title	Sampling Humans for Optimizing Preferences in Coloring Artwork
Authors	Michael McCourt, Ian Dewancker
Abstract	Many circumstances of practical importance have performance or success metrics which exist implicitly—in the eye of the beholder, so to speak. Tuning aspects of such problems requires working without defined metrics and only considering pairwise comparisons or rankings. In this paper, we review an existing Bayesian optimization strategy for determining most-preferred outcomes, and identify an adaptation to allow it to handle ties. We then discuss some of the issues we have encountered when humans use this optimization strategy to optimize coloring a piece of abstract artwork. We hope that, by participating in this workshop, we can learn how other researchers encounter difficulties unique to working with humans in the loop.
Tasks
Published	2019-06-10
URL	https://arxiv.org/abs/1906.03813v1
PDF	https://arxiv.org/pdf/1906.03813v1.pdf
PWC	https://paperswithcode.com/paper/sampling-humans-for-optimizing-preferences-in
Repo
Framework

INTERACTION Dataset: An INTERnational, Adversarial and Cooperative moTION Dataset in Interactive Driving Scenarios with Semantic Maps


Title	INTERACTION Dataset: An INTERnational, Adversarial and Cooperative moTION Dataset in Interactive Driving Scenarios with Semantic Maps
Authors	Wei Zhan, Liting Sun, Di Wang, Haojie Shi, Aubrey Clausse, Maximilian Naumann, Julius Kummerle, Hendrik Konigshof, Christoph Stiller, Arnaud de La Fortelle, Masayoshi Tomizuka
Abstract	Behavior-related research areas such as motion prediction/planning, representation/imitation learning, behavior modeling/generation, and algorithm testing require support from high-quality motion datasets containing interactive driving scenarios with different driving cultures. In this paper, we present an INTERnational, Adversarial and Cooperative moTION dataset (INTERACTION dataset) in interactive driving scenarios with semantic maps. Five features of the dataset are highlighted. 1) The interactive driving scenarios are diverse, including urban/highway/ramp merging and lane changes, roundabouts with yield/stop signs, signalized intersections, intersections with one/two/all-way stops, etc. 2) Motion data from different countries and different continents are collected so that driving preferences and styles in different cultures are naturally included. 3) The driving behavior is highly interactive and complex with adversarial and cooperative motions of various traffic participants. Highly complex behaviors such as negotiations, aggressive/irrational decisions and traffic rule violations are densely represented in the dataset, while regular behaviors can also be found, from cautious car-following, stops and left/right/U-turns to rational lane changes and cyclist and pedestrian crossings, etc. 4) The levels of criticality span a wide range, from regular safe operations to dangerous, near-collision maneuvers. Real collisions, although relatively slight, are also included. 5) Maps with complete semantic information are provided with physical layers, reference lines, lanelet connections and traffic rules. The data is recorded from drones and traffic cameras. Statistics of the dataset in terms of number of entities and interaction density are also provided, along with some utilization examples in a variety of behavior-related research areas. The dataset can be downloaded via https://interaction-dataset.com.
Tasks	Imitation Learning, motion prediction
Published	2019-09-30
URL	https://arxiv.org/abs/1910.03088v1
PDF	https://arxiv.org/pdf/1910.03088v1.pdf
PWC	https://paperswithcode.com/paper/interaction-dataset-an-international
Repo
Framework

Attention Filtering for Multi-person Spatiotemporal Action Detection on Deep Two-Stream CNN Architectures


Title	Attention Filtering for Multi-person Spatiotemporal Action Detection on Deep Two-Stream CNN Architectures
Authors	João Antunes, Pedro Abreu, Alexandre Bernardino, Asim Smailagic, Daniel Siewiorek
Abstract	Action detection and recognition tasks have been the target of much focus in the computer vision community due to their many applications, namely, security, robotics and recommendation systems. Recently, datasets like AVA provide multi-person, multi-label, spatiotemporal action detection and recognition challenges. Being unable to discern which portions of the input to use for classification is a limitation of two-stream CNN approaches when the vision task involves several people with several labels. We address this limitation and improve the state-of-the-art performance of two-stream CNNs. In this paper we present four contributions: our fovea attention filtering that highlights targets for classification without discarding background; a generalized binary loss function designed for the AVA dataset; miniAVA, a partition of AVA that maintains temporal continuity and class distribution with only one tenth of the dataset size; and ablation studies on alternative attention filters. Our method, using fovea attention filtering and our generalized binary loss, achieves a relative video mAP improvement of 20% over the two-stream baseline in AVA, and is competitive with the state of the art on UCF101-24. We also show a relative video mAP improvement of 12.6% when using our generalized binary loss over the standard sum-of-sigmoids.
Tasks	Action Detection, Recommendation Systems
Published	2019-07-21
URL	https://arxiv.org/abs/1907.12919v1
PDF	https://arxiv.org/pdf/1907.12919v1.pdf
PWC	https://paperswithcode.com/paper/attention-filtering-for-multi-person
Repo
Framework

Object-Driven Multi-Layer Scene Decomposition From a Single Image


Title	Object-Driven Multi-Layer Scene Decomposition From a Single Image
Authors	Helisa Dhamo, Nassir Navab, Federico Tombari
Abstract	We present a method that tackles the challenge of predicting color and depth behind the visible content of an image. Our approach aims at building up a Layered Depth Image (LDI) from a single RGB input, which is an efficient representation that arranges the scene in layers, including originally occluded regions. Unlike previous work, we enable an adaptive scheme for the number of layers and incorporate semantic encoding for better hallucination of partly occluded objects. Additionally, our approach is object-driven, which especially boosts the accuracy for the occluded intermediate objects. The framework consists of two steps. First, we individually complete each object in terms of color and depth, while estimating the scene layout. Second, we rebuild the scene based on the regressed layers and enforce the recomposed image to resemble the structure of the original input. The learned representation enables various applications, such as 3D photography and diminished reality, all from a single RGB image.
Tasks
Published	2019-08-26
URL	https://arxiv.org/abs/1908.09521v1
PDF	https://arxiv.org/pdf/1908.09521v1.pdf
PWC	https://paperswithcode.com/paper/object-driven-multi-layer-scene-decomposition
Repo
Framework

Distributed Policy Learning Based Random Access for Diversified QoS Requirements


Title	Distributed Policy Learning Based Random Access for Diversified QoS Requirements
Authors	Zhiyuan Jiang, Sheng Zhou, Zhisheng Niu
Abstract	Future wireless access networks need to support diversified quality of service (QoS) metrics required by various types of Internet-of-Things (IoT) devices, e.g., age of information (AoI) for status generating sources and ultra low latency for safety information in vehicular networks. In this paper, a novel inner-state driven random access (ISDA) framework is proposed based on distributed policy learning, in particular a cross-entropy method. Conventional random access schemes, e.g., p-CSMA, assume stateless terminals and thus assign equal priorities to all. In ISDA, the inner-states of terminals are described by a time-varying state vector, and the transmission probabilities of terminals in the contention period are determined by their respective inner-states. Neural networks are leveraged to approximate the function mappings from inner-states to transmission probabilities, and an iterative approach is adopted to improve these mappings in a distributed manner. Experimental results show that ISDA can improve the QoS of heterogeneous terminals simultaneously compared to conventional CSMA schemes.
Tasks
Published	2019-03-06
URL	http://arxiv.org/abs/1903.02242v1
PDF	http://arxiv.org/pdf/1903.02242v1.pdf
PWC	https://paperswithcode.com/paper/distributed-policy-learning-based-random
Repo
Framework
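
The abstract above names a cross-entropy method as the policy-learning component. As a rough illustration of that general technique only, and not of the ISDA framework itself, here is a minimal sketch that maximizes a toy objective by repeatedly refitting a Gaussian to the best samples. The function names, the toy objective, and all hyperparameters are assumptions made for illustration; the paper optimizes neural-network mappings from inner-states to transmission probabilities, which this sketch does not model.

```python
import random
import statistics


def cross_entropy_method(score, dim, iterations=30, population=50,
                         elite_frac=0.2, init_std=2.0, seed=0):
    """Maximize `score` over real vectors by refitting a Gaussian to elite samples."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [init_std] * dim
    n_elite = max(2, int(population * elite_frac))
    for _ in range(iterations):
        # Draw candidate parameter vectors from the current sampling distribution.
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                   for _ in range(population)]
        # Keep the highest-scoring candidates (the "elite" set).
        samples.sort(key=score, reverse=True)
        elite = samples[:n_elite]
        # Refit the per-dimension mean and spread to the elite set.
        mean = [statistics.fmean(col) for col in zip(*elite)]
        std = [statistics.pstdev(col) + 1e-3 for col in zip(*elite)]
    return mean


def toy_score(x):
    # Hypothetical objective (not from the paper) with its maximum at (1.0, -2.0).
    return -((x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2)


best = cross_entropy_method(toy_score, dim=2)
```

In a setting like the one the abstract describes, the sampled vector would presumably hold policy parameters and the score would be a measured QoS metric, but that mapping is an assumption here.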

Biomedical relation extraction with pre-trained language representations and minimal task-specific architecture


Title	Biomedical relation extraction with pre-trained language representations and minimal task-specific architecture
Authors	Ashok Thillaisundaram, Theodosia Togia
Abstract	This paper presents our participation in the AGAC Track from the 2019 BioNLP Open Shared Tasks. We provide a solution for Task 3, which aims to extract “gene - function change - disease” triples, where “gene” and “disease” are mentions of particular genes and diseases respectively and “function change” is one of four pre-defined relationship types. Our system extends BERT (Devlin et al., 2018), a state-of-the-art language model, which learns contextual language representations from a large unlabelled corpus and whose parameters can be fine-tuned to solve specific tasks with minimal additional architecture. We encode the pair of mentions and their textual context as two consecutive sequences in BERT, separated by a special symbol. We then use a single linear layer to classify their relationship into five classes (four pre-defined, as well as ‘no relation’). Despite considerable class imbalance, our system significantly outperforms a random baseline while relying on an extremely simple setup with no specially engineered features.
Tasks	Language Modelling, Relation Extraction
Published	2019-09-26
URL	https://arxiv.org/abs/1909.12411v1
PDF	https://arxiv.org/pdf/1909.12411v1.pdf
PWC	https://paperswithcode.com/paper/biomedical-relation-extraction-with-pre
Repo
Framework
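
The abstract above describes encoding the mention pair and its context as two consecutive sequences separated by a special symbol, followed by a single linear layer over five classes. A minimal sketch of that input layout and classification step might look like the following. The `[CLS]` and `[SEP]` markers are BERT's standard special tokens; everything else (the ordering of the two segments, the label names, and the stand-in feature vector used in place of a real BERT output) is an assumption made for illustration.

```python
# Placeholder label names; the abstract only says four pre-defined types plus "no relation".
LABELS = ["REL_1", "REL_2", "REL_3", "REL_4", "NO_RELATION"]


def build_pair_input(gene, disease, context):
    """Lay out the mention pair and its context as two BERT-style segments."""
    # Assumption: segment A holds the two mentions, segment B holds the context.
    return f"[CLS] {gene} {disease} [SEP] {context} [SEP]"


def linear_classify(features, weights, bias):
    """Apply one linear layer to a pooled feature vector and return the top label."""
    logits = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    best_index = max(range(len(logits)), key=logits.__getitem__)
    return LABELS[best_index]


# Toy usage with made-up mentions and a 2-dimensional stand-in for the pooled output.
text = build_pair_input("GENE_X", "DISEASE_Y", "GENE_X is altered in DISEASE_Y .")
toy_weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, 0.0], [0.0, -1.0]]
toy_bias = [0.0, 0.0, 0.0, 0.0, 0.0]
label = linear_classify([0.2, 0.9], toy_weights, toy_bias)
```

In a real system the feature vector would come from the fine-tuned language model rather than being supplied by hand.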

Automated Classification of Helium Ingress in Irradiated X-750


Title	Automated Classification of Helium Ingress in Irradiated X-750
Authors	Chris Anderson, Jacob Klein, Heygaan Rajakumar, Colin Judge, Laurent K Beland
Abstract	Imaging nanoscale features using transmission electron microscopy is key to predicting and assessing the mechanical behavior of structural materials in nuclear reactors. Analyzing these micrographs is often a tedious and time-consuming manual process, making this analysis a prime candidate for automation. A region-based convolutional neural network is proposed, which can identify helium bubbles in neutron-irradiated Inconel X-750 reactor spacer springs. We demonstrate that this neural network produces analyses of similar accuracy and reproducibility to those produced by humans. Further, we show that this method is four orders of magnitude faster than manual analysis, allowing for the generation of significant quantities of data. The proposed method can be used with micrographs of different Fresnel contrasts and resolutions and shows promise in application across multiple defect types.
Tasks
Published	2019-12-09
URL	https://arxiv.org/abs/1912.04252v1
PDF	https://arxiv.org/pdf/1912.04252v1.pdf
PWC	https://paperswithcode.com/paper/automated-classification-of-helium-ingress-in
Repo
Framework

Design of Real-time Semantic Segmentation Decoder for Automated Driving


Title	Design of Real-time Semantic Segmentation Decoder for Automated Driving
Authors	Arindam Das, Saranya Kandan, Senthil Yogamani, Pavel Krizek
Abstract	Semantic segmentation remains a computationally intensive algorithm for embedded deployment even with the rapid growth of computation power. Thus efficient network design is a critical aspect, especially for applications like automated driving which require real-time performance. Recently, there has been a lot of research on designing efficient encoders that are mostly task agnostic. Unlike in image classification and bounding box object detection tasks, decoders are computationally expensive as well for the semantic segmentation task. In this work, we focus on efficient design of the segmentation decoder and assume that an efficient encoder is already designed to provide shared features for a multi-task learning system. We design a novel efficient non-bottleneck layer and a family of decoders which fit into a small run-time budget using VGG10 as an efficient encoder. We demonstrate on our dataset that experimentation with various design choices led to an improvement of 10% from a baseline performance.
Tasks	Image Classification, Multi-Task Learning, Object Detection, Real-Time Semantic Segmentation, Semantic Segmentation
Published	2019-01-19
URL	http://arxiv.org/abs/1901.06580v1
PDF	http://arxiv.org/pdf/1901.06580v1.pdf
PWC	https://paperswithcode.com/paper/design-of-real-time-semantic-segmentation
Repo
Framework

optimalFlow: Optimal-transport approach to flow cytometry gating and population matching


Title	optimalFlow: Optimal-transport approach to flow cytometry gating and population matching
Authors	Eustasio del Barrio, Hristo Inouzhe, Jean-Michel Loubes, Carlos Matrán, Agustín Mayo-Íscar
Abstract	Data used in Flow Cytometry present pronounced variability due to biological and technical reasons. Biological variability is a well-known phenomenon produced by measurements on different individuals, with different characteristics such as age, sex, etc. The use of different settings for measurement, the variation of the conditions during experiments or the different types of flow cytometers are some of the technical sources of variability. This high variability makes the use of supervised machine learning for identification of cell populations difficult. We propose optimalFlowTemplates, based on a similarity distance and Wasserstein barycenters, which clusters cytometries and produces prototype cytometries for the different groups. We show that supervised learning restricted to the new groups performs better than the same techniques applied to the whole collection. We also present optimalFlowClassification, which uses a database of gated cytometries and optimalFlowTemplates to assign cell types to a new cytometry. We show that this procedure can outperform state-of-the-art techniques in the proposed datasets. Our code and data are freely available as R packages at https://github.com/HristoInouzhe/optimalFlow and https://github.com/HristoInouzhe/optimalFlowData.
Tasks
Published	2019-07-18
URL	https://arxiv.org/abs/1907.08006v1
PDF	https://arxiv.org/pdf/1907.08006v1.pdf
PWC	https://paperswithcode.com/paper/optimalflow-optimal-transport-approach-to
Repo
Framework
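
The abstract above builds on Wasserstein (optimal-transport) distances between cytometries. As a small illustration of the underlying idea only, the sketch below computes the 1-Wasserstein distance between two equal-size one-dimensional samples, where the optimal transport plan simply matches sorted values. This is not the optimalFlow R package or its similarity distance; real cytometry data is multidimensional, and the function name and sample values here are assumptions.

```python
def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size empirical samples on the real line."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    # In one dimension the optimal coupling pairs the i-th smallest values.
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


# Toy usage: shifting every point by 1 costs exactly 1 on average.
shifted = wasserstein_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])
# Reordering a sample does not change its distribution, so the distance is 0.
same = wasserstein_1d([3.0, 1.0, 2.0], [1.0, 2.0, 3.0])
```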

Anticipation and next action forecasting in video: an end-to-end model with memory


Title	Anticipation and next action forecasting in video: an end-to-end model with memory
Authors	Fiora Pirri, Lorenzo Mauro, Edoardo Alati, Valsamis Ntouskos, Mahdieh Izadpanahkakhk, Elham Omrani
Abstract	Action anticipation and forecasting in videos do not require a hat-trick, as long as there are signs in the context to foresee how actions are going to be deployed. Capturing these signs is hard because the context includes the past. We propose an end-to-end network for action anticipation and forecasting with memory, to both anticipate the current action and foresee the next one. Experiments on action sequence datasets show excellent results indicating that training on histories with a dynamic memory can significantly improve forecasting performance.
Tasks
Published	2019-01-11
URL	http://arxiv.org/abs/1901.03728v1
PDF	http://arxiv.org/pdf/1901.03728v1.pdf
PWC	https://paperswithcode.com/paper/anticipation-and-next-action-forecasting-in
Repo
Framework

Improving Conditioning in Context-Aware Sequence to Sequence Models


Title	Improving Conditioning in Context-Aware Sequence to Sequence Models
Authors	Xinyi Wang, Jason Weston, Michael Auli, Yacine Jernite
Abstract	Neural sequence to sequence models are well established for applications which can be cast as mapping a single input sequence into a single output sequence. In this work, we focus on cases where generation is conditioned on both a short query and a long context, such as abstractive question answering or document-level translation. We modify the standard sequence-to-sequence approach to make better use of both the query and the context by expanding the conditioning mechanism to intertwine query and context attention. We also introduce a simple and efficient data augmentation method for the proposed model. Experiments on three different tasks show that both changes lead to consistent improvements.
Tasks	Data Augmentation, Question Answering
Published	2019-11-21
URL	https://arxiv.org/abs/1911.09728v1
PDF	https://arxiv.org/pdf/1911.09728v1.pdf
PWC	https://paperswithcode.com/paper/improving-conditioning-in-context-aware
Repo
Framework

Query Expansion for Cross-Language Question Re-Ranking


Title	Query Expansion for Cross-Language Question Re-Ranking
Authors	Muhammad Mahbubur Rahman, Sorami Hisamoto, Kevin Duh
Abstract	Community question-answering (CQA) platforms have become very popular forums for asking and answering questions daily. While these forums are rich repositories of community knowledge, they present challenges for finding relevant answers and similar questions, due to the open-ended nature of informal discussions. Further, if the platform allows questions and answers in multiple languages, we are faced with the additional challenge of matching cross-lingual information. In this work, we focus on the cross-language question re-ranking shared task, which aims to find existing questions that may be written in different languages. Our contribution is an exploration of query expansion techniques for this problem. We investigate expansions based on Word Embeddings, DBpedia concepts linking, and Hypernym, and show that they outperform existing state-of-the-art methods.
Tasks	Community Question Answering, Question Answering, Word Embeddings
Published	2019-04-16
URL	http://arxiv.org/abs/1904.07982v1
PDF	http://arxiv.org/pdf/1904.07982v1.pdf
PWC	https://paperswithcode.com/paper/query-expansion-for-cross-language-question
Repo
Framework
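
The abstract above mentions query expansion based on word embeddings. One common way to do this, sketched here under assumptions, is to append each query token's nearest neighbor by cosine similarity. The toy two-dimensional vectors, the function names, and the choice of one neighbor per token are all hypothetical; the paper's actual expansion procedure and embedding source are not specified in the abstract.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_query(tokens, embeddings, k=1):
    """Append up to k nearest-neighbor words for each query token."""
    expanded = list(tokens)
    for tok in tokens:
        if tok not in embeddings:
            continue  # Out-of-vocabulary tokens are kept but not expanded.
        candidates = [w for w in embeddings if w != tok and w not in expanded]
        candidates.sort(key=lambda w: cosine(embeddings[tok], embeddings[w]),
                        reverse=True)
        expanded.extend(candidates[:k])
    return expanded


# Toy embedding table, invented for illustration only.
TOY_EMBEDDINGS = {
    "car": [1.0, 0.0],
    "automobile": [0.9, 0.1],
    "banana": [0.0, 1.0],
    "fruit": [0.1, 0.9],
}

expanded = expand_query(["car", "banana"], TOY_EMBEDDINGS)
```

The expanded token list would then be passed to whatever retrieval or re-ranking model scores candidate questions.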