April 3, 2020

Paper Group ANR 66

Toward Automated Virtual Assembly for Prefabricated Construction: Construction Sequencing through Simulated BIM


Title	Toward Automated Virtual Assembly for Prefabricated Construction: Construction Sequencing through Simulated BIM
Authors	Gilmarie O’Neill, Matthew Ball, Yujing Liu, Mojtaba Noghabaei, Kevin Han
Abstract	To adhere to the stringent time and budget requirements of construction projects, contractors are utilizing prefabricated construction methods to expedite the construction process. Prefabricated construction methods require an adequate schedule and understanding by the contractors and constructors to be successful. The specificity of prefabricated construction often leads to inefficient scheduling and costly rework time. The designer, contractor, and constructors must have a strong understanding of the assembly process to experience the full benefits of the method. At the root of understanding the assembly process is visualizing how the process is intended to be performed. Currently, a virtual construction model is used to explain and better visualize the construction process. However, creating a virtual construction model is currently time-consuming and requires experienced personnel. The proposed simulation of the virtual assembly will increase the automation of virtual construction modeling by implementing the data available in a building information modeling (BIM) model. This paper presents various factors (i.e., formalization of construction sequence based on the level of development (LOD)) that need to be addressed for the development of automated virtual assembly. Two case studies are presented to demonstrate these factors.
Tasks
Published	2020-03-14
URL	https://arxiv.org/abs/2003.06695v1
PDF	https://arxiv.org/pdf/2003.06695v1.pdf
PWC	https://paperswithcode.com/paper/toward-automated-virtual-assembly-for
Repo
Framework

LowResourceEval-2019: a shared task on morphological analysis for low-resource languages


Title	LowResourceEval-2019: a shared task on morphological analysis for low-resource languages
Authors	Elena Klyachko, Alexey Sorokin, Natalia Krizhanovskaya, Andrew Krizhanovsky, Galina Ryazanskaya
Abstract	The paper describes the results of the first shared task on morphological analysis for the languages of Russia, namely, Evenki, Karelian, Selkup, and Veps. For the languages in question, only small-sized corpora are available. The tasks include morphological analysis, word form generation and morpheme segmentation. Four teams participated in the shared task. Most of them use machine-learning approaches, outperforming the existing rule-based ones. The article describes the datasets prepared for the shared tasks and contains an analysis of the participants’ solutions. Language corpora having different formats were transformed into CONLL-U format. The universal format makes the datasets comparable to other language corpora and facilitates using them in other NLP tasks.
Tasks	Morphological Analysis
Published	2020-01-30
URL	https://arxiv.org/abs/2001.11285v1
PDF	https://arxiv.org/pdf/2001.11285v1.pdf
PWC	https://paperswithcode.com/paper/lowresourceeval-2019-a-shared-task-on
Repo
Framework

Decidability of cutpoint isolation for letter-monotonic probabilistic finite automata


Title	Decidability of cutpoint isolation for letter-monotonic probabilistic finite automata
Authors	Paul C. Bell, Pavel Semukhin
Abstract	We show the surprising result that the cutpoint isolation problem is decidable for probabilistic finite automata where input words are taken from a letter-monotonic context-free language. A context-free language $L$ is letter-monotonic when $L \subseteq a_1^* a_2^* \cdots a_\ell^*$ for some finite $\ell > 0$ where each letter is distinct. A cutpoint is isolated when it cannot be approached arbitrarily closely. The decidability of this problem is in marked contrast to the situation for the (strict) emptiness problem for PFA which is undecidable under the even more severe restrictions of PFA with polynomial ambiguity, commutative matrices and input over a letter-monotonic language as well as the injectivity problem which is undecidable for PFA over letter-monotonic languages. We provide a constructive nondeterministic algorithm to solve the cutpoint isolation problem, even for exponentially ambiguous PFA, and we also show that the problem is at least NP-hard.
Tasks
Published	2020-02-18
URL	https://arxiv.org/abs/2002.07660v1
PDF	https://arxiv.org/pdf/2002.07660v1.pdf
PWC	https://paperswithcode.com/paper/decidability-of-cutpoint-isolation-for-letter
Repo
Framework
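
The letter-monotonic condition in the abstract above can be illustrated with a small membership check. This is only a sketch of the bounding language $a_1^* a_2^* \cdots a_\ell^*$, not of the paper's decision procedure; the function name and the example alphabet are hypothetical.

```python
import re


def is_letter_monotonic_word(word: str, letters: str) -> bool:
    """Return True if `word` lies in a_1^* a_2^* ... a_l^*.

    `letters` gives the ordered, pairwise distinct letters a_1 ... a_l.
    This checks the bounding language only, not membership in a
    particular context-free language L contained in it.
    """
    if len(set(letters)) != len(letters):
        raise ValueError("letters must be pairwise distinct")
    # Build the regular expression a_1* a_2* ... a_l*.
    pattern = "".join(re.escape(a) + "*" for a in letters)
    return re.fullmatch(pattern, word) is not None


if __name__ == "__main__":
    print(is_letter_monotonic_word("aabccc", "abc"))  # letters appear in order
    print(is_letter_monotonic_word("aba", "abc"))     # 'a' reappears after 'b'
```

Any language $L$ whose words all pass this check for one fixed ordering of letters is letter-monotonic in the sense used above.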

Curriculum Audiovisual Learning


Title	Curriculum Audiovisual Learning
Authors	Di Hu, Zheng Wang, Haoyi Xiong, Dong Wang, Feiping Nie, Dejing Dou
Abstract	Associating a sound with its producer in a complex audiovisual scene is a challenging task, especially when annotated training data are lacking. In this paper, we present a flexible audiovisual model that introduces a soft-clustering module as the audio and visual content detector, and regards the pervasive property of audiovisual concurrency as the latent supervision for inferring the correlation among detected contents. To ease the difficulty of audiovisual learning, we propose a novel curriculum learning strategy that trains the model from simple to complex scenes. We show that such an ordered learning procedure gives the model the merits of easy training and fast convergence. Meanwhile, our audiovisual model can also provide effective unimodal representation and cross-modal alignment performance. We further deploy the well-trained model into practical audiovisual sound localization and separation tasks. We show that our localization model significantly outperforms existing methods, based on which we show comparable performance in sound separation without referring to external visual supervision. Our video demo can be found at https://youtu.be/kuClfGG0cFU.
Tasks
Published	2020-01-26
URL	https://arxiv.org/abs/2001.09414v1
PDF	https://arxiv.org/pdf/2001.09414v1.pdf
PWC	https://paperswithcode.com/paper/curriculum-audiovisual-learning
Repo
Framework

Learning Navigation Costs from Demonstration in Partially Observable Environments


Title	Learning Navigation Costs from Demonstration in Partially Observable Environments
Authors	Tianyu Wang, Vikas Dhiman, Nikolay Atanasov
Abstract	This paper focuses on inverse reinforcement learning (IRL) to enable safe and efficient autonomous navigation in unknown partially observable environments. The objective is to infer a cost function that explains expert-demonstrated navigation behavior while relying only on the observations and state-control trajectory used by the expert. We develop a cost function representation composed of two parts: a probabilistic occupancy encoder, with recurrent dependence on the observation sequence, and a cost encoder, defined over the occupancy features. The representation parameters are optimized by differentiating the error between demonstrated controls and a control policy computed from the cost encoder. Such differentiation is typically computed by dynamic programming through the value function over the whole state space. We observe that this is inefficient in large partially observable environments because most states are unexplored. Instead, we rely on a closed-form subgradient of the cost-to-go obtained only over a subset of promising states via an efficient motion-planning algorithm such as A* or RRT. Our experiments show that our model exceeds the accuracy of baseline IRL algorithms in robot navigation tasks, while substantially improving the efficiency of training and test-time inference.
Tasks	Autonomous Navigation, Motion Planning, Robot Navigation
Published	2020-02-26
URL	https://arxiv.org/abs/2002.11637v1
PDF	https://arxiv.org/pdf/2002.11637v1.pdf
PWC	https://paperswithcode.com/paper/learning-navigation-costs-from-demonstration
Repo
Framework
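
The abstract above contrasts dynamic programming over the whole state space with computing the cost-to-go only over promising states via a planner such as A*. The following is a minimal sketch of that idea on a 4-connected grid. It is not the authors' implementation: the grid, the per-cell cost model, and the function name are assumptions made for illustration.

```python
import heapq


def astar_cost_to_go(cost, start, goal):
    """A* on a 4-connected grid where entering cell (r, c) costs cost[r][c] > 0.

    Returns (cost of the cheapest path from start to goal, number of cells
    expanded). The second value shows that only part of the grid is visited.
    """
    rows, cols = len(cost), len(cost[0])
    min_cost = min(min(row) for row in cost)

    def heuristic(cell):
        # Manhattan distance scaled by the cheapest cell: never overestimates.
        return min_cost * (abs(cell[0] - goal[0]) + abs(cell[1] - goal[1]))

    best = {start: 0.0}
    frontier = [(heuristic(start), 0.0, start)]
    expanded = set()
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in expanded:
            continue  # stale queue entry
        expanded.add(cell)
        if cell == goal:
            return g, len(expanded)
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                ng = g + cost[nr][nc]
                if ng < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = ng
                    heapq.heappush(frontier, (ng + heuristic((nr, nc)), ng, (nr, nc)))
    return float("inf"), len(expanded)


if __name__ == "__main__":
    grid = [[1, 1, 1],
            [1, 9, 1],
            [1, 1, 1]]
    total, n_expanded = astar_cost_to_go(grid, (0, 0), (2, 2))
    print(total, n_expanded)
```

In a learning setting, `cost` would come from a learned cost encoder; here it is a fixed table so that the planner can be read in isolation.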

A Data Efficient End-To-End Spoken Language Understanding Architecture


Title	A Data Efficient End-To-End Spoken Language Understanding Architecture
Authors	Marco Dinarelli, Nikita Kapoor, Bassam Jabaian, Laurent Besacier
Abstract	End-to-end architectures have been recently proposed for spoken language understanding (SLU) and semantic parsing. Based on a large amount of data, those models jointly learn acoustic and linguistic-sequential features. While such architectures give very good results in the context of domain, intent and slot detection, their application to a more complex semantic chunking and tagging task is less straightforward. For that reason, in many cases, models are combined with an external language model to enhance their performance. In this paper we introduce a data-efficient system which is trained end-to-end, with no additional, pre-trained external module. One key feature of our approach is an incremental training procedure where acoustic, language and semantic models are trained sequentially one after the other. The proposed model has a reasonable size and achieves competitive results with respect to the state of the art while using a small training dataset. In particular, we reach 24.02% Concept Error Rate (CER) on MEDIA/test while training on MEDIA/train without any additional data.
Tasks	Chunking, Language Modelling, Semantic Parsing, Spoken Language Understanding
Published	2020-02-14
URL	https://arxiv.org/abs/2002.05955v1
PDF	https://arxiv.org/pdf/2002.05955v1.pdf
PWC	https://paperswithcode.com/paper/a-data-efficient-end-to-end-spoken-language
Repo
Framework
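
The abstract reports a Concept Error Rate (CER) but does not define it. A common reading, assumed here, is that it is computed like word error rate but over sequences of concept labels: the edit distance between hypothesis and reference divided by the reference length. The sketch below follows that assumption; the concept labels are made up for illustration.

```python
def concept_error_rate(reference, hypothesis):
    """Edit distance between two concept sequences, divided by the reference length.

    Assumption: CER = (substitutions + deletions + insertions) / len(reference),
    analogous to word error rate.
    """
    n, m = len(reference), len(hypothesis)
    if n == 0:
        raise ValueError("reference must contain at least one concept")
    # prev[j] holds the edit distance between reference[:i-1] and hypothesis[:j].
    prev = list(range(m + 1))
    for i in range(1, n + 1):
        curr = [i] + [0] * m
        for j in range(1, m + 1):
            substitution = prev[j - 1] + (reference[i - 1] != hypothesis[j - 1])
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[m] / n


if __name__ == "__main__":
    ref = ["command", "room-type", "number"]   # hypothetical concept labels
    hyp = ["command", "number"]                # one concept was missed
    print(concept_error_rate(ref, hyp))
```

Here one deletion against three reference concepts gives an error rate of one third.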

JRMOT: A Real-Time 3D Multi-Object Tracker and a New Large-Scale Dataset


Title	JRMOT: A Real-Time 3D Multi-Object Tracker and a New Large-Scale Dataset
Authors	Abhijeet Shenoi, Mihir Patel, JunYoung Gwak, Patrick Goebel, Amir Sadeghian, Hamid Rezatofighi, Roberto Martín-Martín, Silvio Savarese
Abstract	An autonomous navigating agent needs to perceive and track the motion of objects and other agents in its surroundings to achieve robust and safe motion planning and execution. While autonomous navigation requires a multi-object tracking (MOT) system to provide 3D information, most research has been done in 2D MOT from RGB videos. In this work we present JRMOT, a novel 3D MOT system that integrates information from 2D RGB images and 3D point clouds into a real-time performing framework. Our system leverages advancements in neural-network based re-identification as well as 2D and 3D detection and descriptors. We incorporate this into a joint probabilistic data-association framework within a multi-modal recursive Kalman architecture to achieve online, real-time 3D MOT. As part of our work, we release the JRDB dataset, a novel large scale 2D+3D dataset and benchmark annotated with over 2 million boxes and 3500 time consistent 2D+3D trajectories across 54 indoor and outdoor scenes. The dataset contains over 60 minutes of data including 360 degree cylindrical RGB video and 3D pointclouds. The presented 3D MOT system demonstrates state-of-the-art performance against competing methods on the popular 2D tracking KITTI benchmark and serves as a competitive 3D tracking baseline for our dataset and benchmark.
Tasks	Autonomous Navigation, Motion Planning, Multi-Object Tracking, Object Tracking
Published	2020-02-19
URL	https://arxiv.org/abs/2002.08397v3
PDF	https://arxiv.org/pdf/2002.08397v3.pdf
PWC	https://paperswithcode.com/paper/jrmot-a-real-time-3d-multi-object-tracker-and
Repo
Framework
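
The tracker described above is built on a recursive Kalman architecture. As a reminder of what one predict/update cycle does, here is a minimal single-target, one-dimensional constant-velocity filter. It is a sketch only: it has no data association and no multi-modal fusion, and the noise parameters `q` and `r` are arbitrary assumed values.

```python
def kalman_step(x, P, z, dt=1.0, q=1e-3, r=0.5):
    """One predict/update cycle of a constant-velocity Kalman filter.

    x: (position, velocity) estimate; P: 2x2 covariance as nested lists;
    z: position measurement; q, r: assumed process and measurement noise variances.
    """
    p, v = x
    # Predict with F = [[1, dt], [0, 1]]: x <- F x, P <- F P F^T + q I.
    p_pred = p + dt * v
    v_pred = v
    P00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + q
    P01 = P[0][1] + dt * P[1][1]
    P10 = P[1][0] + dt * P[1][1]
    P11 = P[1][1] + q
    # Update with H = [1, 0]: only the position is measured.
    S = P00 + r                      # innovation variance
    k0, k1 = P00 / S, P10 / S        # Kalman gain
    y = z - p_pred                   # innovation
    x_new = (p_pred + k0 * y, v_pred + k1 * y)
    P_new = [[(1 - k0) * P00, (1 - k0) * P01],
             [P10 - k1 * P00, P11 - k1 * P01]]
    return x_new, P_new


if __name__ == "__main__":
    x, P = (0.0, 0.0), [[10.0, 0.0], [0.0, 10.0]]
    for k in range(1, 31):           # target moving at one unit per step
        x, P = kalman_step(x, P, float(k))
    print(x)
```

With noise-free measurements of a target moving at unit speed, the velocity estimate settles close to one after a few steps.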

Extracting and Validating Explanatory Word Archipelagoes using Dual Entropy


Title	Extracting and Validating Explanatory Word Archipelagoes using Dual Entropy
Authors	Yukio Ohsawa, Teruaki Hayashi
Abstract	The logical connectivity of text is represented by the connectivity of words that form archipelagoes. Here, each archipelago is a sequence of islands of the occurrences of a certain word. An island here means the local sequence of sentences where the word is emphasized, and an archipelago of a length comparable to the target text is extracted using the co-variation of entropy A (the window-based entropy) on the distribution of the word’s occurrences with the width of each time window. Then, the logical connectivity of text is evaluated on entropy B (the graph-based entropy) computed on the distribution of sentences to connected word-clusters obtained on the co-occurrence of words. The results show the parts of the target text with words forming archipelagoes extracted on entropy A, without learned or prepared knowledge, form an explanatory part of the text that is of smaller entropy B than the parts extracted by the baseline methods.
Tasks
Published	2020-02-22
URL	https://arxiv.org/abs/2002.09581v1
PDF	https://arxiv.org/pdf/2002.09581v1.pdf
PWC	https://paperswithcode.com/paper/extracting-and-validating-explanatory-word
Repo
Framework

Real-Time Lane ID Estimation Using Recurrent Neural Networks With Dual Convention


Title	Real-Time Lane ID Estimation Using Recurrent Neural Networks With Dual Convention
Authors	Ibrahim Halfaoui, Fahd Bouzaraa, Onay Urfalioglu, Li Minzhen
Abstract	Acquiring information about the road lane structure is a crucial step for autonomous navigation. To this end, several approaches tackle this task from different perspectives such as lane marking detection or semantic lane segmentation. However, to the best of our knowledge, there is yet no purely vision based end-to-end solution to answer the precise question: How to estimate the relative number or “ID” of the current driven lane within a multi-lane road or a highway? In this work, we propose a real-time, vision-only (i.e. monocular camera) solution to the problem based on a dual left-right convention. We interpret this task as a classification problem by limiting the maximum number of lane candidates to eight. Our approach is designed to meet low-complexity specifications and limited runtime requirements. It harnesses the temporal dimension inherent to the input sequences to improve upon high-complexity state-of-the-art models. We achieve more than 95% accuracy on a challenging test set with extreme conditions and different routes.
Tasks	Autonomous Navigation
Published	2020-01-14
URL	https://arxiv.org/abs/2001.04708v1
PDF	https://arxiv.org/pdf/2001.04708v1.pdf
PWC	https://paperswithcode.com/paper/real-time-lane-id-estimation-using-recurrent
Repo
Framework

Compositional Neural Machine Translation by Removing the Lexicon from Syntax


Title	Compositional Neural Machine Translation by Removing the Lexicon from Syntax
Authors	Tristan Thrush
Abstract	The meaning of a natural language utterance is largely determined from its syntax and words. Additionally, there is evidence that humans process an utterance by separating knowledge about the lexicon from syntax knowledge. Theories from semantics and neuroscience claim that complete word meanings are not encoded in the representation of syntax. In this paper, we propose neural units that can enforce this constraint over an LSTM encoder and decoder. We demonstrate that our model achieves competitive performance across a variety of domains including semantic parsing, syntactic parsing, and English to Mandarin Chinese translation. In these cases, our model outperforms the standard LSTM encoder and decoder architecture on many or all of our metrics. To demonstrate that our model achieves the desired separation between the lexicon and syntax, we analyze its weights and explore its behavior when different neural modules are damaged. When damaged, we find that the model displays the knowledge distortions that aphasics are evidenced to have.
Tasks	Machine Translation, Semantic Parsing
Published	2020-02-06
URL	https://arxiv.org/abs/2002.08899v1
PDF	https://arxiv.org/pdf/2002.08899v1.pdf
PWC	https://paperswithcode.com/paper/compositional-neural-machine-translation-by
Repo
Framework

Modeling Historical AIS Data For Vessel Path Prediction: A Comprehensive Treatment


Title	Modeling Historical AIS Data For Vessel Path Prediction: A Comprehensive Treatment
Authors	Enmei Tu, Guanghao Zhang, Shangbo Mao, Lily Rachmawati, Guang-Bin Huang
Abstract	The prosperity of artificial intelligence has aroused intense interest in intelligent/autonomous navigation, in which path prediction is a key functionality for decision support, e.g. route planning, collision warning, and traffic regulation. For maritime intelligence, the Automatic Identification System (AIS) plays an important role because it has recently been made compulsory for large international commercial vessels and is able to provide nearly real-time information about the vessel. Therefore, AIS-data-based vessel path prediction is a promising direction for future maritime intelligence. However, real-world AIS data collected online are just highly irregular trajectory segments (AIS message sequences) from different types of vessels and geographical regions, with possibly very low data quality. So although some works have studied how to build a path prediction model using historical AIS data, it remains a very challenging problem. In this paper, we propose a comprehensive framework to model massive historical AIS trajectory segments for accurate vessel path prediction. Experimental comparisons with existing popular methods are made to validate the proposed approach and results show that our approach could outperform the baseline methods by a wide margin.
Tasks	Autonomous Navigation
Published	2020-01-02
URL	https://arxiv.org/abs/2001.01592v2
PDF	https://arxiv.org/pdf/2001.01592v2.pdf
PWC	https://paperswithcode.com/paper/modeling-historical-ais-data-for-vessel-path
Repo
Framework

Analysis and Evaluation of Handwriting in Patients with Parkinson’s Disease Using kinematic, Geometrical, and Non-linear Features


Title	Analysis and Evaluation of Handwriting in Patients with Parkinson’s Disease Using kinematic, Geometrical, and Non-linear Features
Authors	C. D. Rios-Urrego, J. C. Vásquez-Correa, J. F. Vargas-Bonilla, E. Nöth, F. Lopera, J. R. Orozco-Arroyave
Abstract	Background and objectives: Parkinson’s disease is a neurological disorder that affects the motor system producing lack of coordination, resting tremor, and rigidity. Impairments in handwriting are among the main symptoms of the disease. Handwriting analysis can help in supporting the diagnosis and in monitoring the progress of the disease. This paper aims to evaluate the importance of different groups of features to model handwriting deficits that appear due to Parkinson’s disease; and how those features are able to discriminate between Parkinson’s disease patients and healthy subjects. Methods: Features based on kinematic, geometrical and non-linear dynamics analyses were evaluated to classify Parkinson’s disease and healthy subjects. Classifiers based on K-nearest neighbors, support vector machines, and random forest were considered. Results: Accuracies of up to $93.1\%$ were obtained in the classification of patients and healthy control subjects. A relevance analysis of the features indicated that those related to speed, acceleration, and pressure are the most discriminant. The automatic classification of patients in different stages of the disease shows $\kappa$ indexes between $0.36$ and $0.44$. Accuracies of up to $83.3\%$ were obtained in a different dataset used only for validation purposes. Conclusions: The results confirmed the negative impact of aging in the classification process when we considered different groups of healthy subjects. In addition, the results reported with the separate validation set comprise a step towards the development of automated tools to support the diagnosis process in clinical practice.
Tasks
Published	2020-02-13
URL	https://arxiv.org/abs/2002.05411v1
PDF	https://arxiv.org/pdf/2002.05411v1.pdf
PWC	https://paperswithcode.com/paper/analysis-and-evaluation-of-handwriting-in
Repo
Framework
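
The abstract reports $\kappa$ indexes for the multi-stage classification without saying which variant is meant. Assuming Cohen's kappa, which corrects observed agreement for agreement expected by chance, a minimal computation looks like this. The label values in the example are hypothetical.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_observed - p_expected) / (1 - p_expected)."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    # Fraction of items on which the two labelings agree.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Agreement expected if labels were assigned independently
    # with the same marginal frequencies.
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    stages_true = [0, 0, 1, 1]   # hypothetical disease-stage labels
    stages_pred = [0, 0, 0, 1]
    print(cohen_kappa(stages_true, stages_pred))
```

A value of 1 means perfect agreement and 0 means agreement no better than chance, which puts the reported range of 0.36 to 0.44 in context.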

From Natural Language Instructions to Complex Processes: Issues in Chaining Trigger Action Rules


Title	From Natural Language Instructions to Complex Processes: Issues in Chaining Trigger Action Rules
Authors	Nobuhiro Ito, Yuya Suzuki, Akiko Aizawa
Abstract	Automation services for complex business processes usually require a high level of information technology literacy. There is a strong demand for a smartly assisted process automation (IPA: intelligent process automation) service that enables even general users to easily use advanced automation. A natural language interface for such automation is expected as an elemental technology for the IPA realization. The workflow targeted by IPA is generally composed of a combination of multiple tasks. However, semantic parsing, one of the natural language processing methods, for such complex workflows has not yet been fully studied. The reasons are that (1) the formal expression and grammar of the workflow required for semantic analysis have not been sufficiently examined and (2) the dataset of the workflow formal expression with its corresponding natural language description required for learning workflow semantics did not exist. This paper defines a new grammar for complex workflows with chaining machine-executable meaning representations for semantic parsing. The representations are at a high abstraction level. Additionally, an approach to creating datasets is proposed based on this grammar.
Tasks	Semantic Parsing
Published	2020-01-08
URL	https://arxiv.org/abs/2001.02462v1
PDF	https://arxiv.org/pdf/2001.02462v1.pdf
PWC	https://paperswithcode.com/paper/from-natural-language-instructions-to-complex
Repo
Framework

Graph Signal Processing – Part III: Machine Learning on Graphs, from Graph Topology to Applications


Title	Graph Signal Processing – Part III: Machine Learning on Graphs, from Graph Topology to Applications
Authors	Ljubisa Stankovic, Danilo Mandic, Milos Dakovic, Milos Brajovic, Bruno Scalzo, Shengxi Li, Anthony G. Constantinides
Abstract	Many modern data analytics applications on graphs operate on domains where graph topology is not known a priori, and hence its determination becomes part of the problem definition, rather than serving as prior knowledge which aids the problem solution. Part III of this monograph starts by addressing ways to learn graph topology, from the case where the physics of the problem already suggests a possible topology, through to most general cases where the graph topology is learned from the data. A particular emphasis is on graph topology definition based on the correlation and precision matrices of the observed data, combined with additional prior knowledge and structural conditions, such as the smoothness or sparsity of graph connections. For learning sparse graphs (with a small number of edges), the least absolute shrinkage and selection operator, known as LASSO, is employed, along with its graph-specific variant, graphical LASSO. For completeness, both variants of LASSO are derived in an intuitive way, and explained. An in-depth elaboration of the graph topology learning paradigm is provided through several examples on physically well-defined graphs, such as electric circuits, linear heat transfer, social and computer networks, and spring-mass systems. As many graph neural networks (GNN) and convolutional graph networks (GCN) are emerging, we have also reviewed the main trends in GNNs and GCNs, from the perspective of graph signal filtering. Tensor representation of lattice-structured graphs is next considered, and it is shown that tensors (multidimensional data arrays) are a special class of graph signals, whereby the graph vertices reside on a high-dimensional regular lattice structure. This part of the monograph concludes with two emerging applications in financial data processing and underground transportation networks modeling.
Tasks
Published	2020-01-02
URL	https://arxiv.org/abs/2001.00426v1
PDF	https://arxiv.org/pdf/2001.00426v1.pdf
PWC	https://paperswithcode.com/paper/graph-signal-processing-part-iii-machine
Repo
Framework
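
The abstract names LASSO as the tool for learning sparse graphs. The core of LASSO is the soft-thresholding step, which shrinks small coefficients exactly to zero. The sketch below shows plain LASSO solved by coordinate descent for the objective $\frac{1}{2}\|y - Xw\|_2^2 + \lambda \|w\|_1$. It is not the monograph's derivation and does not cover graphical LASSO; the function names and the toy data are assumptions.

```python
def soft_threshold(value, lam):
    """Shrink `value` toward zero by `lam`, returning 0 inside [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, sweeps=100):
    """Minimize 0.5 * ||y - X w||^2 + lam * ||w||_1 by cyclic coordinate descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(sweeps):
        for j in range(d):
            rho = 0.0   # correlation of column j with the partial residual
            z = 0.0     # squared norm of column j
            for i in range(n):
                partial = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return w


if __name__ == "__main__":
    X = [[1.0, 0.0],
         [0.0, 1.0]]
    y = [3.0, 0.5]
    print(lasso_coordinate_descent(X, y, lam=1.0))
```

With this orthonormal toy design, each coefficient is simply the soft-thresholded response, so the weak second coefficient is set to zero. That zeroing is what yields a graph with few edges when the same penalty is applied to edge weights.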

Anomaly Detection in Beehives using Deep Recurrent Autoencoders


Title	Anomaly Detection in Beehives using Deep Recurrent Autoencoders
Authors	Padraig Davidson, Michael Steininger, Florian Lautenschlager, Konstantin Kobs, Anna Krause, Andreas Hotho
Abstract	Precision beekeeping allows bees’ living conditions to be monitored by equipping beehives with sensors. The data recorded by these hives can be analyzed by machine learning models to learn behavioral patterns of or search for unusual events in bee colonies. One typical target is the early detection of bee swarming, as apiarists want to avoid this for economic reasons. Advanced methods should be able to detect any other unusual or abnormal behavior arising from illness of bees or from technical reasons, e.g. sensor failure. In this position paper we present an autoencoder, a deep learning model, which detects any type of anomaly in data independent of its origin. Our model is able to reveal the same swarms as a simple rule-based swarm detection algorithm but is also triggered by any other anomaly. We evaluated our model on real-world data sets that were collected on different hives and with different sensor setups.
Tasks	Anomaly Detection
Published	2020-03-10
URL	https://arxiv.org/abs/2003.04576v1
PDF	https://arxiv.org/pdf/2003.04576v1.pdf
PWC	https://paperswithcode.com/paper/anomaly-detection-in-beehives-using-deep
Repo
Framework