April 1, 2020

Paper Group ANR 519

A Critique on the Interventional Detection of Causal Relationships

Title A Critique on the Interventional Detection of Causal Relationships
Authors Mehrzad Saremi
Abstract Interventions are of fundamental importance in Pearl’s probabilistic causality regime. In this paper, we will inspect how interventions influence the interpretation of causation in causal models in specific situations. To this end, we will introduce a priori relationships as non-causal relationships in a causal system. Then, we will proceed to discuss the cases in which interventions can lead to spurious causation interpretations. This includes the interventional detection of a priori relationships, and cases where the interventional detection of causality forms structural causal models that are not valid in natural situations. We will also discuss other properties of a priori relations and SCMs that have a priori information in their structural equations.
Tasks
Published 2020-03-26
URL https://arxiv.org/abs/2003.11706v1
PDF https://arxiv.org/pdf/2003.11706v1.pdf
PWC https://paperswithcode.com/paper/a-critique-on-the-interventional-detection-of
Repo
Framework

Word2Vec: Optimal Hyper-Parameters and Their Impact on NLP Downstream Tasks

Title Word2Vec: Optimal Hyper-Parameters and Their Impact on NLP Downstream Tasks
Authors Tosin P. Adewumi, Foteini Liwicki, Marcus Liwicki
Abstract Word2Vec is a prominent tool for Natural Language Processing (NLP) tasks. Similar inspiration is found in distributed embeddings for state-of-the-art (sota) deep neural networks. However, a wrong combination of hyper-parameters can produce poor-quality vectors. The objective of this work is to show that an optimal combination of hyper-parameters exists and to evaluate various combinations. We compare them with the original model released by Mikolov. Both intrinsic and extrinsic (downstream) evaluations, including Named Entity Recognition (NER) and Sentiment Analysis (SA), were carried out. The downstream tasks reveal that the best model is task-specific, that high analogy scores don’t necessarily correlate positively with F1 scores, and that the same applies to more data. Increasing vector dimension size after a point leads to poor quality or performance. If ethical considerations to save time, energy and the environment are made, then reasonably smaller corpora may do just as well or even better in some cases. Besides, using a small corpus, we obtain better human-assigned WordSim scores, corresponding Spearman correlation and better downstream (NER & SA) performance compared to Mikolov’s model, trained on a 100-billion-word corpus.
Tasks Named Entity Recognition, Sentiment Analysis
Published 2020-03-23
URL https://arxiv.org/abs/2003.11645v1
PDF https://arxiv.org/pdf/2003.11645v1.pdf
PWC https://paperswithcode.com/paper/word2vec-optimal-hyper-parameters-and-their
Repo
Framework
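
The abstract above evaluates embeddings intrinsically by the Spearman correlation between model similarities and human-assigned WordSim scores. As a minimal sketch of that metric (not the authors' evaluation code), the standard-library-only functions below compute Spearman's rho as the Pearson correlation of average ranks. The function names and the toy score lists are illustrative assumptions.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: human similarity judgements vs. model cosine similarities.
human_scores = [9.1, 7.4, 3.2, 1.0]
model_scores = [0.83, 0.61, 0.35, 0.12]
rho = spearman(human_scores, model_scores)
```

Because only the ordering matters, a model can score well here even when its raw similarity values are on a very different scale from the human ratings.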

Multi-User Remote lab: Timetable Scheduling Using Simplex Nondominated Sorting Genetic Algorithm

Title Multi-User Remote lab: Timetable Scheduling Using Simplex Nondominated Sorting Genetic Algorithm
Authors Seid Miad Zandavi, Vera Chung, Ali Anaissi
Abstract The scheduling of multi-user remote laboratories is modeled as a multimodal function for the proposed optimization algorithm. A hybrid optimization algorithm, combining the Nelder-Mead Simplex algorithm and the Non-dominated Sorting Genetic Algorithm (NSGA), is proposed to optimize the timetable problem for the remote laboratories to coordinate shared access. The proposed algorithm utilizes the Simplex algorithm for exploration, and NSGA for sorting local optimum points with consideration of potential areas. The proposed algorithm is applied to difficult nonlinear continuous multimodal functions, and its performance is compared with hybrid Simplex Particle Swarm Optimization, Simplex Genetic Algorithm, and other heuristic algorithms.
Tasks
Published 2020-03-26
URL https://arxiv.org/abs/2003.11708v1
PDF https://arxiv.org/pdf/2003.11708v1.pdf
PWC https://paperswithcode.com/paper/multi-user-remote-lab-timetable-scheduling
Repo
Framework
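
The non-dominated sorting step that gives NSGA its name can be sketched in a few lines. This is a generic illustration of Pareto-front ranking under minimization, not the hybrid Simplex-NSGA algorithm from the paper. The two objectives attached to each candidate timetable are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def nondominated_sort(points):
    """Group point indices into fronts; front 0 is the Pareto-optimal set."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        # A point belongs to the current front if nothing left dominates it.
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical objectives per candidate timetable: (waiting time, idle lab time).
candidates = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4)]
fronts = nondominated_sort(candidates)
```

This simple version rechecks every pair on each pass, which is quadratic per front. That is adequate for small populations; faster bookkeeping exists for larger ones.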

NSURL-2019 Task 7: Named Entity Recognition (NER) in Farsi

Title NSURL-2019 Task 7: Named Entity Recognition (NER) in Farsi
Authors Nasrin Taghizadeh, Zeinab Borhanifard, Melika GolestaniPour, Heshaam Faili
Abstract NSURL-2019 Task 7 focuses on Named Entity Recognition (NER) in Farsi. This task was chosen to compare different approaches to finding phrases that specify Named Entities in Farsi texts, and to establish a standard testbed for future research on this task in Farsi. This paper describes the process of making training and test data, a list of participating teams (6 teams), and evaluation results of their systems. The best system obtained an F1 score of 85.4% based on phrase-level evaluation on seven classes of NEs including person, organization, location, date, time, money and percent.
Tasks Named Entity Recognition
Published 2020-03-19
URL https://arxiv.org/abs/2003.09029v1
PDF https://arxiv.org/pdf/2003.09029v1.pdf
PWC https://paperswithcode.com/paper/nsurl-2019-task-7-named-entity-recognition
Repo
Framework
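
Phrase-level evaluation, as mentioned in the abstract above, counts a prediction as correct only when the whole entity span and its class match. The sketch below shows one common way to compute such an F1 score. It assumes BIO-style tags (`B-PER`, `I-PER`, `O`), which is an assumption for illustration and not necessarily the shared task's exact format or scorer.

```python
def extract_phrases(tags):
    """Return the set of (start, end, label) spans encoded by a BIO tag sequence."""
    spans, start, label = set(), None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def phrase_f1(gold_tags, pred_tags):
    """F1 over exact span-and-label matches between gold and predicted tags."""
    gold, pred = extract_phrases(gold_tags), extract_phrases(pred_tags)
    correct = len(gold & pred)
    if correct == 0:
        return 0.0
    precision = correct / len(pred)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy example: the person span matches, the second entity has the wrong class.
gold = ["B-PER", "I-PER", "O", "B-LOC"]
pred = ["B-PER", "I-PER", "O", "B-ORG"]
score = phrase_f1(gold, pred)
```

Under this scheme a partially correct span earns no credit, which makes phrase-level scores stricter than token-level ones.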

Robot Calligraphy using Pseudospectral Optimal Control in Conjunction with a Novel Dynamic Brush Model

Title Robot Calligraphy using Pseudospectral Optimal Control in Conjunction with a Novel Dynamic Brush Model
Authors Sen Wang, Jiaqi Chen, Xuanliang Deng, Seth Hutchinson, Frank Dellaert
Abstract Chinese calligraphy is a unique art form with great artistic value but difficult to master. In this paper, we formulate the calligraphy writing problem as a trajectory optimization problem, and propose an improved virtual brush model for simulating the real writing process. Our approach is inspired by pseudospectral optimal control in that we parameterize the actuator trajectory for each stroke as a Chebyshev polynomial. The proposed dynamic virtual brush model plays a key role in formulating the objective function to be optimized. Our approach shows excellent performance in drawing aesthetically pleasing characters, and does so much more efficiently than previous work, opening up the possibility to achieve real-time closed-loop control.
Tasks
Published 2020-03-02
URL https://arxiv.org/abs/2003.01565v1
PDF https://arxiv.org/pdf/2003.01565v1.pdf
PWC https://paperswithcode.com/paper/robot-calligraphy-using-pseudospectral
Repo
Framework
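
The abstract describes parameterizing each stroke's actuator trajectory as a Chebyshev polynomial. As a minimal sketch of what that parameterization means, the code below evaluates a Chebyshev series with the Clenshaw recurrence and maps stroke time onto the polynomial domain [-1, 1]. The coefficient values, axis layout, and function names are illustrative assumptions; the paper's brush model and optimizer are not reproduced here.

```python
def chebyshev_eval(coeffs, x):
    """Evaluate sum_k coeffs[k] * T_k(x) using the Clenshaw recurrence.

    coeffs must be non-empty; x is expected to lie in [-1, 1].
    """
    b1, b2 = 0.0, 0.0
    for c in reversed(coeffs[1:]):
        b1, b2 = c + 2.0 * x * b1 - b2, b1
    return coeffs[0] + x * b1 - b2


def stroke_position(coeffs_by_axis, t, duration):
    """Position at time t of a stroke lasting `duration` seconds.

    Each axis has its own coefficient list; time is rescaled to [-1, 1].
    """
    x = 2.0 * t / duration - 1.0
    return tuple(chebyshev_eval(c, x) for c in coeffs_by_axis)


# Hypothetical two-axis stroke: axis 0 is 1 + T_1, axis 1 is T_2.
stroke = ([1.0, 1.0], [0.0, 0.0, 1.0])
midpoint = stroke_position(stroke, t=1.0, duration=2.0)
```

With this representation a whole stroke is described by a handful of coefficients per axis, so an optimizer searches over those coefficients instead of over every sampled waypoint.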

Multimodal Controller for Generative Models

Title Multimodal Controller for Generative Models
Authors Enmao Diao, Jie Ding, Vahid Tarokh
Abstract Class-conditional generative models are crucial tools for data generation from user-specified class labels. A number of existing approaches for class-conditional generative models require nontrivial modifications of existing architectures, in order to model conditional information fed into the model. In this paper, we introduce a new method called multimodal controller to generate multimodal data without introducing additional model parameters. With the proposed technique, the model can be trained easily from non-conditional generative models by simply attaching controllers at each layer. Each controller grants label-specific model parameters. Thus the proposed method does not require additional model complexity. In the absence of the controllers, our model reduces to non-conditional generative models. Numerical experiments demonstrate the effectiveness of our proposed method in comparison with those of the existing non-conditional and conditional generative models. Additionally, our numerical results demonstrate that a small portion (10%) of label-specific model parameters is required to generate class-conditional MNIST and FashionMNIST images.
Tasks
Published 2020-02-07
URL https://arxiv.org/abs/2002.02572v1
PDF https://arxiv.org/pdf/2002.02572v1.pdf
PWC https://paperswithcode.com/paper/multimodal-controller-for-generative-models
Repo
Framework

Modelling High-Dimensional Categorical Data Using Nonconvex Fusion Penalties

Title Modelling High-Dimensional Categorical Data Using Nonconvex Fusion Penalties
Authors Benjamin G. Stokell, Rajen D. Shah, Ryan J. Tibshirani
Abstract We propose a method for estimation in high-dimensional linear models with nominal categorical data. Our estimator, called SCOPE, fuses levels together by making their corresponding coefficients exactly equal. This is achieved using the minimax concave penalty on differences between the order statistics of the coefficients for a categorical variable, thereby clustering the coefficients. We provide an algorithm for exact and efficient computation of the global minimum of the resulting nonconvex objective in the case with a single variable with potentially many levels, and use this within a block coordinate descent procedure in the multivariate case. We show that an oracle least squares solution that exploits the unknown level fusions is a limit point of the coordinate descent with high probability, provided the true levels have a certain minimum separation; these conditions are known to be minimal in the univariate case. We demonstrate the favourable performance of SCOPE across a range of real and simulated datasets. An R package CatReg implementing SCOPE for linear models and also a version for logistic regression is available on CRAN.
Tasks
Published 2020-02-28
URL https://arxiv.org/abs/2002.12606v1
PDF https://arxiv.org/pdf/2002.12606v1.pdf
PWC https://paperswithcode.com/paper/modelling-high-dimensional-categorical-data
Repo
Framework
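
The abstract states that SCOPE applies the minimax concave penalty (MCP) to differences between the order statistics of a categorical variable's coefficients. The sketch below writes out the standard MCP and sums it over adjacent sorted coefficients. It illustrates the penalty only, under the usual MCP definition with parameters `lam` and `gamma` (names chosen here), and is not the CatReg implementation or the paper's solver.

```python
def mcp(x, lam, gamma):
    """Minimax concave penalty: lam*|x| - x^2/(2*gamma) up to |x| = gamma*lam,
    then constant at gamma*lam^2/2."""
    ax = abs(x)
    if ax <= gamma * lam:
        return lam * ax - ax * ax / (2.0 * gamma)
    return gamma * lam * lam / 2.0


def fusion_penalty(coeffs, lam, gamma):
    """Sum of MCP over gaps between consecutive sorted coefficients."""
    ordered = sorted(coeffs)
    return sum(mcp(b - a, lam, gamma) for a, b in zip(ordered, ordered[1:]))


# Hypothetical level coefficients: two levels are already fused (equal).
penalty = fusion_penalty([3.0, 1.0, 1.0], lam=1.0, gamma=2.0)
```

Levels with equal coefficients contribute nothing, while gaps beyond `gamma * lam` all cost the same constant. This is why the penalty encourages exact fusion without shrinking well-separated levels toward each other.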

TraLFM: Latent Factor Modeling of Traffic Trajectory Data

Title TraLFM: Latent Factor Modeling of Traffic Trajectory Data
Authors Meng Chen, Xiaohui Yu, Yang Liu
Abstract The widespread use of positioning devices (e.g., GPS) has given rise to a vast body of human movement data, often in the form of trajectories. Understanding human mobility patterns could benefit many location-based applications. In this paper, we propose a novel generative model called TraLFM via latent factor modeling to mine human mobility patterns underlying traffic trajectories. TraLFM is based on three key observations: (1) human mobility patterns are reflected by the sequences of locations in the trajectories; (2) human mobility patterns vary with people; and (3) human mobility patterns tend to be cyclical and change over time. Thus, TraLFM models the joint action of sequential, personal and temporal factors in a unified way, and brings a new perspective to many applications such as latent factor analysis and next location prediction. We perform thorough empirical studies on two real datasets, and the experimental results confirm that TraLFM outperforms the state-of-the-art methods significantly in these applications.
Tasks
Published 2020-03-16
URL https://arxiv.org/abs/2003.07780v1
PDF https://arxiv.org/pdf/2003.07780v1.pdf
PWC https://paperswithcode.com/paper/tralfm-latent-factor-modeling-of-traffic
Repo
Framework

Learning Depth via Interaction

Title Learning Depth via Interaction
Authors Antonio Loquercio, Alexey Dosovitskiy, Davide Scaramuzza
Abstract Motivated by the astonishing capabilities of natural intelligent agents and inspired by theories from psychology, this paper explores the idea that perception gets coupled to 3D properties of the world via interaction with the environment. Existing works for depth estimation require either massive amounts of annotated training data or some form of hard-coded geometrical constraint. This paper explores a new approach to learning depth perception requiring neither of those. Specifically, we train a specialized global-local network architecture with what would be available to a robot interacting with the environment: from extremely sparse depth measurements down to even a single pixel per image. From a pair of consecutive images, our proposed network outputs a latent representation of the observer’s motion between the images and a dense depth map. Experiments on several datasets show that, when ground truth is available even for just one of the image pixels, the proposed network can learn monocular dense depth estimation up to 22.5% more accurately than state-of-the-art approaches. We believe that this work, despite its scientific interest, lays the foundations to learn depth from extremely sparse supervision, which can be valuable to all robotic systems acting under severe bandwidth or sensing constraints.
Tasks Depth Estimation
Published 2020-03-02
URL https://arxiv.org/abs/2003.00752v1
PDF https://arxiv.org/pdf/2003.00752v1.pdf
PWC https://paperswithcode.com/paper/learning-depth-via-interaction
Repo
Framework

Hierarchical Kinematic Human Mesh Recovery

Title Hierarchical Kinematic Human Mesh Recovery
Authors Georgios Georgakis, Ren Li, Srikrishna Karanam, Terrence Chen, Jana Kosecka, Ziyan Wu
Abstract We consider the problem of estimating a parametric model of 3D human mesh from a single image. While there has been substantial recent progress in this area with direct regression of model parameters, these methods only implicitly exploit the human body kinematic structure, leading to sub-optimal use of the model prior. In this work, we address this gap by proposing a new technique for regression of the human parametric model that is explicitly informed by the known hierarchical structure, including joint interdependencies of the model. This results in a strong prior-informed design of the regressor architecture and an associated hierarchical optimization that is flexible enough to be used in conjunction with the current standard frameworks for 3D human mesh recovery. We demonstrate these aspects by means of extensive experiments on standard benchmark datasets, showing how our proposed new design outperforms several existing and popular methods, establishing new state-of-the-art results. With our explicit consideration of joint interdependencies, our proposed method is equipped to infer joints even under data corruptions, which we demonstrate with experiments under varying degrees of occlusion.
Tasks
Published 2020-03-09
URL https://arxiv.org/abs/2003.04232v1
PDF https://arxiv.org/pdf/2003.04232v1.pdf
PWC https://paperswithcode.com/paper/hierarchical-kinematic-human-mesh-recovery
Repo
Framework

T2FSNN: Deep Spiking Neural Networks with Time-to-first-spike Coding

Title T2FSNN: Deep Spiking Neural Networks with Time-to-first-spike Coding
Authors Seongsik Park, Seijoon Kim, Byunggook Na, Sungroh Yoon
Abstract Spiking neural networks (SNNs) have gained considerable interest due to their energy-efficient characteristics, yet the lack of a scalable training algorithm has restricted their applicability in practical machine learning problems. The deep neural network-to-SNN conversion approach has been widely studied to broaden the applicability of SNNs. Most previous studies, however, have not fully utilized spatio-temporal aspects of SNNs, which has led to inefficiency in terms of the number of spikes and inference latency. In this paper, we present T2FSNN, which introduces the concept of time-to-first-spike coding into deep SNNs using the kernel-based dynamic threshold and dendrite to overcome the aforementioned drawback. In addition, we propose gradient-based optimization and early firing methods to further increase the efficiency of the T2FSNN. According to our results, the proposed methods can reduce inference latency and number of spikes to 22% and less than 1%, compared to those of burst coding, which is the state-of-the-art result on the CIFAR-100.
Tasks
Published 2020-03-26
URL https://arxiv.org/abs/2003.11741v1
PDF https://arxiv.org/pdf/2003.11741v1.pdf
PWC https://paperswithcode.com/paper/t2fsnn-deep-spiking-neural-networks-with-time
Repo
Framework
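
Time-to-first-spike coding, named in the abstract above, carries information in when a neuron first fires and not in how often it fires. The toy sketch below illustrates the general idea with a linear latency mapping, where a stronger activation fires earlier and each neuron emits at most one spike. The linear mapping is an assumption made for illustration; the paper uses a kernel-based dynamic threshold, which is not reproduced here.

```python
def ttfs_encode(activations, window):
    """Map activations in [0, 1] to first-spike times in [0, window].

    Larger activations spike earlier; non-positive ones never spike (None).
    Linear latency is an illustrative assumption, not the paper's kernel.
    """
    times = []
    for a in activations:
        if a <= 0.0:
            times.append(None)
        else:
            times.append((1.0 - min(a, 1.0)) * window)
    return times


def ttfs_decode(times, window):
    """Invert ttfs_encode: earlier spikes decode to larger values."""
    return [0.0 if t is None else 1.0 - t / window for t in times]


# Hypothetical layer output encoded into a 10-step time window.
spike_times = ttfs_encode([1.0, 0.5, 0.0], window=10.0)
recovered = ttfs_decode(spike_times, window=10.0)
```

Since every neuron fires at most once per window, the spike count is bounded by the number of neurons. That bound is the intuition behind the spike savings the abstract reports relative to codes that emit many spikes per neuron.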

6DoF Object Pose Estimation via Differentiable Proxy Voting Loss

Title 6DoF Object Pose Estimation via Differentiable Proxy Voting Loss
Authors Xin Yu, Zheyu Zhuang, Piotr Koniusz, Hongdong Li
Abstract Estimating a 6DoF object pose from a single image is very challenging due to occlusions or textureless appearances. Vector-field based keypoint voting has demonstrated its effectiveness and superiority in tackling those issues. However, direct regression of vector-fields neglects that the distances between pixels and keypoints also affect the deviations of hypotheses dramatically. In other words, small errors in direction vectors may generate severely deviated hypotheses when pixels are far away from a keypoint. In this paper, we aim to reduce such errors by incorporating the distances between pixels and keypoints into our objective. To this end, we develop a simple yet effective differentiable proxy voting loss (DPVL) which mimics the hypothesis selection in the voting procedure. By exploiting our voting loss, we are able to train our network in an end-to-end manner. Experiments on widely used datasets, i.e. LINEMOD and Occlusion LINEMOD, show that our DPVL improves pose estimation performance significantly and speeds up the training convergence.
Tasks Pose Estimation
Published 2020-02-10
URL https://arxiv.org/abs/2002.03923v1
PDF https://arxiv.org/pdf/2002.03923v1.pdf
PWC https://paperswithcode.com/paper/6dof-object-pose-estimation-via
Repo
Framework
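
The abstract's central observation is geometric: for a fixed angular error in a pixel's predicted direction, the resulting vote misses the keypoint by an amount that grows with the pixel's distance from it. The sketch below computes the perpendicular distance from a keypoint to the line cast by a pixel, which makes that growth concrete. It is a sketch of the geometry only, with hypothetical coordinates, and not the authors' loss function.

```python
import math


def hypothesis_offset(pixel, direction, keypoint):
    """Perpendicular distance from `keypoint` to the line through `pixel`
    along `direction` (2D). The direction need not be unit length."""
    dx, dy = direction
    norm = math.hypot(dx, dy)
    rx, ry = keypoint[0] - pixel[0], keypoint[1] - pixel[1]
    # Magnitude of the 2D cross product, divided by the direction's length.
    return abs(rx * dy - ry * dx) / norm


# The same 1-degree direction error, seen from a near and a far pixel.
error = math.radians(1.0)
noisy_direction = (math.cos(error), math.sin(error))
keypoint = (0.0, 0.0)
near_miss = hypothesis_offset((-10.0, 0.0), noisy_direction, keypoint)
far_miss = hypothesis_offset((-100.0, 0.0), noisy_direction, keypoint)
```

For a pixel at distance d and angular error theta the offset equals d * sin(theta), so the far pixel's vote misses by ten times as much as the near pixel's for the same direction error. A loss on direction vectors alone would treat the two errors as equal.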

Matching Neuromorphic Events and Color Images via Adversarial Learning

Title Matching Neuromorphic Events and Color Images via Adversarial Learning
Authors Fang Xu, Shijie Lin, Wen Yang, Lei Yu, Dengxin Dai, Gui-song Xia
Abstract The event camera has appealing properties: high dynamic range, low latency, low power consumption and low memory usage, and thus complements conventional frame-based cameras. It only captures the dynamics of a scene and is able to capture almost “continuous” motion. However, unlike a frame-based camera, which reflects the whole appearance of a scene as it is, the event camera casts away the detailed characteristics of objects, such as texture and color. To take advantage of both modalities, the event camera and frame-based camera are combined for various machine vision tasks. Cross-modal matching between neuromorphic events and color images then plays a vital role. In this paper, we propose the Event-Based Image Retrieval (EBIR) problem to exploit the cross-modal matching task. Given an event stream depicting a particular object as query, the aim is to retrieve color images containing the same object. This problem is challenging because there exists a large modality gap between neuromorphic events and color images. We address the EBIR problem by proposing neuromorphic Events-Color image Feature Learning (ECFL). In particular, adversarial learning is employed to jointly model neuromorphic events and color images in a common embedding space. We also contribute the N-UKbench and EC180 datasets to the community to promote the development of the EBIR problem. Extensive experiments on our datasets show that the proposed method is superior in learning effective modality-invariant representations to link two different modalities.
Tasks Image Retrieval
Published 2020-03-02
URL https://arxiv.org/abs/2003.00636v1
PDF https://arxiv.org/pdf/2003.00636v1.pdf
PWC https://paperswithcode.com/paper/matching-neuromorphic-events-and-color-images
Repo
Framework

Data-Driven Prediction Model of Components Shift during Reflow Process in Surface Mount Technology

Title Data-Driven Prediction Model of Components Shift during Reflow Process in Surface Mount Technology
Authors Irandokht Parviziomran, Shun Cao, Krishnaswami Srihari, Daehan Won
Abstract In surface mount technology (SMT), components mounted on soldered pads are liable to move during the reflow process. This capability is known as self-alignment and is the result of the fluid dynamic behaviour of molten solder paste. This capability is critical in SMT because inaccurate self-alignment causes defects such as overhanging, tombstoning, etc., while on the other hand, it can enable components to be perfectly self-assembled on or near the desired position. The aim of this study is to develop a machine learning model that predicts component movement during reflow in the x and y directions as well as rotation. Our study is composed of two steps: (1) experimental data are studied to reveal the relationships between self-alignment and various factors including component geometry, pad geometry, etc. (2) advanced machine learning prediction models are applied to predict the distance and the direction of component shift using support vector regression (SVR), neural network (NN), and random forest regression (RFR). As a result, RFR can predict component shift with an average fitness of 99%, 99%, and 96% and with an average prediction error of 13.47 µm, 12.02 µm, and 1.52° for component shift in the x, y, and rotational directions, respectively. This enhancement provides the future capability of parameter optimization in the pick and placement machine to control the best placement location and minimize the intrinsic defects caused by the self-alignment.
Tasks
Published 2020-01-27
URL https://arxiv.org/abs/2001.09619v1
PDF https://arxiv.org/pdf/2001.09619v1.pdf
PWC https://paperswithcode.com/paper/data-driven-prediction-model-of-components
Repo
Framework

MOT20: A benchmark for multi object tracking in crowded scenes

Title MOT20: A benchmark for multi object tracking in crowded scenes
Authors Patrick Dendorfer, Hamid Rezatofighi, Anton Milan, Javen Shi, Daniel Cremers, Ian Reid, Stefan Roth, Konrad Schindler, Laura Leal-Taixé
Abstract Standardized benchmarks are crucial for the majority of computer vision applications. Although leaderboards and ranking tables should not be over-claimed, benchmarks often provide the most objective measure of performance and are therefore important guides for research. The benchmark for Multiple Object Tracking, MOTChallenge, was launched with the goal to establish a standardized evaluation of multiple object tracking methods. The challenge focuses on multiple people tracking, since pedestrians are well studied in the tracking community, and precise tracking and detection has high practical relevance. Since the first release, MOT15, MOT16, and MOT17 have tremendously contributed to the community by introducing a clean dataset and precise framework to benchmark multi-object trackers. In this paper, we present our MOT20 benchmark, consisting of 8 new sequences depicting very crowded challenging scenes. The benchmark was presented first at the 4th BMTT MOT Challenge Workshop at the Computer Vision and Pattern Recognition Conference (CVPR) 2019, and gives the chance to evaluate state-of-the-art methods for multiple object tracking when handling extremely crowded scenarios.
Tasks Multi-Object Tracking, Multiple Object Tracking, Multiple People Tracking, Object Tracking
Published 2020-03-19
URL https://arxiv.org/abs/2003.09003v1
PDF https://arxiv.org/pdf/2003.09003v1.pdf
PWC https://paperswithcode.com/paper/mot20-a-benchmark-for-multi-object-tracking
Repo
Framework