April 2, 2020

Paper Group ANR 347

Learning Distributional Programs for Relational Autocompletion

Title Learning Distributional Programs for Relational Autocompletion
Authors Kumar Nitesh, Kuzelka Ondrej, De Raedt Luc
Abstract Relational autocompletion is the problem of automatically filling out some missing fields in a relational database. We tackle this problem within the probabilistic logic programming framework of Distributional Clauses (DC), which supports both discrete and continuous probability distributions. Within this framework, we introduce Dreaml, an approach to learn both the structure and the parameters of DC programs from databases that may contain missing information. To realize this, Dreaml integrates statistical modeling and distributional clauses with rule learning. The distinguishing features of Dreaml are that it 1) tackles relational autocompletion, 2) learns distributional clauses extended with statistical models, 3) deals with both discrete and continuous distributions, 4) can exploit background knowledge, and 5) uses an expectation-maximization based algorithm to cope with missing data.
Tasks
Published 2020-01-23
URL https://arxiv.org/abs/2001.08603v1
PDF https://arxiv.org/pdf/2001.08603v1.pdf
PWC https://paperswithcode.com/paper/learning-distributional-programs-for
Repo
Framework

From Anchor Generation to Distribution Alignment: Learning a Discriminative Embedding Space for Zero-Shot Recognition

Title From Anchor Generation to Distribution Alignment: Learning a Discriminative Embedding Space for Zero-Shot Recognition
Authors Fuzhen Li, Zhenfeng Zhu, Xingxing Zhang, Jian Cheng, Yao Zhao
Abstract In zero-shot learning (ZSL), the samples to be classified are usually projected into side information templates such as attributes. However, the irregular distribution of templates confuses the classification results. To alleviate this issue, we propose a novel framework called Discriminative Anchor Generation and Distribution Alignment Model (DAGDA). Firstly, in order to rectify the distribution of original templates, a diffusion based graph convolutional network, which can explicitly model the interaction between class and side information, is proposed to produce discriminative anchors. Secondly, to further align the samples with the corresponding anchors in anchor space, which aims to refine the distribution in a fine-grained manner, we introduce a semantic relation regularization in anchor space. Following the way of inductive learning, our approach outperforms some existing state-of-the-art methods on several benchmark datasets in both the conventional and the generalized ZSL settings. Meanwhile, the ablation experiments strongly demonstrate the effectiveness of each component.
Tasks Zero-Shot Learning
Published 2020-02-10
URL https://arxiv.org/abs/2002.03554v1
PDF https://arxiv.org/pdf/2002.03554v1.pdf
PWC https://paperswithcode.com/paper/from-anchor-generation-to-distribution
Repo
Framework

Domain segmentation and adjustment for generalized zero-shot learning

Title Domain segmentation and adjustment for generalized zero-shot learning
Authors Xinsheng Wang, Shanmin Pang, Jihua Zhu
Abstract In generalized zero-shot learning, synthesizing unseen data with generative models has been the most popular method to address the imbalance of training data between seen and unseen classes. However, this method requires that the unseen semantic information is available during the training stage, and training generative models is not trivial. Given that the generator of these models can only be trained with seen classes, we argue that synthesizing unseen data may not be an ideal approach for addressing the domain shift caused by the imbalance of the training data. In this paper, we propose to realize generalized zero-shot recognition in different domains. Thus, unseen (seen) classes can avoid the effect of the seen (unseen) classes. In practice, we propose a threshold and probabilistic distribution joint method to segment the testing instances into seen, unseen and uncertain domains. Moreover, the uncertain domain is further adjusted to alleviate the domain shift. Extensive experiments on five benchmark datasets show that the proposed method exhibits competitive performance compared with methods based on generative models.
Tasks Zero-Shot Learning
Published 2020-02-01
URL https://arxiv.org/abs/2002.00226v1
PDF https://arxiv.org/pdf/2002.00226v1.pdf
PWC https://paperswithcode.com/paper/domain-segmentation-and-adjustment-for
Repo
Framework
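
The three-way split of test instances described in the abstract can be illustrated with a minimal sketch. This is not the paper's joint threshold and probabilistic distribution method; it assumes a hypothetical per-instance score in [0, 1] for "belongs to a seen class" and two hypothetical cut-offs, `low` and `high`.

```python
from typing import List


def segment_domains(seen_scores: List[float],
                    low: float = 0.3,
                    high: float = 0.7) -> List[str]:
    """Assign each test instance to the 'seen', 'unseen' or 'uncertain' domain.

    seen_scores[i] is a hypothetical confidence in [0, 1] that instance i
    belongs to one of the seen classes. The cut-offs low and high are
    illustrative assumptions, not values from the paper.
    """
    if not 0.0 <= low <= high <= 1.0:
        raise ValueError("expected 0 <= low <= high <= 1")
    labels = []
    for score in seen_scores:
        if score >= high:
            labels.append("seen")        # confidently a seen class
        elif score <= low:
            labels.append("unseen")      # confidently not a seen class
        else:
            labels.append("uncertain")   # left for a later adjustment step
    return labels
```

Instances in the uncertain band are the ones the abstract says are further adjusted; that adjustment step is not sketched here.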

Zero-Shot Activity Recognition with Videos

Title Zero-Shot Activity Recognition with Videos
Authors Evin Pinar Ornek
Abstract In this paper, we examine the zero-shot activity recognition task using videos. We introduce an auto-encoder based model to construct a multimodal joint embedding space between the visual and textual manifolds. On the visual side, we use activity videos and a state-of-the-art 3D convolutional action recognition network to extract the features. On the textual side, we work with GloVe word embeddings. The zero-shot recognition results are evaluated by top-n accuracy. Then, the manifold learning ability is measured by mean Nearest Neighbor Overlap. In the end, we provide an extensive discussion of the results and future directions.
Tasks Activity Recognition, Word Embeddings, Zero-Shot Learning
Published 2020-01-22
URL https://arxiv.org/abs/2002.02265v1
PDF https://arxiv.org/pdf/2002.02265v1.pdf
PWC https://paperswithcode.com/paper/zero-shot-activity-recognition-with-videos
Repo
Framework
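
The top-n accuracy mentioned in the abstract can be sketched as follows. The sketch assumes that each video has already been mapped to a vector in the joint embedding space and that classes are ranked by cosine similarity to their word embeddings; the ranking rule and all names here are illustrative assumptions, not details taken from the paper.

```python
import math
from typing import Dict, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def top_n_accuracy(video_embeddings: Sequence[Sequence[float]],
                   true_labels: Sequence[str],
                   class_embeddings: Dict[str, Sequence[float]],
                   n: int = 1) -> float:
    """Fraction of videos whose true class is among the n most similar classes."""
    if not true_labels:
        raise ValueError("need at least one labelled video")
    hits = 0
    for emb, label in zip(video_embeddings, true_labels):
        # Rank class names from most to least similar to this video.
        ranked = sorted(class_embeddings,
                        key=lambda c: cosine(emb, class_embeddings[c]),
                        reverse=True)
        if label in ranked[:n]:
            hits += 1
    return hits / len(true_labels)
```

Because the class embeddings come from text alone, the same function applies to classes that have no training videos, which is what makes the evaluation zero-shot.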

Stacked Boosters Network Architecture for Short Term Load Forecasting in Buildings

Title Stacked Boosters Network Architecture for Short Term Load Forecasting in Buildings
Authors Tuukka Salmi, Jussi Kiljander, Daniel Pakkala
Abstract This paper presents a novel deep learning architecture for short term load forecasting of building energy loads. The architecture is based on a simple base learner and multiple boosting systems that are modelled as a single deep neural network. The architecture transforms the original multivariate time series into multiple cascading univariate time series. Together with sparse interactions, parameter sharing and equivariant representations, this approach makes it possible to combat overfitting while still achieving good representation power with a deep network architecture. The architecture is evaluated in several short-term load forecasting tasks with energy data from an office building in Finland. The proposed architecture outperforms a state-of-the-art load forecasting model in all the tasks.
Tasks Load Forecasting, Time Series
Published 2020-01-23
URL https://arxiv.org/abs/2001.08406v1
PDF https://arxiv.org/pdf/2001.08406v1.pdf
PWC https://paperswithcode.com/paper/stacked-boosters-network-architecture-for
Repo
Framework

Temporal Segmentation of Surgical Sub-tasks through Deep Learning with Multiple Data Sources

Title Temporal Segmentation of Surgical Sub-tasks through Deep Learning with Multiple Data Sources
Authors Yidan Qin, Sahba Aghajani Pedram, Seyedshams Feyzabadi, Max Allan, A. Jonathan McLeod, Joel W. Burdick, Mahdi Azizian
Abstract Many tasks in robot-assisted surgeries (RAS) can be represented by finite-state machines (FSMs), where each state represents either an action (such as picking up a needle) or an observation (such as bleeding). A crucial step towards the automation of such surgical tasks is the temporal perception of the current surgical scene, which requires a real-time estimation of the states in the FSMs. The objective of this work is to estimate the current state of the surgical task based on the actions performed or events that have occurred as the task progresses. We propose Fusion-KVE, a unified surgical state estimation model that incorporates multiple data sources including the Kinematics, Vision, and system Events. Additionally, we examine the strengths and weaknesses of different state estimation models in segmenting states with different representative features or levels of granularity. We evaluate our model on the JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS), as well as a more complex dataset involving robotic intra-operative ultrasound (RIOUS) imaging, created using the da Vinci Xi surgical system. Our model achieves a superior frame-wise state estimation accuracy of up to 89.4%, which improves on the state-of-the-art surgical state estimation models on both the JIGSAWS suturing dataset and our RIOUS dataset.
Tasks
Published 2020-02-07
URL https://arxiv.org/abs/2002.02921v1
PDF https://arxiv.org/pdf/2002.02921v1.pdf
PWC https://paperswithcode.com/paper/temporal-segmentation-of-surgical-sub-tasks
Repo
Framework

Dynamic Systems Simulation and Control Using Consecutive Recurrent Neural Networks

Title Dynamic Systems Simulation and Control Using Consecutive Recurrent Neural Networks
Authors Srikanth Chandar, Harsha Sunder
Abstract In this paper, we introduce a novel architecture for connecting adaptive learning and neural networks into an arbitrary machine’s control system paradigm. Two consecutive Recurrent Neural Networks (RNNs) are used together to accurately model the dynamic characteristics of electromechanical systems that include controllers, actuators and motors. The age-old method of achieving control with the use of the Proportional, Integral and Derivative (PID) constants is well understood as a simplified method that does not capture the complexities of the inherent nonlinearities of complex control systems. In the context of controlling and simulating electromechanical systems, we propose an alternative to PID controllers, employing a sequence of two Recurrent Neural Networks. The first RNN emulates the behavior of the controller, and the second the actuator/motor. The second RNN, when used in isolation, potentially serves as an advantageous alternative to extant testing methods of electromechanical systems.
Tasks
Published 2020-02-14
URL https://arxiv.org/abs/2002.10228v2
PDF https://arxiv.org/pdf/2002.10228v2.pdf
PWC https://paperswithcode.com/paper/dynamic-systems-simulation-and-control-using
Repo
Framework
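
For context, the PID baseline that the abstract proposes to replace can be written in a few lines. The sketch below is a textbook discrete-time PID loop, not code from the paper; the first-order plant `x' = -x + u`, the gains and the step size are assumptions chosen only to make the example runnable.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete PID controller with constant gains kp, ki, kd."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, setpoint: float, measurement: float, dt: float) -> float:
        error = setpoint - measurement
        self.integral += error * dt                   # accumulated error (I term)
        derivative = (error - self.prev_error) / dt   # error slope (D term)
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate_first_order(pid: PID, setpoint: float, steps: int, dt: float) -> float:
    """Drive a hypothetical plant x' = -x + u with forward-Euler integration."""
    x = 0.0
    for _ in range(steps):
        u = pid.step(setpoint, x, dt)
        x += dt * (-x + u)
    return x
```

The controller output is a fixed linear combination of the error, its running integral and its slope, which is the simplification the abstract contrasts with a learned recurrent controller.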

Ensemble emotion recognizing with multiple modal physiological signals

Title Ensemble emotion recognizing with multiple modal physiological signals
Authors Jing Zhang, Yong Zhang, Suhua Zhan, Cheng Cheng
Abstract Physiological signals, which provide an objective representation of human affective states, are attracting increasing attention in the emotion recognition field. However, a single signal rarely gives a complete and accurate description of emotion. Models that fuse multiple physiological signals build a uniform classification model from the consistent and complementary information from different emotions to improve recognition performance. Existing fusion models usually choose one particular classification method for recognition, which ignores the different distributions of the multiple signals. To address these problems, we propose an emotion classification model based on multimodal physiological signals for different emotions. Features are extracted from EEG, EMG and EOG signals to characterize the emotional state on valence and arousal levels. For characterization, the signals are preprocessed by filtering into four bands (theta, beta, alpha, gamma), and three Hjorth parameters are computed as features. To improve classification performance, an ensemble classifier is built. Experiments are conducted on the benchmark DEAP dataset. For the two-class task, the best result is 94.42% on arousal and 94.02% on valence. For the four-class task, the highest average classification accuracy is 90.74, and it shows good stability. The influence of different peripheral physiological signals on the results is also analyzed in this paper.
Tasks EEG, Emotion Classification, Emotion Recognition
Published 2020-01-01
URL https://arxiv.org/abs/2001.00191v1
PDF https://arxiv.org/pdf/2001.00191v1.pdf
PWC https://paperswithcode.com/paper/ensemble-emotion-recognizing-with-multiple
Repo
Framework
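
The three Hjorth parameters used as features in the abstract are activity, mobility and complexity. A minimal sketch of their standard definitions follows, with derivatives approximated by first differences of a uniformly sampled signal; the band filtering and the ensemble classifier from the paper are not shown, and the function name is an assumption.

```python
import math
from statistics import pvariance
from typing import Sequence, Tuple


def hjorth_parameters(signal: Sequence[float]) -> Tuple[float, float, float]:
    """Return (activity, mobility, complexity) of a uniformly sampled signal.

    activity   = var(x)
    mobility   = sqrt(var(x') / var(x))
    complexity = mobility(x') / mobility(x)
    Derivatives are approximated by first differences.
    """
    if len(signal) < 3:
        raise ValueError("need at least three samples")
    d1 = [b - a for a, b in zip(signal, signal[1:])]   # first difference
    d2 = [b - a for a, b in zip(d1, d1[1:])]           # second difference
    var0, var1, var2 = pvariance(signal), pvariance(d1), pvariance(d2)
    if var0 == 0 or var1 == 0:
        raise ValueError("parameters are undefined for constant or linear signals")
    activity = var0
    mobility = math.sqrt(var1 / var0)
    complexity = math.sqrt(var2 / var1) / mobility
    return activity, mobility, complexity
```

For a pure sinusoid the complexity is close to 1, and it grows as the signal departs from a sine shape, which is why the three values work as compact per-band descriptors.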

Spatial-Scale Aligned Network for Fine-Grained Recognition

Title Spatial-Scale Aligned Network for Fine-Grained Recognition
Authors Lizhao Gao, Haihua Xu, Chong Sun, Junling Liu, Yu-Wing Tai
Abstract Existing approaches for fine-grained visual recognition focus on learning marginal region-based representations while neglecting the spatial and scale misalignments, leading to inferior performance. In this paper, we propose the spatial-scale aligned network (SSANET) and implicitly address misalignments during the recognition process. Specifically, SSANET consists of 1) a self-supervised proposal mining formula with Morphological Alignment Constraints; 2) a discriminative scale mining (DSM) module, which exploits the feature pyramid via a circulant matrix, and provides the Fourier solver for fast scale alignments; 3) an oriented pooling (OP) module, that performs the pooling operation in several pre-defined orientations. Each orientation defines one kind of spatial alignment, and the network automatically determines which alignment is optimal through learning. With the proposed two modules, our algorithm can automatically determine the accurate local proposal regions and generate more robust target representations that are invariant to various appearance variances. Extensive experiments verify that SSANET is competent at learning better spatial-scale invariant target representations, yielding superior performance on the fine-grained recognition task on several benchmarks.
Tasks Fine-Grained Visual Recognition
Published 2020-01-05
URL https://arxiv.org/abs/2001.01211v1
PDF https://arxiv.org/pdf/2001.01211v1.pdf
PWC https://paperswithcode.com/paper/spatial-scale-aligned-network-for-fine
Repo
Framework

2018 Robotic Scene Segmentation Challenge

Title 2018 Robotic Scene Segmentation Challenge
Authors Max Allan, Satoshi Kondo, Sebastian Bodenstedt, Stefan Leger, Rahim Kadkhodamohammadi, Imanol Luengo, Felix Fuentes, Evangello Flouty, Ahmed Mohammed, Marius Pedersen, Avinash Kori, Varghese Alex, Ganapathy Krishnamurthi, David Rauber, Robert Mendel, Christoph Palm, Sophia Bano, Guinther Saibro, Chi-Sheng Shih, Hsun-An Chiang, Juntang Zhuang, Junlin Yang, Vladimir Iglovikov, Anton Dobrenkii, Madhu Reddiboina, Anubhav Reddy, Xingtong Liu, Cong Gao, Mathias Unberath, Myeonghyeon Kim, Chanho Kim, Chaewon Kim, Hyejin Kim, Gyeongmin Lee, Ihsan Ullah, Miguel Luna, Sang Hyun Park, Mahdi Azizian, Danail Stoyanov, Lena Maier-Hein, Stefanie Speidel
Abstract In 2015 we began a sub-challenge at the EndoVis workshop at MICCAI in Munich using endoscope images of ex-vivo tissue with automatically generated annotations from robot forward kinematics and instrument CAD models. However, the limited background variation and simple motion rendered the dataset uninformative in learning about which techniques would be suitable for segmentation in real surgery. In 2017, at the same workshop in Quebec we introduced the robotic instrument segmentation dataset with 10 teams participating in the challenge to perform binary, articulating parts and type segmentation of da Vinci instruments. This challenge included realistic instrument motion and more complex porcine tissue as background and was widely addressed with modifications on U-Nets and other popular CNN architectures. In 2018 we added to the complexity by introducing a set of anatomical objects and medical devices to the segmented classes. To avoid over-complicating the challenge, we continued with porcine data which is dramatically simpler than human tissue due to the lack of fatty tissue occluding many organs.
Tasks Scene Segmentation
Published 2020-01-30
URL https://arxiv.org/abs/2001.11190v2
PDF https://arxiv.org/pdf/2001.11190v2.pdf
PWC https://paperswithcode.com/paper/2018-robotic-scene-segmentation-challenge
Repo
Framework

When Person Re-identification Meets Changing Clothes

Title When Person Re-identification Meets Changing Clothes
Authors Fangbin Wan, Yang Wu, Xuelin Qian, Yanwei Fu
Abstract Person re-identification (Reid) is now an active research topic for AI-based video surveillance applications such as specific person search, but the practical issue that the target person(s) may change clothes (clothes inconsistency problem) has long been overlooked. For the first time, this paper systematically studies this problem. We first overcome the difficulty of the lack of a suitable dataset, by collecting a small yet representative real dataset for testing whilst building a large realistic synthetic dataset for training and deeper studies. Facilitated by our new datasets, we are able to conduct various interesting new experiments for studying the influence of clothes inconsistency. We find that changing clothes makes Reid a much harder problem in the sense of bringing difficulties to learning effective representations and also challenges the generalization ability of previous Reid models to identify persons with unseen (new) clothes. Representative existing Reid models are adopted to show informative results on such a challenging setting, and we also provide some preliminary efforts on improving the robustness of existing models on handling the clothes inconsistency issue in the data. We believe that this study can be inspiring and helpful for encouraging more research in this direction. The dataset is available on the project website: https://wanfb.github.io/dataset.html
Tasks Person Re-Identification, Person Search
Published 2020-03-09
URL https://arxiv.org/abs/2003.04070v2
PDF https://arxiv.org/pdf/2003.04070v2.pdf
PWC https://paperswithcode.com/paper/when-person-re-identification-meets-changing
Repo
Framework

Table-Top Scene Analysis Using Knowledge-Supervised MCMC

Title Table-Top Scene Analysis Using Knowledge-Supervised MCMC
Authors Ziyuan Liu, Dong Chen, Kai M. Wurm, Georg von Wichert
Abstract In this paper, we propose a probabilistic method to generate abstract scene graphs for table-top scenes from 6D object pose estimates. We explicitly make use of task-specific context knowledge by encoding this knowledge as descriptive rules in Markov logic networks. Our approach to generate scene graphs is probabilistic: Uncertainty in the object poses is addressed by a probabilistic sensor model that is embedded in a data driven MCMC process. We apply Markov logic inference to reason about hidden objects and to detect false estimates of object poses. The effectiveness of our approach is demonstrated and evaluated in real world experiments.
Tasks
Published 2020-02-19
URL https://arxiv.org/abs/2002.08417v1
PDF https://arxiv.org/pdf/2002.08417v1.pdf
PWC https://paperswithcode.com/paper/table-top-scene-analysis-using-knowledge
Repo
Framework

Forecasting Foreign Exchange Rate: A Multivariate Comparative Analysis between Traditional Econometric, Contemporary Machine Learning & Deep Learning Techniques

Title Forecasting Foreign Exchange Rate: A Multivariate Comparative Analysis between Traditional Econometric, Contemporary Machine Learning & Deep Learning Techniques
Authors Manav Kaushik, A K Giri
Abstract In today's global economy, accuracy in predicting macro-economic parameters such as the foreign exchange rate, or at least estimating the trend correctly, is of key importance for any future investment. In recent times, the use of computational intelligence-based techniques for forecasting macroeconomic variables has been proven highly successful. This paper tries to come up with a multivariate time series approach to forecast the exchange rate (USD/INR) while comparing in parallel the performance of three multivariate prediction modelling techniques: Vector Auto Regression (a Traditional Econometric Technique), Support Vector Machine (a Contemporary Machine Learning Technique), and Recurrent Neural Networks (a Contemporary Deep Learning Technique). We have used monthly historical data for several macroeconomic variables from April 1994 to December 2018 for USA and India to predict USD-INR Foreign Exchange Rate. The results clearly depict that contemporary techniques of SVM and RNN (Long Short-Term Memory) outperform the widely used traditional method of Auto Regression. The RNN model with Long Short-Term Memory (LSTM) provides the maximum accuracy (97.83%) followed by SVM Model (97.17%) and VAR Model (96.31%). At last, we present a brief analysis of the correlation and interdependencies of the variables used for forecasting.
Tasks Time Series
Published 2020-02-19
URL https://arxiv.org/abs/2002.10247v1
PDF https://arxiv.org/pdf/2002.10247v1.pdf
PWC https://paperswithcode.com/paper/forecasting-foreign-exchange-rate-a
Repo
Framework

A Unified Convergence Analysis for Shuffling-Type Gradient Methods

Title A Unified Convergence Analysis for Shuffling-Type Gradient Methods
Authors Lam M. Nguyen, Quoc Tran-Dinh, Dzung T. Phan, Phuong Ha Nguyen, Marten van Dijk
Abstract In this paper, we provide a unified convergence analysis for a class of shuffling-type gradient methods for solving a well-known finite-sum minimization problem commonly used in machine learning. This algorithm covers various variants such as randomized reshuffling, single shuffling, and cyclic/incremental gradient schemes. We consider two different settings: strongly convex and non-convex problems. Our main contribution consists of new non-asymptotic and asymptotic convergence rates for a general class of shuffling-type gradient methods to solve both non-convex and strongly convex problems. While our rate in the non-convex problem is new (i.e. not known yet under standard assumptions), the rate on the strongly convex case matches (up to a constant) the best-known results. However, unlike existing works in this direction, we only use standard assumptions such as smoothness and strong convexity. Finally, we empirically illustrate the effect of learning rates via a non-convex logistic regression and neural network examples.
Tasks
Published 2020-02-19
URL https://arxiv.org/abs/2002.08246v1
PDF https://arxiv.org/pdf/2002.08246v1.pdf
PWC https://paperswithcode.com/paper/a-unified-convergence-analysis-for-shuffling
Repo
Framework
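
The three variants named in the abstract (randomized reshuffling, single shuffling, and cyclic/incremental gradient) differ only in how the order of the component functions is chosen in each epoch. The sketch below illustrates that difference on a scalar parameter; the `lr / (epoch + 1)` step-size schedule and all names are assumptions for illustration, not the schedule analysed in the paper.

```python
import random
from typing import Callable, Sequence


def shuffling_gradient(grads: Sequence[Callable[[float], float]],
                       w0: float,
                       epochs: int,
                       lr: float,
                       scheme: str = "reshuffle",
                       seed: int = 0) -> float:
    """Minimise (1/n) * sum_i f_i(w) given the component gradients grads[i].

    scheme = "reshuffle":   draw a new random permutation every epoch
    scheme = "single":      draw one random permutation and reuse it
    scheme = "incremental": always use the natural order 0, 1, ..., n-1
    """
    if scheme not in ("reshuffle", "single", "incremental"):
        raise ValueError(f"unknown scheme: {scheme}")
    rng = random.Random(seed)
    order = list(range(len(grads)))
    if scheme == "single":
        rng.shuffle(order)              # fixed once, before the first epoch
    w = w0
    for epoch in range(epochs):
        if scheme == "reshuffle":
            rng.shuffle(order)          # fresh permutation for this epoch
        step = lr / (epoch + 1)         # assumed diminishing step size
        for i in order:                 # one full pass over all components
            w -= step * grads[i](w)
    return w
```

As a usage example, with `f_i(w) = 0.5 * (w - a_i) ** 2` the gradients are `w - a_i` and the minimiser is the mean of the `a_i`; all three schemes approach it, differing only in the order-dependent error within each epoch.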

Bone Structures Extraction and Enhancement in Chest Radiographs via CNN Trained on Synthetic Data

Title Bone Structures Extraction and Enhancement in Chest Radiographs via CNN Trained on Synthetic Data
Authors Ophir Gozes, Hayit Greenspan
Abstract In this paper, we present a deep learning-based image processing technique for extraction of bone structures in chest radiographs using a U-Net FCNN. The U-Net was trained to accomplish the task in a fully supervised setting. To create the training image pairs, we employed simulated X-ray or Digitally Reconstructed Radiographs (DRR), derived from 664 CT scans belonging to the LIDC-IDRI dataset. Using HU based segmentation of bone structures in the CT domain, a synthetic 2D “Bone X-ray” DRR is produced and used for training the network. For the reconstruction loss, we utilize two loss functions: L1 loss and perceptual loss. Once the bone structures are extracted, the original image can be enhanced by fusing the original input X-ray and the synthesized “Bone X-ray”. We show that our enhancement technique is applicable to real X-ray data, and display our results on the NIH Chest X-Ray-14 dataset.
Tasks
Published 2020-03-20
URL https://arxiv.org/abs/2003.10839v1
PDF https://arxiv.org/pdf/2003.10839v1.pdf
PWC https://paperswithcode.com/paper/bone-structures-extraction-and-enhancement-in
Repo
Framework