Paper Group ANR 512
Relational State-Space Model for Stochastic Multi-Object Systems
Title | Relational State-Space Model for Stochastic Multi-Object Systems |
Authors | Fan Yang, Ling Chen, Fan Zhou, Yusong Gao, Wei Cao |
Abstract | Real-world dynamical systems often consist of multiple stochastic subsystems that interact with each other. Modeling and forecasting the behavior of such dynamics are generally not easy, due to the inherent hardness in understanding the complicated interactions and evolutions of their constituents. This paper introduces the relational state-space model (R-SSM), a sequential hierarchical latent variable model that makes use of graph neural networks (GNNs) to simulate the joint state transitions of multiple correlated objects. By letting GNNs cooperate with SSM, R-SSM provides a flexible way to incorporate relational information into the modeling of multi-object dynamics. We further suggest augmenting the model with normalizing flows instantiated for vertex-indexed random variables and propose two auxiliary contrastive objectives to facilitate the learning. The utility of R-SSM is empirically evaluated on synthetic and real time-series datasets. |
Tasks | Time Series |
Published | 2020-01-13 |
URL | https://arxiv.org/abs/2001.04050v1 |
https://arxiv.org/pdf/2001.04050v1.pdf | |
PWC | https://paperswithcode.com/paper/relational-state-space-model-for-stochastic-1 |
Repo | |
Framework | |
Knowledge Representations in Technical Systems – A Taxonomy
Title | Knowledge Representations in Technical Systems – A Taxonomy |
Authors | Kristina Scharei, Florian Heidecker, Maarten Bieshaar |
Abstract | The recent usage of technical systems in human-centric environments raises the question of how to teach technical systems, e.g., robots, to understand, learn, and perform tasks desired by humans. An accurate representation of knowledge is therefore essential for the system to work as expected. This article mainly gives insight into different knowledge representation techniques and their categorization into various problem domains in artificial intelligence. Additionally, applications of the presented knowledge representations to everyday robotics tasks are introduced. The provided taxonomy is intended to facilitate the search for a suitable knowledge representation technique for a specific problem. |
Tasks | |
Published | 2020-01-14 |
URL | https://arxiv.org/abs/2001.04835v2 |
https://arxiv.org/pdf/2001.04835v2.pdf | |
PWC | https://paperswithcode.com/paper/knowledge-representations-in-technical |
Repo | |
Framework | |
Understanding Crowd Flow Movements Using Active-Langevin Model
Title | Understanding Crowd Flow Movements Using Active-Langevin Model |
Authors | Shreetam Behera, Debi Prosad Dogra, Malay Kumar Bandyopadhyay, Partha Pratim Roy |
Abstract | Crowd flow describes the elementary group behavior of crowds. Understanding the dynamics behind these movements can help to identify various abnormalities in crowds. However, developing a crowd model describing these flows is a challenging task. In this paper, a physics-based model is proposed to describe the movements in dense crowds. The crowd model is based on the active Langevin equation, where the motion points are assumed to be similar to active colloidal particles in fluids. The model is further augmented with computer-vision techniques to segment both linear and non-linear motion flows in a dense crowd. The active Langevin equation-based crowd segmentation has been evaluated on publicly available crowd videos and on our own videos. The proposed method is able to segment the flow with lower optical flow error and better accuracy in comparison to existing state-of-the-art methods. |
Tasks | Optical Flow Estimation |
Published | 2020-03-12 |
URL | https://arxiv.org/abs/2003.05626v2 |
https://arxiv.org/pdf/2003.05626v2.pdf | |
PWC | https://paperswithcode.com/paper/understanding-crowd-flow-movements-using |
Repo | |
Framework | |
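To make the idea of an active Langevin model more concrete, the sketch below simulates a single overdamped active Brownian particle in 2D with the Euler-Maruyama scheme. This is a generic textbook form and an assumption on our part, not the equation used in the paper; the function name `simulate_active_particle` and the parameters `v0` (self-propulsion speed), `d_t` (translational diffusion), and `d_r` (rotational diffusion) are hypothetical.

```python
import numpy as np

def simulate_active_particle(n_steps, dt=0.01, v0=1.0, d_t=0.1, d_r=0.5, seed=0):
    """Euler-Maruyama simulation of a generic 2D active Brownian particle.

    Assumed model (not taken from the paper):
        dx     = v0 * (cos(theta), sin(theta)) dt + sqrt(2 * d_t) dW
        dtheta = sqrt(2 * d_r) dW_theta
    Returns an array of shape (n_steps + 1, 2) with the particle positions.
    """
    rng = np.random.default_rng(seed)
    pos = np.zeros((n_steps + 1, 2))
    theta = 0.0  # initial heading along the x-axis
    for t in range(n_steps):
        heading = np.array([np.cos(theta), np.sin(theta)])
        # Deterministic self-propulsion plus translational noise.
        pos[t + 1] = pos[t] + v0 * heading * dt + np.sqrt(2 * d_t * dt) * rng.normal(size=2)
        # The heading performs a random walk (rotational diffusion).
        theta += np.sqrt(2 * d_r * dt) * rng.normal()
    return pos
```

With both diffusion constants set to zero the particle moves in a straight line at speed `v0`, which is a convenient sanity check before adding noise.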
Affinity guided Geometric Semi-Supervised Metric Learning
Title | Affinity guided Geometric Semi-Supervised Metric Learning |
Authors | Ujjal Kr Dutta, Mehrtash Harandi, Chellu Chandra Sekhar |
Abstract | In this paper, we address the semi-supervised metric learning problem, where we learn a distance metric using very few labeled examples, and additionally available unlabeled data. To address the limitations of existing semi-supervised approaches, we integrate some of the best practices across metric learning, to achieve the state-of-the-art in the semi-supervised setting. In particular, we make use of a graph-based approach to propagate the affinities or similarities among the limited labeled pairs to the unlabeled data. Considering the neighborhood of an example, we take into account the propagated affinities to mine triplet constraints. An angular loss is imposed on these triplets to learn a metric. Additionally, we impose orthogonality on the parameters of the learned embedding to avoid a model collapse. In contrast to existing approaches, we propose a stochastic approach that scales well to large-scale datasets. We outperform various semi-supervised metric learning approaches on a number of benchmark datasets. |
Tasks | Metric Learning |
Published | 2020-02-27 |
URL | https://arxiv.org/abs/2002.12394v1 |
https://arxiv.org/pdf/2002.12394v1.pdf | |
PWC | https://paperswithcode.com/paper/affinity-guided-geometric-semi-supervised |
Repo | |
Framework | |
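The abstract above mentions an angular loss imposed on mined triplets. As a rough illustration, the sketch below implements the commonly cited angular loss for a triplet (anchor, positive, negative), max(0, ||a - p||^2 - 4 tan^2(alpha) ||n - c||^2) with c = (a + p) / 2. Whether the paper uses exactly this form is an assumption; the function name `angular_loss` and the default angle are hypothetical.

```python
import numpy as np

def angular_loss(anchor, positive, negative, alpha_deg=45.0):
    """Per-triplet angular loss for batches of embeddings of shape (n, d).

    Assumed form (not confirmed by the abstract):
        max(0, ||a - p||^2 - 4 * tan(alpha)^2 * ||n - c||^2),  c = (a + p) / 2
    """
    center = (anchor + positive) / 2.0
    tan_sq = np.tan(np.deg2rad(alpha_deg)) ** 2
    ap_sq = np.sum((anchor - positive) ** 2, axis=-1)  # anchor-positive distance
    nc_sq = np.sum((negative - center) ** 2, axis=-1)  # negative-to-center distance
    return np.maximum(0.0, ap_sq - 4.0 * tan_sq * nc_sq)
```

The loss is zero when the negative lies far from the anchor-positive midpoint and grows as the negative moves toward it.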
Advances in Collaborative Filtering and Ranking
Title | Advances in Collaborative Filtering and Ranking |
Authors | Liwei Wu |
Abstract | In this dissertation, we cover some recent advances in collaborative filtering and ranking. Chapter 1 gives a brief introduction to the history and the current landscape of collaborative filtering and ranking; chapter 2 discusses the pointwise collaborative filtering problem with graph information and how our proposed method can encode very deep graph information, which helps four existing graph collaborative filtering algorithms; chapter 3 covers the pairwise approach for collaborative ranking and how we speed up the algorithm to near-linear time complexity; chapter 4 covers the new listwise approach for collaborative ranking and why it is a better choice of loss for both explicit and implicit feedback than pointwise and pairwise losses; chapter 5 presents the new regularization technique Stochastic Shared Embeddings (SSE) that we proposed for embedding layers and shows that it is both theoretically sound and empirically effective for 6 different tasks across recommendation and natural language processing; chapter 6 describes how we introduce personalization for the state-of-the-art sequential recommendation model with the help of SSE, which plays an important role in preventing our personalized model from overfitting to the training data; chapter 7 summarizes what we have achieved so far and predicts possible future directions; chapter 8 is the appendix to all the chapters. |
Tasks | Collaborative Ranking |
Published | 2020-02-27 |
URL | https://arxiv.org/abs/2002.12312v1 |
https://arxiv.org/pdf/2002.12312v1.pdf | |
PWC | https://paperswithcode.com/paper/advances-in-collaborative-filtering-and |
Repo | |
Framework | |
Recognizing Handwritten Mathematical Expressions as LaTex Sequences Using a Multiscale Robust Neural Network
Title | Recognizing Handwritten Mathematical Expressions as LaTex Sequences Using a Multiscale Robust Neural Network |
Authors | Hongyu Wang, Guangcun Shan |
Abstract | In this paper, a robust multiscale neural network is proposed to recognize handwritten mathematical expressions and output LaTeX sequences. The network can effectively and correctly focus on the region relevant to each output step, which has a positive effect on analyzing the two-dimensional structure of handwritten mathematical expressions and identifying different mathematical symbols in a long expression. With the addition of visualization, the model’s recognition process is shown in detail. In addition, our model achieved 49.459% and 46.062% ExpRate on the public CROHME 2014 and CROHME 2016 datasets. The present model results suggest that the state-of-the-art model has better robustness, fewer errors, and higher accuracy. |
Tasks | |
Published | 2020-02-26 |
URL | https://arxiv.org/abs/2003.00817v1 |
https://arxiv.org/pdf/2003.00817v1.pdf | |
PWC | https://paperswithcode.com/paper/recognizing-handwritten-mathematical |
Repo | |
Framework | |
How Far are We from Effective Context Modeling? An Exploratory Study on Semantic Parsing in Context
Title | How Far are We from Effective Context Modeling? An Exploratory Study on Semantic Parsing in Context |
Authors | Qian Liu, Bei Chen, Jiaqi Guo, Jian-Guang Lou, Bin Zhou, Dongmei Zhang |
Abstract | Recently, semantic parsing in context has received considerable attention; it is challenging because of complex contextual phenomena. Previous works verified their proposed methods in limited scenarios, which motivates us to conduct an exploratory study on context modeling methods under real-world semantic parsing in context. We present a grammar-based decoding semantic parser and adapt typical context modeling methods on top of it. We evaluate 13 context modeling methods on two large complex cross-domain datasets, and our best model achieves state-of-the-art performance on both datasets with significant improvements. Furthermore, we summarize the most frequent contextual phenomena, with a fine-grained analysis of representative models, which may shed light on potential research directions. |
Tasks | Semantic Parsing |
Published | 2020-02-03 |
URL | https://arxiv.org/abs/2002.00652v1 |
https://arxiv.org/pdf/2002.00652v1.pdf | |
PWC | https://paperswithcode.com/paper/how-far-are-we-from-effective-context |
Repo | |
Framework | |
Cost-effective search for lower-error region in material parameter space using multifidelity Gaussian process modeling
Title | Cost-effective search for lower-error region in material parameter space using multifidelity Gaussian process modeling |
Authors | Shion Takeno, Yuhki Tsukada, Hitoshi Fukuoka, Toshiyuki Koyama, Motoki Shiga, Masayuki Karasuyama |
Abstract | Information regarding precipitate shapes is critical for estimating material parameters. Hence, we considered estimating a region of material parameter space in which a computational model produces precipitates having shapes similar to those observed in the experimental images. This region, called the lower-error region (LER), reflects intrinsic information of the material contained in the precipitate shapes. However, the computational cost of LER estimation can be high because the accurate computation of the model is required many times to better explore parameters. To overcome this difficulty, we used a Gaussian-process-based multifidelity modeling, in which training data can be sampled from multiple computations with different accuracy levels (fidelity). Lower-fidelity samples may have lower accuracy, but the computational cost is lower than that for higher-fidelity samples. Our proposed sampling procedure iteratively determines the most cost-effective pair of a point and a fidelity level for enhancing the accuracy of LER estimation. We demonstrated the efficiency of our method through estimation of the interface energy and lattice mismatch between MgZn2 and α-Mg phases in an Mg-based alloy. The results showed that the sampling cost required to obtain accurate LER estimation could be drastically reduced. |
Tasks | |
Published | 2020-03-15 |
URL | https://arxiv.org/abs/2003.13428v1 |
https://arxiv.org/pdf/2003.13428v1.pdf | |
PWC | https://paperswithcode.com/paper/cost-effective-search-for-lower-error-region |
Repo | |
Framework | |
Don’t Parse, Generate! A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing
Title | Don’t Parse, Generate! A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing |
Authors | Subendhu Rongali, Luca Soldaini, Emilio Monti, Wael Hamza |
Abstract | Virtual assistants such as Amazon Alexa, Apple Siri, and Google Assistant often rely on a semantic parsing component to understand which action(s) to execute for an utterance spoken by their users. Traditionally, rule-based or statistical slot-filling systems have been used to parse “simple” queries; that is, queries that contain a single action and can be decomposed into a set of non-overlapping entities. More recently, shift-reduce parsers have been proposed to process more complex utterances. These methods, while powerful, impose specific limitations on the type of queries that can be parsed; namely, they require a query to be representable as a parse tree. In this work, we propose a unified architecture based on Sequence to Sequence models and Pointer Generator Network to handle both simple and complex queries. Unlike other works, our approach does not impose any restriction on the semantic parse schema. Furthermore, experiments show that it achieves state-of-the-art performance on three publicly available datasets (ATIS, SNIPS, Facebook TOP), with relative improvements between 3.3% and 7.7% in exact match accuracy over previous systems. Finally, we show the effectiveness of our approach on two internal datasets. |
Tasks | Semantic Parsing, Slot Filling |
Published | 2020-01-30 |
URL | https://arxiv.org/abs/2001.11458v1 |
https://arxiv.org/pdf/2001.11458v1.pdf | |
PWC | https://paperswithcode.com/paper/dont-parse-generate-a-sequence-to-sequence |
Repo | |
Framework | |
Data Science in Economics
Title | Data Science in Economics |
Authors | Saeed Nosratabadi, Amir Mosavi, Puhong Duan, Pedram Ghamisi |
Abstract | This paper presents the state of the art of data science in economics. Advances in data science are investigated through a novel taxonomy of applications and methods, in three individual classes: deep learning models, ensemble models, and hybrid models. Application domains include stock market, marketing, E-commerce, corporate banking, and cryptocurrency. The PRISMA method, a systematic literature review methodology, is used to ensure the quality of the survey. The findings reveal that the trend is toward the advancement of hybrid models, as more than 51% of the reviewed articles applied a hybrid model. It is also found that, based on the RMSE accuracy metric, hybrid models had higher prediction accuracy than other algorithms. Nevertheless, the trends are expected to move toward the advancement of deep learning models. |
Tasks | |
Published | 2020-03-19 |
URL | https://arxiv.org/abs/2003.13422v1 |
https://arxiv.org/pdf/2003.13422v1.pdf | |
PWC | https://paperswithcode.com/paper/data-science-in-economics |
Repo | |
Framework | |
Can Giraffes Become Birds? An Evaluation of Image-to-image Translation for Data Generation
Title | Can Giraffes Become Birds? An Evaluation of Image-to-image Translation for Data Generation |
Authors | Daniel V. Ruiz, Gabriel Salomon, Eduardo Todt |
Abstract | There is an increasing interest in image-to-image translation with applications ranging from generating maps from satellite images to creating entire clothes’ images from only contours. In the present work, we investigate image-to-image translation using Generative Adversarial Networks (GANs) for generating new data, taking as a case study the morphing of giraffe images into bird images. Morphing a giraffe into a bird is a challenging task, as they have different scales, textures, and morphology. An unsupervised cross-domain translator entitled InstaGAN was trained on giraffes and birds, along with their respective masks, to learn translation between both domains. A dataset of synthetic bird images was generated by translating original giraffe images while preserving the original spatial arrangement and background. It is important to stress that the generated birds do not exist, being only the result of a latent representation learned by InstaGAN. Two subsets of common literature datasets were used for training the GAN and generating the translated images: COCO and Caltech-UCSD Birds 200-2011. To evaluate the realness and quality of the generated images and masks, qualitative and quantitative analyses were made. For the quantitative analysis, a pre-trained Mask R-CNN was used for the detection and segmentation of birds on Pascal VOC, Caltech-UCSD Birds 200-2011, and our new dataset entitled FakeSet. The generated dataset achieved detection and segmentation results close to the real datasets, suggesting that the generated images are realistic enough to be detected and segmented by a state-of-the-art deep neural network. |
Tasks | Image-to-Image Translation |
Published | 2020-01-10 |
URL | https://arxiv.org/abs/2001.03637v1 |
https://arxiv.org/pdf/2001.03637v1.pdf | |
PWC | https://paperswithcode.com/paper/can-giraffes-become-birds-an-evaluation-of |
Repo | |
Framework | |
Distinguishing Cell Phenotype Using Cell Epigenotype
Title | Distinguishing Cell Phenotype Using Cell Epigenotype |
Authors | Thomas P. Wytock, Adilson E. Motter |
Abstract | The relationship between microscopic observations and macroscopic behavior is a fundamental open question in biophysical systems. Here, we develop a unified approach that—in contrast with existing methods—predicts cell type from macromolecular data even when accounting for the scale of human tissue diversity and limitations in the available data. We achieve these benefits by applying a k-nearest-neighbors algorithm after projecting our data onto the eigenvectors of the correlation matrix inferred from many observations of gene expression or chromatin conformation. Our approach identifies variations in epigenotype that impact cell type, thereby supporting the cell type attractor hypothesis and representing the first step toward model-independent control strategies in biological systems. |
Tasks | |
Published | 2020-03-20 |
URL | https://arxiv.org/abs/2003.09432v1 |
https://arxiv.org/pdf/2003.09432v1.pdf | |
PWC | https://paperswithcode.com/paper/distinguishing-cell-phenotype-using-cell |
Repo | |
Framework | |
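The pipeline described in the abstract above (project the data onto eigenvectors of a correlation matrix, then classify with k-nearest neighbors) can be sketched in a few lines of NumPy. This is a minimal illustration under our own assumptions, not the authors' implementation: the function names `fit_projection`, `project`, and `knn_predict` are hypothetical, and the choice of Euclidean distance and majority voting is ours.

```python
import numpy as np

def fit_projection(X, n_components):
    """Estimate the feature correlation matrix of X (samples x features)
    and return (mean, std, top eigenvectors) for later projection."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0                      # guard against constant features
    Z = (X - mean) / std
    corr = (Z.T @ Z) / len(X)                # correlation matrix of the features
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, std, eigvecs[:, order]

def project(X, mean, std, components):
    """Standardize X and project it onto the selected eigenvectors."""
    return ((X - mean) / std) @ components

def knn_predict(train_Z, train_labels, query_Z, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    dists = np.linalg.norm(query_Z[:, None, :] - train_Z[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    preds = []
    for row in nearest:
        labels, counts = np.unique(train_labels[row], return_counts=True)
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)
```

In use, `fit_projection` is called once on the training measurements, and both training and query samples are passed through `project` before calling `knn_predict`, so that all points live in the same low-dimensional space.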
Minimum-Norm Adversarial Examples on KNN and KNN-Based Models
Title | Minimum-Norm Adversarial Examples on KNN and KNN-Based Models |
Authors | Chawin Sitawarin, David Wagner |
Abstract | We study the robustness against adversarial examples of kNN classifiers and classifiers that combine kNN with neural networks. The main difficulty lies in the fact that finding an optimal attack on kNN is intractable for typical datasets. In this work, we propose a gradient-based attack on kNN and kNN-based defenses, inspired by the previous work by Sitawarin & Wagner [1]. We demonstrate that our attack outperforms their method on all of the models we tested with only a minimal increase in the computation time. The attack also beats the state-of-the-art attack [2] on kNN when k > 1 using less than 1% of its running time. We hope that this attack can be used as a new baseline for evaluating the robustness of kNN and its variants. |
Tasks | |
Published | 2020-03-14 |
URL | https://arxiv.org/abs/2003.06559v1 |
https://arxiv.org/pdf/2003.06559v1.pdf | |
PWC | https://paperswithcode.com/paper/minimum-norm-adversarial-examples-on-knn-and |
Repo | |
Framework | |
Do As I Do: Transferring Human Motion and Appearance between Monocular Videos with Spatial and Temporal Constraints
Title | Do As I Do: Transferring Human Motion and Appearance between Monocular Videos with Spatial and Temporal Constraints |
Authors | Thiago L. Gomes, Renato Martins, João Ferreira, Erickson R. Nascimento |
Abstract | Creating plausible virtual actors from images of real actors remains one of the key challenges in computer vision and computer graphics. Marker-less human motion estimation and shape modeling from images in the wild bring this challenge to the fore. Despite recent advances in view synthesis and image-to-image translation, currently available formulations are limited to transferring style only and do not take into account the character’s motion and shape, which are by nature intermingled to produce plausible human forms. In this paper, we propose a unifying formulation for transferring appearance and retargeting human motion from monocular videos that considers all these aspects. Our method synthesizes new videos of people in a context different from the one in which they were initially recorded. Unlike recent appearance transfer methods, our approach takes into account body shape, appearance, and motion constraints. The evaluation is performed with several experiments using publicly available real videos containing hard conditions. Our method is able to transfer both human motion and appearance, outperforming state-of-the-art methods, while preserving specific features of the motion that must be maintained (e.g., feet touching the floor, hands touching a particular object) and achieving the best visual quality and appearance metrics such as Structural Similarity (SSIM) and Learned Perceptual Image Patch Similarity (LPIPS). |
Tasks | Image-to-Image Translation, Motion Estimation |
Published | 2020-01-08 |
URL | https://arxiv.org/abs/2001.02606v2 |
https://arxiv.org/pdf/2001.02606v2.pdf | |
PWC | https://paperswithcode.com/paper/do-as-i-do-transferring-human-motion-and |
Repo | |
Framework | |
Informative Sample Mining Network for Multi-Domain Image-to-Image Translation
Title | Informative Sample Mining Network for Multi-Domain Image-to-Image Translation |
Authors | Jie Cao, Huaibo Huang, Yi Li, Ran He, Zhenan Sun |
Abstract | The performance of multi-domain image-to-image translation has been significantly improved by recent progress in deep generative models. Existing approaches can use a unified model to achieve translations between all the visual domains. However, their outcomes are far from satisfying when there are large domain variations. In this paper, we reveal that improving the sample selection strategy is an effective solution. To select informative samples, we dynamically estimate sample importance during the training of Generative Adversarial Networks, presenting Informative Sample Mining Network. We theoretically analyze the relationship between the sample importance and the prediction of the global optimal discriminator. Then a practical importance estimation function based on general discriminators is derived. In addition, we propose a novel multi-stage sample training scheme to reduce sample hardness while preserving sample informativeness. Extensive experiments on a wide range of specific image-to-image translation tasks are conducted, and the results demonstrate our superiority over current state-of-the-art methods. |
Tasks | Image-to-Image Translation |
Published | 2020-01-05 |
URL | https://arxiv.org/abs/2001.01173v3 |
https://arxiv.org/pdf/2001.01173v3.pdf | |
PWC | https://paperswithcode.com/paper/informative-sample-mining-network-for-multi |
Repo | |
Framework | |