Paper Group ANR 78
Natural representation of composite data with replicated autoencoders
Title | Natural representation of composite data with replicated autoencoders |
Authors | Matteo Negri, Davide Bergamini, Carlo Baldassi, Riccardo Zecchina, Christoph Feinauer |
Abstract | Generative processes in biology and other fields often produce data that can be regarded as resulting from a composition of basic features. Here we present an unsupervised method based on autoencoders for inferring these basic features of data. The main novelty in our approach is that the training is based on the optimization of the 'local entropy' rather than the standard loss, resulting in a more robust inference, and enhancing the performance on this type of data considerably. Algorithmically, this is realized by training an interacting system of replicated autoencoders. We apply this method to synthetic and protein sequence data, and show that it is able to infer a hidden representation that correlates well with the underlying generative process, without requiring any prior knowledge. |
Tasks | |
Published | 2019-09-29 |
URL | https://arxiv.org/abs/1909.13327v1 |
https://arxiv.org/pdf/1909.13327v1.pdf | |
PWC | https://paperswithcode.com/paper/natural-representation-of-composite-data-with |
Repo | |
Framework | |
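The "replicated" training scheme in the abstract above can be illustrated with a minimal sketch: several copies of the same autoencoder are trained jointly, with a coupling term that pulls each replica's weights toward the replicas' center. This is a simplified stand-in for the local-entropy objective, not the authors' implementation; the architecture, the number of replicas and the coupling strength `gamma` are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Toy data: 256 samples, 50-dimensional.
X = torch.randn(256, 50)

def make_autoencoder():
    # Small dense autoencoder; the architecture is an illustrative assumption.
    return nn.Sequential(nn.Linear(50, 10), nn.ReLU(), nn.Linear(10, 50))

K, gamma = 5, 0.1                      # number of replicas and coupling strength (assumed)
replicas = [make_autoencoder() for _ in range(K)]
params = [p for ae in replicas for p in ae.parameters()]
opt = torch.optim.Adam(params, lr=1e-3)

for step in range(200):
    opt.zero_grad()
    # Standard reconstruction loss, summed over replicas.
    recon = sum(((ae(X) - X) ** 2).mean() for ae in replicas)
    # Coupling term: pull every replica toward the center of the replicas.
    coupling = 0.0
    for group in zip(*[ae.parameters() for ae in replicas]):
        center = torch.stack([w.detach() for w in group]).mean(dim=0)
        coupling = coupling + sum(((w - center) ** 2).sum() for w in group)
    (recon + gamma * coupling).backward()
    opt.step()
```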
A systematic comparison of methods for low-resource dependency parsing on genuinely low-resource languages
Title | A systematic comparison of methods for low-resource dependency parsing on genuinely low-resource languages |
Authors | Clara Vania, Yova Kementchedjhieva, Anders Søgaard, Adam Lopez |
Abstract | Parsers are available for only a handful of the world’s languages, since they require lots of training data. How far can we get with just a small amount of training data? We systematically compare a set of simple strategies for improving low-resource parsers: data augmentation, which has not been tested before; cross-lingual training; and transliteration. Experimenting on three typologically diverse low-resource languages—North Sámi, Galician, and Kazakh—we find that (1) when only the low-resource treebank is available, data augmentation is very helpful; (2) when a related high-resource treebank is available, cross-lingual training is helpful and complements data augmentation; and (3) when the high-resource treebank uses a different writing system, transliteration into a shared orthographic space is also very helpful. |
Tasks | Data Augmentation, Dependency Parsing, Transliteration |
Published | 2019-09-06 |
URL | https://arxiv.org/abs/1909.02857v1 |
https://arxiv.org/pdf/1909.02857v1.pdf | |
PWC | https://paperswithcode.com/paper/a-systematic-comparison-of-methods-for-low |
Repo | |
Framework | |
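As a toy illustration of the transliteration strategy mentioned in the abstract above, the snippet below maps a few Cyrillic Kazakh characters into a Latin orthography shared with a related high-resource treebank. The character table is a small illustrative subset, not the mapping used in the paper.

```python
# Illustrative Cyrillic-to-Latin mapping for a few Kazakh characters;
# the table and the target orthography are assumptions, not the paper's scheme.
CYR_TO_LAT = {
    "а": "a", "б": "b", "д": "d", "е": "e", "к": "k",
    "л": "l", "м": "m", "н": "n", "о": "o", "п": "p", "т": "t",
}

def transliterate(token: str) -> str:
    """Map each character through the table, keeping unknown characters as-is."""
    return "".join(CYR_TO_LAT.get(ch.lower(), ch) for ch in token)

print(transliterate("мектеп"))  # -> "mektep" ("school")
```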
End-to-End Multi-View Fusion for 3D Object Detection in LiDAR Point Clouds
Title | End-to-End Multi-View Fusion for 3D Object Detection in LiDAR Point Clouds |
Authors | Yin Zhou, Pei Sun, Yu Zhang, Dragomir Anguelov, Jiyang Gao, Tom Ouyang, James Guo, Jiquan Ngiam, Vijay Vasudevan |
Abstract | Recent work on 3D object detection advocates point cloud voxelization in bird's-eye view, where objects preserve their physical dimensions and are naturally separable. When represented in this view, however, point clouds are sparse and have highly variable point density, which may make it difficult for detectors to detect distant or small objects (pedestrians, traffic signs, etc.). On the other hand, the perspective view provides dense observations, which could allow more favorable feature encoding for such cases. In this paper, we aim to synergize the bird's-eye view and the perspective view and propose a novel end-to-end multi-view fusion (MVF) algorithm, which can effectively learn to utilize the complementary information from both. Specifically, we introduce dynamic voxelization, which has four merits compared to existing voxelization methods: i) removing the need to pre-allocate a tensor of fixed size; ii) overcoming the information loss due to stochastic point/voxel dropout; iii) yielding deterministic voxel embeddings and more stable detection outcomes; iv) establishing the bi-directional relationship between points and voxels, which potentially lays a natural foundation for cross-view feature fusion. By employing dynamic voxelization, the proposed feature fusion architecture enables each point to learn to fuse context information from different views. MVF operates on points and can be naturally extended to other approaches using LiDAR point clouds. We evaluate our MVF model extensively on the newly released Waymo Open Dataset and on the KITTI dataset and demonstrate that it significantly improves detection accuracy over the comparable single-view PointPillars baseline. |
Tasks | 3D Object Detection, Object Detection |
Published | 2019-10-15 |
URL | https://arxiv.org/abs/1910.06528v2 |
https://arxiv.org/pdf/1910.06528v2.pdf | |
PWC | https://paperswithcode.com/paper/end-to-end-multi-view-fusion-for-3d-object |
Repo | |
Framework | |
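A minimal sketch of the dynamic-voxelization idea from the abstract above: voxel indices are computed per point and the point↔voxel mapping is kept explicitly, so no fixed-size buffer is pre-allocated and no points are dropped. The grid resolution and the scatter-mean feature are illustrative choices, not the paper's implementation.

```python
import numpy as np

points = np.random.rand(1000, 3) * 50.0          # toy LiDAR points in metres (assumed range)
voxel_size = np.array([0.5, 0.5, 0.5])           # illustrative grid resolution

# Per-point voxel coordinates; nothing is pre-allocated and no point is dropped.
coords = np.floor(points / voxel_size).astype(np.int64)

# Bi-directional mapping: inverse[i] is the voxel id of point i,
# and the points of voxel v are np.where(inverse == v)[0].
unique_coords, inverse = np.unique(coords, axis=0, return_inverse=True)
inverse = inverse.ravel()                        # guard against shape differences across numpy versions

# Voxel features as the mean of their points (scatter-mean).
num_voxels = len(unique_coords)
voxel_feat = np.zeros((num_voxels, 3))
np.add.at(voxel_feat, inverse, points)
counts = np.bincount(inverse, minlength=num_voxels)[:, None]
voxel_feat /= counts

# Cross-view fusion can then gather the voxel feature back to every point:
point_context = voxel_feat[inverse]              # shape (1000, 3)
```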
Adaptive and Azimuth-Aware Fusion Network of Multimodal Local Features for 3D Object Detection
Title | Adaptive and Azimuth-Aware Fusion Network of Multimodal Local Features for 3D Object Detection |
Authors | Yonglin Tian, Kunfeng Wang, Yuang Wang, Yulin Tian, Zilei Wang, Fei-Yue Wang |
Abstract | This paper focuses on the construction of stronger local features and the effective fusion of image and LiDAR data. We adopt different modalities of LiDAR data to generate richer features and present an adaptive and azimuth-aware network to aggregate local features from the image, bird’s eye view maps and the point cloud. Our network mainly consists of three subnetworks: a ground plane estimation network, a region proposal network and an adaptive fusion network. The ground plane estimation network extracts features of the point cloud and predicts the parameters of a plane, which are used for generating abundant 3D anchors. The region proposal network generates features from the image and bird’s eye view maps to output region proposals. To integrate heterogeneous image and point cloud features, the adaptive fusion network explicitly adjusts the intensity of multiple local features and achieves orientation consistency between image and LiDAR data by introducing an azimuth-aware fusion module. Experiments are conducted on the KITTI dataset and the results validate the advantages of our aggregation of multimodal local features and the adaptive fusion network. |
Tasks | 3D Object Detection, Object Detection |
Published | 2019-10-10 |
URL | https://arxiv.org/abs/1910.04392v1 |
https://arxiv.org/pdf/1910.04392v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-and-azimuth-aware-fusion-network-of |
Repo | |
Framework | |
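The general idea of adaptively re-weighting local features from different modalities before fusing them can be sketched as below; the gating network and the concatenation-based fusion are assumptions for illustration and not the paper's azimuth-aware module.

```python
import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    """Learn per-modality weights for each region proposal, then fuse by weighted concatenation."""
    def __init__(self, dims=(128, 128, 128)):   # image / BEV / point-cloud feature sizes (assumed)
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(sum(dims), 64), nn.ReLU(),
                                  nn.Linear(64, len(dims)), nn.Softmax(dim=-1))

    def forward(self, feats):                    # feats: list of (N, d_m) tensors
        weights = self.gate(torch.cat(feats, dim=-1))                    # (N, M) modality weights
        scaled = [w.unsqueeze(-1) * f for w, f in zip(weights.unbind(-1), feats)]
        return torch.cat(scaled, dim=-1)

fusion = AdaptiveFusion()
out = fusion([torch.randn(16, 128) for _ in range(3)])                   # (16, 384)
```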
Distributed physics informed neural network for data-efficient solution to partial differential equations
Title | Distributed physics informed neural network for data-efficient solution to partial differential equations |
Authors | Vikas Dwivedi, Nishant Parashar, Balaji Srinivasan |
Abstract | The physics informed neural network (PINN) is evolving as a viable method to solve partial differential equations. In the recent past, PINNs have been successfully tested and validated to find solutions to both linear and non-linear partial differential equations (PDEs). However, the literature lacks detailed investigation of PINNs in terms of their representation capability. In this work, we first test the original PINN method in terms of its capability to represent a complicated function. Further, to address the shortcomings of the PINN architecture, we propose a novel distributed PINN, named DPINN. We first perform a direct comparison of the proposed DPINN approach against PINN to solve a non-linear PDE (Burgers’ equation). We show that DPINN not only yields a more accurate solution to the Burgers’ equation, but is also found to be more data-efficient. Finally, we apply our novel DPINN to the two-dimensional steady-state Navier-Stokes equations, which are a system of non-linear PDEs. To the best of the authors’ knowledge, this is the first such attempt to directly solve the Navier-Stokes equations using a physics informed neural network. |
Tasks | |
Published | 2019-07-21 |
URL | https://arxiv.org/abs/1907.08967v1 |
https://arxiv.org/pdf/1907.08967v1.pdf | |
PWC | https://paperswithcode.com/paper/distributed-physics-informed-neural-network |
Repo | |
Framework | |
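For context, a minimal physics-informed loss for the viscous Burgers' equation u_t + u u_x = ν u_xx is sketched below with a single (non-distributed) network; the network size, collocation points, boundary/initial conditions and viscosity are illustrative assumptions, and the paper's DPINN splits the domain across several such networks.

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 32), nn.Tanh(), nn.Linear(32, 1))
nu = 0.01 / torch.pi                                # viscosity (common benchmark value)

def pde_residual(x, t):
    x.requires_grad_(True); t.requires_grad_(True)
    u = net(torch.stack([x, t], dim=-1)).squeeze(-1)
    grad = lambda y, z: torch.autograd.grad(y, z, torch.ones_like(y), create_graph=True)[0]
    u_t, u_x = grad(u, t), grad(u, x)
    u_xx = grad(u_x, x)
    return u_t + u * u_x - nu * u_xx                # should be ~0 inside the domain

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(1000):
    x = torch.rand(256) * 2 - 1                     # collocation points in [-1, 1]
    t = torch.rand(256)
    xb = torch.randint(0, 2, (64,)).float() * 2 - 1 # boundary points x = +/-1, u = 0
    x0 = torch.rand(64) * 2 - 1                     # initial condition u(x, 0) = -sin(pi x)
    loss = (pde_residual(x, t) ** 2).mean() \
         + (net(torch.stack([xb, torch.rand(64)], -1)).squeeze(-1) ** 2).mean() \
         + ((net(torch.stack([x0, torch.zeros(64)], -1)).squeeze(-1) + torch.sin(torch.pi * x0)) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```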
Imbalanced Sentiment Classification Enhanced with Discourse Marker
Title | Imbalanced Sentiment Classification Enhanced with Discourse Marker |
Authors | Tao Zhang, Xing Wu, Meng Lin, Jizhong Han, Songlin Hu |
Abstract | Imbalanced data commonly exists in the real world, especially in sentiment-related corpora, making it difficult to train a classifier to distinguish latent sentiment in text data. We observe that humans often express transitional emotion between two adjacent discourses with discourse markers like “but”, “though”, “while”, etc., and the head discourse and the tail discourse usually indicate opposite emotional tendencies. Based on this observation, we propose a novel plug-and-play method, which first samples discourses according to transitional discourse markers and then validates sentiment polarities with the help of a pretrained attention-based model. Our method increases sample diversity in the first place and can serve as an upstream preprocessing step in data augmentation. We conduct experiments on three public sentiment datasets, with several frequently used algorithms. Results show that our method is consistently effective, even in highly imbalanced scenarios, and can easily be integrated with oversampling methods to boost the performance on imbalanced sentiment classification. |
Tasks | Data Augmentation, Sentiment Analysis |
Published | 2019-03-28 |
URL | http://arxiv.org/abs/1903.11919v1 |
http://arxiv.org/pdf/1903.11919v1.pdf | |
PWC | https://paperswithcode.com/paper/imbalanced-sentiment-classification-enhanced |
Repo | |
Framework | |
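A minimal sketch of the sampling step described above: sentences containing a transitional discourse marker are split into a head and a tail discourse, and the tail is kept as a candidate minority-class sample only if a sentiment model agrees. The `polarity` function here is a toy lexicon stand-in for the pretrained attention-based model in the paper.

```python
import re

MARKERS = ("but", "though", "while")
NEG_WORDS = {"bad", "boring", "disappointing", "slow"}   # toy lexicon, illustration only

def polarity(text: str) -> float:
    """Toy stand-in for the pretrained attention-based model used in the paper."""
    return -1.0 if any(w in text.lower() for w in NEG_WORDS) else 1.0

def sample_transitional(sentences, target_label=-1, threshold=0.5):
    """Return tail discourses whose validated polarity matches the minority class."""
    out = []
    for sent in sentences:
        match = re.search(r"\b(" + "|".join(MARKERS) + r")\b", sent, flags=re.IGNORECASE)
        if not match:
            continue
        head, tail = sent[:match.start()].strip(), sent[match.end():].strip()
        if head and tail and target_label * polarity(tail) > threshold:
            out.append((tail, target_label))
    return out

reviews = ["The visuals are stunning, but the plot is slow and boring."]
print(sample_transitional(reviews))   # [('the plot is slow and boring.', -1)]
```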
Symbol Emergence as an Interpersonal Multimodal Categorization
Title | Symbol Emergence as an Interpersonal Multimodal Categorization |
Authors | Yoshinobu Hagiwara, Hiroyoshi Kobayashi, Akira Taniguchi, Tadahiro Taniguchi |
Abstract | This study focuses on category formation for individual agents and the dynamics of symbol emergence in a multi-agent system through semiotic communication. Semiotic communication is defined, in this study, as the generation and interpretation of signs associated with the categories formed through the agent’s own sensory experience or by exchange of signs with other agents. From the viewpoint of language evolution and symbol emergence, organization of a symbol system in a multi-agent system is considered as a bottom-up and dynamic process, where individual agents share the meaning of signs and categorize sensory experience. A constructive computational model can explain the mutual dependency of the two processes and has mathematical support that guarantees a symbol system’s emergence and sharing within the multi-agent system. In this paper, we describe a new computational model that represents symbol emergence in a two-agent system based on a probabilistic generative model for multimodal categorization. It models semiotic communication via a probabilistic rejection based on the receiver’s own belief. We have found that the dynamics by which cognitively independent agents create a symbol system through their semiotic communication can be regarded as the inference process of a hidden variable in an interpersonal multimodal categorizer, if we define the rejection probability based on the Metropolis-Hastings algorithm. The validity of the proposed model and algorithm for symbol emergence is also verified in an experiment with two agents observing daily objects in the real-world environment. The experimental results demonstrate that our model reproduces the phenomena of symbol emergence, which does not require a teacher who would know a pre-existing symbol system. Instead, the multi-agent system can form and use a symbol system without having pre-existing categories. |
Tasks | |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1905.13443v1 |
https://arxiv.org/pdf/1905.13443v1.pdf | |
PWC | https://paperswithcode.com/paper/symbol-emergence-as-an-interpersonal |
Repo | |
Framework | |
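The rejection step described above can be sketched as follows: when the speaker proposes a sign for an object, the listener accepts it with a Metropolis-Hastings-style probability formed from its own categorical beliefs. This toy two-agent simulation only illustrates the acceptance rule; the full model in the paper is a multimodal probabilistic generative model, and the belief-update step here is a simplified assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
n_objects, n_signs = 3, 5

# Each agent's belief P(sign | object), initialised at random (illustrative).
belief = {a: rng.dirichlet(np.ones(n_signs), size=n_objects) for a in ("A", "B")}
current = {a: rng.integers(n_signs, size=n_objects) for a in ("A", "B")}

def mh_exchange(speaker, listener, obj):
    """Speaker proposes a sign; listener accepts it based on its own belief (MH rule)."""
    proposal = rng.choice(n_signs, p=belief[speaker][obj])
    p_new = belief[listener][obj, proposal]
    p_old = belief[listener][obj, current[listener][obj]]
    if rng.random() < min(1.0, p_new / p_old):
        current[listener][obj] = proposal
        # Listener nudges its belief toward the accepted sign (simplified learning step).
        belief[listener][obj, proposal] += 0.1
        belief[listener][obj] /= belief[listener][obj].sum()

for step in range(2000):
    obj = rng.integers(n_objects)
    speaker, listener = ("A", "B") if step % 2 == 0 else ("B", "A")
    mh_exchange(speaker, listener, obj)

print(current["A"], current["B"])    # signs tend to align across agents after many exchanges
```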
Exploiting bilateral symmetry in brain lesion segmentation
Title | Exploiting bilateral symmetry in brain lesion segmentation |
Authors | Kevin Raina, Uladzimir Yahorau, Tanya Schmah |
Abstract | Brain lesions, including stroke and tumours, have a high degree of variability in terms of location, size, intensity and form, making automatic segmentation difficult. We propose an improvement to existing segmentation methods by exploiting the bilateral quasi-symmetry of healthy brains, which breaks down when lesions are present. Specifically, we use nonlinear registration of a neuroimage to a reflected version of itself (“reflective registration”) to determine for each voxel its homologous (corresponding) voxel in the other hemisphere. A patch around the homologous voxel is added as a set of new features to the segmentation algorithm. To evaluate this method, we implemented two different CNN-based multimodal MRI stroke lesion segmentation algorithms, and then augmented them by adding extra symmetry features using the reflective registration method described above. For each architecture, we compared the performance with and without symmetry augmentation, on the SISS Training dataset of the Ischemic Stroke Lesion Segmentation Challenge (ISLES) 2015 challenge. Using affine reflective registration improves performance over baseline, but nonlinear reflective registration gives significantly better results: an improvement in Dice coefficient of 13 percentage points over baseline for one architecture and 9 points for the other. We argue for the broad applicability of adding symmetric features to existing segmentation algorithms, specifically using nonlinear, template-free methods. |
Tasks | Ischemic Stroke Lesion Segmentation, Lesion Segmentation |
Published | 2019-07-18 |
URL | https://arxiv.org/abs/1907.08196v1 |
https://arxiv.org/pdf/1907.08196v1.pdf | |
PWC | https://paperswithcode.com/paper/exploiting-bilateral-symmetry-in-brain-lesion |
Repo | |
Framework | |
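A simplified version of the symmetry-feature idea above (a pure left-right flip, not the affine or nonlinear reflective registration used in the paper): for each voxel, the patch around its mirrored position is appended as extra input channels.

```python
import numpy as np

def add_symmetry_features(volume, patch=1):
    """Concatenate, for every voxel, the intensity patch around its mirrored voxel.

    `volume` is a 3D array whose first axis is the left-right axis; a plain flip is a crude
    stand-in for the homologous voxel found by reflective registration in the paper.
    """
    mirrored = volume[::-1, :, :]
    padded = np.pad(mirrored, [(patch, patch)] * 3, mode="edge")
    k = 2 * patch + 1
    feats = [volume[..., None]]
    # Shifted copies of the mirrored volume give the neighbourhood patch as channels.
    for dx in range(k):
        for dy in range(k):
            for dz in range(k):
                feats.append(padded[dx:dx + volume.shape[0],
                                    dy:dy + volume.shape[1],
                                    dz:dz + volume.shape[2], None])
    return np.concatenate(feats, axis=-1)         # shape (X, Y, Z, 1 + k**3)

vol = np.random.rand(32, 32, 16)
print(add_symmetry_features(vol).shape)           # (32, 32, 16, 28)
```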
Modeling Music Modality with a Key-Class Invariant Pitch Chroma CNN
Title | Modeling Music Modality with a Key-Class Invariant Pitch Chroma CNN |
Authors | Anders Elowsson, Anders Friberg |
Abstract | This paper presents a convolutional neural network (CNN) that uses input from a polyphonic pitch estimation system to predict perceived minor/major modality in music audio. The pitch activation input is structured to allow the first CNN layer to compute two pitch chromas focused on different octaves. The following layers perform harmony analysis across chroma and time scales. Through max pooling across pitch, the CNN becomes invariant with regard to the key class (i.e., key disregarding mode) of the music. A multilayer perceptron combines the modality activation output with spectral features for the final prediction. The study uses a dataset of 203 excerpts rated by around 20 listeners each, a small and challenging data size requiring carefully designed parameter sharing. With an R² of about 0.71, the system clearly outperforms previous systems as well as individual human listeners. A final ablation study highlights the importance of using pitch activations processed across longer time scales, and of using pooling to facilitate invariance with regard to the key class. |
Tasks | |
Published | 2019-06-17 |
URL | https://arxiv.org/abs/1906.07145v1 |
https://arxiv.org/pdf/1906.07145v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-music-modality-with-a-key-class |
Repo | |
Framework | |
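The key-class invariance mechanism can be demonstrated in isolation: a convolution applied circularly across the 12 pitch classes followed by a max over the pitch axis yields the same output for any transposition of a chroma vector. The layer sizes below are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

conv = nn.Conv1d(in_channels=1, out_channels=8, kernel_size=3, bias=False)

def pooled_chroma_features(chroma):
    """chroma: (batch, 12) pitch-class activations -> (batch, 8) key-class-invariant features."""
    x = chroma.unsqueeze(1)                       # (batch, 1, 12)
    x = F.pad(x, (1, 1), mode="circular")         # circular padding across pitch classes
    return conv(x).amax(dim=-1)                   # max over pitch -> transposition invariant

torch.manual_seed(0)
chroma = torch.rand(4, 12)
shifted = torch.roll(chroma, shifts=3, dims=1)    # transpose up three semitones
print(torch.allclose(pooled_chroma_features(chroma), pooled_chroma_features(shifted)))  # True
```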
Unsupervised Learning of Probably Symmetric Deformable 3D Objects from Images in the Wild
Title | Unsupervised Learning of Probably Symmetric Deformable 3D Objects from Images in the Wild |
Authors | Shangzhe Wu, Christian Rupprecht, Andrea Vedaldi |
Abstract | We propose a method to learn 3D deformable object categories from raw single-view images, without external supervision. The method is based on an autoencoder that factors each input image into depth, albedo, viewpoint and illumination. In order to disentangle these components without supervision, we use the fact that many object categories have, at least in principle, a symmetric structure. We show that reasoning about illumination allows us to exploit the underlying object symmetry even if the appearance is not symmetric due to shading. Furthermore, we model objects that are probably, but not certainly, symmetric by predicting a symmetry probability map, learned end-to-end with the other components of the model. Our experiments show that this method can recover very accurately the 3D shape of human faces, cat faces and cars from single-view images, without any supervision or a prior shape model. On benchmarks, we demonstrate superior accuracy compared to another method that uses supervision at the level of 2D image correspondences. |
Tasks | |
Published | 2019-11-25 |
URL | https://arxiv.org/abs/1911.11130v2 |
https://arxiv.org/pdf/1911.11130v2.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-learning-of-probably-symmetric |
Repo | |
Framework | |
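One small piece of that pipeline can be sketched: the reconstruction loss is a confidence-weighted L1 term applied both to the direct reconstruction and to the one obtained from the horizontally flipped shape and albedo, so pixels that are probably not symmetric (low confidence in the second map) are downweighted rather than being forced to be symmetric. The Laplacian-style weighting below follows the general idea only; it is not the authors' exact formulation.

```python
import torch

def conf_weighted_l1(recon, target, log_sigma):
    """Laplacian-style negative log-likelihood: a large predicted sigma downweights a pixel."""
    return (torch.abs(recon - target) * torch.exp(-log_sigma) + log_sigma).mean()

def reconstruction_loss(img, recon, recon_flipped, log_sigma, log_sigma_flip):
    # Ordinary reconstruction plus reconstruction from the horizontally flipped shape/albedo;
    # the second confidence map plays the role of a per-pixel "symmetry probability".
    return conf_weighted_l1(recon, img, log_sigma) + \
           conf_weighted_l1(recon_flipped, img, log_sigma_flip)

img = torch.rand(2, 3, 64, 64)
loss = reconstruction_loss(img, torch.rand_like(img), torch.rand_like(img),
                           torch.zeros(2, 1, 64, 64), torch.zeros(2, 1, 64, 64))
```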
Center and Scale Prediction: A Box-free Approach for Pedestrian and Face Detection
Title | Center and Scale Prediction: A Box-free Approach for Pedestrian and Face Detection |
Authors | Wei Liu, Irtiza Hasan, Shengcai Liao |
Abstract | Object detection traditionally requires sliding-window classifiers or, in modern deep learning approaches, anchor-box-based predictions. However, either of these approaches requires tedious configuration of boxes. In this paper, we provide a new perspective in which detecting objects is cast as a high-level semantic feature detection task. Like edges, corners, blobs and other feature detectors, the proposed detector scans for feature points all over the image, for which convolution is naturally suited. However, unlike these traditional low-level features, the proposed detector goes for a higher-level abstraction, that is, we are looking for central points where there are objects, and modern deep models are already capable of such a high-level semantic abstraction. Besides, like blob detection, we also predict the scales of the central points, which is also a straightforward convolution. Therefore, in this paper, pedestrian and face detection is simplified as a straightforward center and scale prediction task through convolutions. This way, the proposed method enjoys a box-free setting. Though structurally simple, it presents competitive accuracy on several challenging benchmarks, including pedestrian detection and face detection. Furthermore, a cross-dataset evaluation is performed, demonstrating the superior generalization ability of the proposed method. |
Tasks | Face Detection, Object Detection, Pedestrian Detection |
Published | 2019-04-05 |
URL | https://arxiv.org/abs/1904.02948v3 |
https://arxiv.org/pdf/1904.02948v3.pdf | |
PWC | https://paperswithcode.com/paper/high-level-semantic-feature-detectiona-new |
Repo | |
Framework | |
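A minimal sketch of a box-free center-and-scale head and its decoding step: one channel predicts a center heatmap, another predicts the log-height at each location, and boxes are recovered with a fixed aspect ratio. The backbone size, the 4× output stride and the 0.41 aspect ratio (a common choice for pedestrians) are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class CenterScaleHead(nn.Module):
    def __init__(self, in_ch=64):
        super().__init__()
        self.center = nn.Conv2d(in_ch, 1, 1)      # center heatmap logits
        self.scale = nn.Conv2d(in_ch, 1, 1)       # log-height at each location
    def forward(self, feat):
        return torch.sigmoid(self.center(feat)), self.scale(feat)

def decode(center, scale, stride=4, thresh=0.5, aspect=0.41):
    """Turn heatmap peaks into (x1, y1, x2, y2, score) boxes for a single image."""
    peaks = (center == nn.functional.max_pool2d(center, 3, 1, 1)) & (center > thresh)
    ys, xs = torch.nonzero(peaks[0, 0], as_tuple=True)
    heights = torch.exp(scale[0, 0, ys, xs])
    widths = aspect * heights
    cx, cy = (xs.float() + 0.5) * stride, (ys.float() + 0.5) * stride
    return torch.stack([cx - widths / 2, cy - heights / 2,
                        cx + widths / 2, cy + heights / 2,
                        center[0, 0, ys, xs]], dim=1)

head = CenterScaleHead()
feat = torch.randn(1, 64, 96, 120)                # backbone features at 1/4 resolution (assumed)
boxes = decode(*head(feat))
```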
Bounding the error of discretized Langevin algorithms for non-strongly log-concave targets
Title | Bounding the error of discretized Langevin algorithms for non-strongly log-concave targets |
Authors | Arnak S. Dalalyan, Lionel Riou-Durand, Avetik Karagulyan |
Abstract | In this paper, we provide non-asymptotic upper bounds on the error of sampling from a target density using three schemes of discretized Langevin diffusions. The first scheme is the Langevin Monte Carlo (LMC) algorithm, the Euler discretization of the Langevin diffusion. The second and the third schemes are, respectively, the kinetic Langevin Monte Carlo (KLMC) for differentiable potentials and the kinetic Langevin Monte Carlo for twice-differentiable potentials (KLMC2). The main focus is on target densities that are smooth and log-concave on $\mathbb{R}^p$, but not necessarily strongly log-concave. Bounds on the computational complexity are obtained under two types of smoothness assumption: the potential has a Lipschitz-continuous gradient, and the potential has a Lipschitz-continuous Hessian matrix. The error of sampling is measured by Wasserstein-$q$ distances and the bounded-Lipschitz distance. We advocate for the use of a new dimension-adapted scaling in the definition of the computational complexity when Wasserstein-$q$ distances are considered. The obtained results show that the number of iterations needed to achieve a scaled error smaller than a prescribed value depends only polynomially on the dimension. |
Tasks | |
Published | 2019-06-20 |
URL | https://arxiv.org/abs/1906.08530v1 |
https://arxiv.org/pdf/1906.08530v1.pdf | |
PWC | https://paperswithcode.com/paper/bounding-the-error-of-discretized-langevin |
Repo | |
Framework | |
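For reference, the first of the three schemes (LMC, the Euler discretization) is just the update θ_{k+1} = θ_k − h∇U(θ_k) + √(2h) ξ_k. The sketch below runs it on a target that is log-concave but not strongly log-concave and has a Lipschitz-continuous gradient; the potential, step size and iteration count are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 10                                            # dimension

def grad_potential(x):
    """Gradient of U(x) = sum_i log cosh(x_i): convex with 1-Lipschitz gradient,
    but not strongly convex, so the target exp(-U) is log-concave only."""
    return np.tanh(x)

h, n_iter = 0.05, 20000                           # step size and iteration count (assumed)
x = np.zeros(p)
samples = np.empty((n_iter, p))
for k in range(n_iter):
    x = x - h * grad_potential(x) + np.sqrt(2 * h) * rng.standard_normal(p)
    samples[k] = x

print(samples[5000:].mean(axis=0).round(2))       # close to the zero mean of the target
```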
Robust and Discriminative Labeling for Multi-label Active Learning Based on Maximum Correntropy Criterion
Title | Robust and Discriminative Labeling for Multi-label Active Learning Based on Maximum Correntropy Criterion |
Authors | Bo Du, Zengmao Wang, Lefei Zhang, Liangpei Zhang, Dacheng Tao |
Abstract | Multi-label learning draws great interest in many real-world applications. It is a highly costly task for the oracle to assign many labels to one instance. Meanwhile, it is also hard to build a good model without diagnosing discriminative labels. Can we reduce the label costs and improve the ability to train a good model for multi-label learning simultaneously? Active learning addresses the problem of limited training samples by querying the most valuable samples to achieve better performance at little cost. In multi-label active learning, some research has been done on querying the relevant labels with fewer training samples or querying all labels without diagnosing the discriminative information. None of these approaches can effectively handle outlier labels in the measurement of uncertainty. Since the Maximum Correntropy Criterion (MCC) provides a robust analysis of outliers in many machine learning and data mining algorithms, in this paper we derive a robust multi-label active learning algorithm based on MCC by merging uncertainty and representativeness, and propose an efficient alternating optimization method to solve it. With MCC, our method can eliminate the influence of outlier labels that are not discriminative for measuring uncertainty. To further improve the ability of information measurement, we merge uncertainty and representativeness with the predicted labels of unknown data. This not only enhances the uncertainty estimate but also improves the similarity measurement of multi-label data with label information. Experiments on benchmark multi-label datasets show superior performance compared to state-of-the-art methods. |
Tasks | Active Learning, Multi-Label Learning |
Published | 2019-04-14 |
URL | http://arxiv.org/abs/1904.06689v1 |
http://arxiv.org/pdf/1904.06689v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-and-discriminative-labeling-for-multi |
Repo | |
Framework | |
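The robustness that MCC provides can be seen from the correntropy-induced loss alone: residuals are passed through a Gaussian kernel, so a large (outlier) residual contributes a bounded amount instead of growing quadratically. The kernel width below is an illustrative choice.

```python
import numpy as np

def correntropy_loss(residual, sigma=1.0):
    """Correntropy-induced loss: bounded in [0, 1), saturating for large residuals."""
    return 1.0 - np.exp(-residual ** 2 / (2 * sigma ** 2))

residuals = np.array([0.1, 0.5, 1.0, 10.0])       # the last one is an outlier
print(np.round(residuals ** 2, 3))                # squared loss grows without bound
print(np.round(correntropy_loss(residuals), 3))   # correntropy-induced loss saturates near 1
```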
From GAN to WGAN
Title | From GAN to WGAN |
Authors | Lilian Weng |
Abstract | This paper explains the math behind a generative adversarial network (GAN) model and why it is hard to train. Wasserstein GAN is intended to improve GAN training by adopting a smooth metric for measuring the distance between two probability distributions. |
Tasks | |
Published | 2019-04-18 |
URL | http://arxiv.org/abs/1904.08994v1 |
http://arxiv.org/pdf/1904.08994v1.pdf | |
PWC | https://paperswithcode.com/paper/from-gan-to-wgan |
Repo | |
Framework | |
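The key difference from a standard GAN can be shown in a few lines: the critic maximises the difference of its mean outputs on real and generated samples (an estimate of the Wasserstein-1 distance under a Lipschitz constraint), enforced here by the original weight-clipping trick. The toy data, network sizes and hyperparameters are illustrative.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 2))     # generator
D = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))      # critic (no sigmoid)
opt_g = torch.optim.RMSprop(G.parameters(), lr=5e-5)
opt_d = torch.optim.RMSprop(D.parameters(), lr=5e-5)

def real_batch(n=64):
    return torch.randn(n, 2) + torch.tensor([3.0, 0.0])               # toy 2D target distribution

for step in range(2000):
    for _ in range(5):                                                 # critic steps per generator step
        fake = G(torch.randn(64, 16)).detach()
        loss_d = -(D(real_batch()).mean() - D(fake).mean())            # maximise the W1 estimate
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()
        for p in D.parameters():                                       # Lipschitz via weight clipping
            p.data.clamp_(-0.01, 0.01)
    loss_g = -D(G(torch.randn(64, 16))).mean()                         # generator minimises -E[D(fake)]
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```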
Preference-Informed Fairness
Title | Preference-Informed Fairness |
Authors | Michael P. Kim, Aleksandra Korolova, Guy N. Rothblum, Gal Yona |
Abstract | We study notions of fairness in decision-making systems when individuals have diverse preferences over the possible outcomes of the decisions. Our starting point is the seminal work of Dwork et al., which introduced a notion of individual fairness (IF): given a task-specific similarity metric, every pair of individuals who are similarly qualified according to the metric should receive similar outcomes. We show that when individuals have diverse preferences over outcomes, requiring IF may unintentionally lead to less-preferred outcomes for the very individuals that IF aims to protect. A natural alternative to IF is the classic notion from fair division, envy-freeness (EF): no individual should prefer another individual’s outcome over their own. Although EF allows for solutions where all individuals receive a highly preferred outcome, EF may also be overly restrictive. For instance, if many individuals agree on the best outcome, then if any individual receives this outcome, they all must receive it, regardless of each individual’s underlying qualifications for the outcome. We introduce and study a new notion of preference-informed individual fairness (PIIF) that is a relaxation of both individual fairness and envy-freeness. At a high level, PIIF requires that outcomes satisfy IF-style constraints, but allows for deviations provided they are in line with individuals’ preferences. We show that PIIF can permit outcomes that are more favorable to individuals than any IF solution, while providing considerably more flexibility to the decision-maker than EF. In addition, we show how to efficiently optimize any convex objective over the outcomes subject to PIIF for a rich class of individual preferences. Finally, we demonstrate the broad applicability of the PIIF framework by extending our definitions and algorithms to the multiple-task targeted advertising setting introduced by Dwork and Ilvento. |
Tasks | Decision Making |
Published | 2019-04-03 |
URL | https://arxiv.org/abs/1904.01793v2 |
https://arxiv.org/pdf/1904.01793v2.pdf | |
PWC | https://paperswithcode.com/paper/preference-informed-fairness |
Repo | |
Framework | |