Paper Group AWR 251
Evaluating Non-aligned Musical Score Transcriptions with MV2H. MIM: Mutual Information Machine. StructEdit: Learning Structural Shape Variations. Shape and Time Distortion Loss for Training Deep Time Series Forecasting Models. Ward2ICU: A Vital Signs Dataset of Inpatients from the General Ward. Learning a Local Symmetry with Neural-Networks. Hardness-Aware Deep Metric Learning. Venue Analytics: A Simple Alternative to Citation-Based Metrics. Canonicalizing Knowledge Base Literals. Augment your batch: better training with larger batches. Encoding Database Schemas with Relation-Aware Self-Attention for Text-to-SQL Parsers. SHACL Constraints with Inference Rules. Tracing cultural diachronic semantic shifts in Russian using word embeddings: test sets and baselines. Stochastic Beams and Where to Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement. DeepNovoV2: Better de novo peptide sequencing with deep learning.
Evaluating Non-aligned Musical Score Transcriptions with MV2H
Title | Evaluating Non-aligned Musical Score Transcriptions with MV2H |
Authors | Andrew McLeod |
Abstract | The original MV2H metric was designed to evaluate systems which transcribe from an input audio (or MIDI) piece to a complete musical score. However, it requires both the transcribed score and the ground truth score to be time-aligned with the input. Some recent work has begun to transcribe directly from an audio signal into a musical score, skipping the alignment step. This paper introduces an automatic alignment method based on dynamic time warping which allows MV2H to be used to evaluate such non-aligned transcriptions. This has the additional benefit of allowing non-aligned musical scores—which are significantly more widely available than aligned ones—to be used as ground truth. The code for the improved MV2H, which now also includes a MusicXML parser, and allows for key and time signature changes, is available at www.github.com/apmcleod/MV2H. |
Tasks | |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.00566v2 |
PDF | https://arxiv.org/pdf/1906.00566v2.pdf |
PWC | https://paperswithcode.com/paper/190600566 |
Repo | https://github.com/apmcleod/MV2H |
Framework | none |
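To make the alignment step concrete, here is a minimal dynamic-time-warping sketch in Python. It aligns two onset-time sequences under a simple absolute-difference cost; the sequences, cost function, and backtracking are illustrative assumptions, not the MV2H implementation.

```python
import numpy as np

def dtw_align(a, b):
    """Align two 1-D sequences (e.g. note onset times) with the classic DTW recursion.
    Returns the total alignment cost and the warping path as (i, j) index pairs."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])

    # Backtrack from the corner to recover the alignment path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return D[n, m], path[::-1]

# Hypothetical transcribed vs. ground-truth onset times (seconds).
cost, path = dtw_align([0.0, 0.5, 1.1, 2.0], [0.0, 0.52, 1.0, 1.5, 2.02])
print(cost, path)
```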
MIM: Mutual Information Machine
Title | MIM: Mutual Information Machine |
Authors | Micha Livne, Kevin Swersky, David J. Fleet |
Abstract | We introduce the Mutual Information Machine (MIM), a probabilistic auto-encoder for learning joint distributions over observations and latent variables. MIM reflects three design principles: 1) low divergence, to encourage the encoder and decoder to learn consistent factorizations of the same underlying distribution; 2) high mutual information, to encourage an informative relation between data and latent variables; and 3) low marginal entropy, or compression, which tends to encourage clustered latent representations. We show that a combination of the Jensen-Shannon divergence and the joint entropy of the encoding and decoding distributions satisfies these criteria, and admits a tractable cross-entropy bound that can be optimized directly with Monte Carlo and stochastic gradient descent. We contrast MIM learning with maximum likelihood and VAEs. Experiments show that MIM learns representations with high mutual information, consistent encoding and decoding distributions, effective latent clustering, and data log likelihood comparable to VAE, while avoiding posterior collapse. |
Tasks | |
Published | 2019-10-08 |
URL | https://arxiv.org/abs/1910.03175v5 |
PDF | https://arxiv.org/pdf/1910.03175v5.pdf |
PWC | https://paperswithcode.com/paper/mim-mutual-information-machine |
Repo | https://github.com/seraphlabs-ca/MIM |
Framework | pytorch |
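As a toy reading of the symmetric cross-entropy objective described above, the sketch below uses a 1-D Gaussian encoder and decoder and draws Monte Carlo samples from a mixture of the encoding and decoding paths. The standard-normal anchors over x and z, the network sizes, and the data are assumptions of this sketch, not the paper's setup.

```python
import torch
import torch.nn as nn
import torch.distributions as D

class TinyMIM(nn.Module):
    """Toy 1-D MIM-style auto-encoder: Gaussian encoder q(z|x) and decoder p(x|z).
    Anchors over x and z are fixed standard normals purely for illustration."""
    def __init__(self, hidden=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(1, hidden), nn.Tanh(), nn.Linear(hidden, 2))
        self.dec = nn.Sequential(nn.Linear(1, hidden), nn.Tanh(), nn.Linear(hidden, 2))
        self.prior_z = D.Normal(0.0, 1.0)
        self.anchor_x = D.Normal(0.0, 1.0)

    def q_z_given_x(self, x):
        mu, log_std = self.enc(x).chunk(2, dim=-1)
        return D.Normal(mu, log_std.exp())

    def p_x_given_z(self, z):
        mu, log_std = self.dec(z).chunk(2, dim=-1)
        return D.Normal(mu, log_std.exp())

    def loss(self, x_data):
        n = x_data.shape[0]
        # Encoding-path samples: x from data, z ~ q(z|x).
        z_enc = self.q_z_given_x(x_data).rsample()
        # Decoding-path samples: z ~ P(z), x ~ p(x|z).
        z_dec = self.prior_z.sample((n, 1))
        x_dec = self.p_x_given_z(z_dec).rsample()
        # Pool both paths into one Monte Carlo sample of the mixture distribution.
        x = torch.cat([x_data, x_dec])
        z = torch.cat([z_enc, z_dec])
        log_enc = self.q_z_given_x(x).log_prob(z) + self.anchor_x.log_prob(x)
        log_dec = self.p_x_given_z(z).log_prob(x) + self.prior_z.log_prob(z)
        # Symmetric cross-entropy: average of encoding and decoding joint log-likelihoods.
        return -0.5 * (log_enc + log_dec).mean()

model = TinyMIM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(64, 1)            # hypothetical 1-D data batch
opt.zero_grad()
model.loss(x).backward()
opt.step()
```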
StructEdit: Learning Structural Shape Variations
Title | StructEdit: Learning Structural Shape Variations |
Authors | Kaichun Mo, Paul Guerrero, Li Yi, Hao Su, Peter Wonka, Niloy Mitra, Leonidas J. Guibas |
Abstract | Learning to encode differences in the geometry and (topological) structure of the shapes of ordinary objects is key to generating semantically plausible variations of a given shape, transferring edits from one shape to another, and many other applications in 3D content creation. The common approach of encoding shapes as points in a high-dimensional latent feature space suggests treating shape differences as vectors in that space. Instead, we treat shape differences as primary objects in their own right and propose to encode them in their own latent space. In a setting where the shapes themselves are encoded in terms of fine-grained part hierarchies, we demonstrate that a separate encoding of shape deltas or differences provides a principled way to deal with inhomogeneities in the shape space due to different combinatorial part structures, while also allowing for compactness in the representation, as well as edit abstraction and transfer. Our approach is based on a conditional variational autoencoder for encoding and decoding shape deltas, conditioned on a source shape. We demonstrate the effectiveness and robustness of our approach in multiple shape modification and generation tasks, and provide comparison and ablation studies on the PartNet dataset, one of the largest publicly available 3D datasets. |
Tasks | |
Published | 2019-11-25 |
URL | https://arxiv.org/abs/1911.11098v1 |
PDF | https://arxiv.org/pdf/1911.11098v1.pdf |
PWC | https://paperswithcode.com/paper/structedit-learning-structural-shape |
Repo | https://github.com/hyzcn/structedit |
Framework | none |
Shape and Time Distortion Loss for Training Deep Time Series Forecasting Models
Title | Shape and Time Distortion Loss for Training Deep Time Series Forecasting Models |
Authors | Vincent Le Guen, Nicolas Thome |
Abstract | This paper addresses the problem of time series forecasting for non-stationary signals and multiple future steps prediction. To handle this challenging task, we introduce DILATE (DIstortion Loss including shApe and TimE), a new objective function for training deep neural networks. DILATE aims at accurately predicting sudden changes, and explicitly incorporates two terms supporting precise shape and temporal change detection. We introduce a differentiable loss function suitable for training deep neural nets, and provide a custom back-prop implementation for speeding up optimization. We also introduce a variant of DILATE, which provides a smooth generalization of temporally-constrained Dynamic Time Warping (DTW). Experiments carried out on various non-stationary datasets reveal the very good behaviour of DILATE compared to models trained with the standard Mean Squared Error (MSE) loss function, and also to DTW and variants. DILATE is also agnostic to the choice of the model, and we highlight its benefit for training fully connected networks as well as specialized recurrent architectures, showing its capacity to improve over state-of-the-art trajectory forecasting approaches. |
Tasks | Time Series, Time Series Forecasting |
Published | 2019-09-19 |
URL | https://arxiv.org/abs/1909.09020v4 |
PDF | https://arxiv.org/pdf/1909.09020v4.pdf |
PWC | https://paperswithcode.com/paper/shape-and-time-distortion-loss-for-training |
Repo | https://github.com/vincent-leguen/STDL |
Framework | pytorch |
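The shape term of such a loss is typically built on a smoothed, differentiable DTW. The sketch below implements a plain soft-DTW recursion in PyTorch; it omits DILATE's temporal term and the custom backward pass, so it illustrates one ingredient rather than the full loss.

```python
import torch

def soft_min(values, gamma):
    """Differentiable soft-minimum: -gamma * logsumexp(-values / gamma)."""
    return -gamma * torch.logsumexp(-values / gamma, dim=0)

def soft_dtw(pred, target, gamma=0.1):
    """Soft-DTW between two 1-D series, the 'shape' ingredient of a DILATE-style loss.
    Pure-PyTorch dynamic programming, O(n*m), no custom backward."""
    n, m = len(pred), len(target)
    inf = torch.tensor(float("inf"))
    prev = [torch.tensor(0.0)] + [inf] * m      # row 0 of the DP table
    for i in range(1, n + 1):
        curr = [inf]                            # column 0 of the current row
        for j in range(1, m + 1):
            cost = (pred[i - 1] - target[j - 1]) ** 2
            curr.append(cost + soft_min(
                torch.stack([prev[j], curr[j - 1], prev[j - 1]]), gamma))
        prev = curr
    return prev[m]

pred = torch.randn(24, requires_grad=True)      # hypothetical 24-step forecast
target = torch.randn(24)
loss = soft_dtw(pred, target)
loss.backward()
print(loss.item(), pred.grad.shape)
```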
Ward2ICU: A Vital Signs Dataset of Inpatients from the General Ward
Title | Ward2ICU: A Vital Signs Dataset of Inpatients from the General Ward |
Authors | Daniel Severo, Flávio Amaro, Estevam R. Hruschka Jr, André Soares de Moura Costa |
Abstract | We present a proxy dataset of vital signs with class labels indicating patient transitions from the ward to intensive care units, called Ward2ICU. Patient privacy is protected using a Wasserstein Generative Adversarial Network to implicitly learn an approximation of the data distribution, allowing us to sample synthetic data. The quality of data generation is assessed directly on the binary classification task by comparing the specificity and sensitivity of an LSTM classifier on proxy and original datasets. We initiate a discussion of unintentionally disclosing commercially sensitive information and propose a solution for a special case through class label balancing. |
Tasks | Predicting Patient Outcomes |
Published | 2019-10-02 |
URL | https://arxiv.org/abs/1910.00752v1 |
PDF | https://arxiv.org/pdf/1910.00752v1.pdf |
PWC | https://paperswithcode.com/paper/ward2icu-a-vital-signs-dataset-of-inpatients |
Repo | https://github.com/3778/Ward2ICU |
Framework | pytorch |
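The evaluation side of the pipeline, an LSTM binary classifier over vital-sign sequences, can be sketched as follows. The feature count, layer sizes, and data are placeholders, not the paper's configuration.

```python
import torch
import torch.nn as nn

class VitalSignsLSTM(nn.Module):
    """Minimal LSTM binary classifier for vital-sign sequences
    (ward-to-ICU transition vs. not). Sizes are illustrative placeholders."""
    def __init__(self, n_features=6, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                       # x: (batch, time, n_features)
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1]).squeeze(-1)   # logits, shape (batch,)

model = VitalSignsLSTM()
criterion = nn.BCEWithLogitsLoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(32, 48, 6)                 # 32 patients, 48 time steps, 6 vitals (hypothetical)
y = torch.randint(0, 2, (32,)).float()     # 1 = transitioned to ICU
opt.zero_grad()
loss = criterion(model(x), y)
loss.backward()
opt.step()
```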
Learning a Local Symmetry with Neural-Networks
Title | Learning a Local Symmetry with Neural-Networks |
Authors | Aurélien Decelle, Victor Martin-Mayor, Beatriz Seoane |
Abstract | We explore the capacity of neural networks to detect a symmetry with complex local and non-local patterns: the gauge symmetry $Z_2$. This symmetry is present in physical problems from topological transitions to QCD, and controls the computational hardness of instances of spin-glasses. Here, we show how to design a neural network, and a dataset, able to learn this symmetry and to find compressed latent representations of the gauge orbits. Our method pays special attention to system-wrapping loops, the so-called Polyakov loops, known to be particularly relevant for computational complexity. |
Tasks | |
Published | 2019-04-16 |
URL | https://arxiv.org/abs/1904.07637v2 |
PDF | https://arxiv.org/pdf/1904.07637v2.pdf |
PWC | https://paperswithcode.com/paper/learning-a-gauge-symmetry-with-neural |
Repo | https://github.com/AurelienDecelle/SpinLearning |
Framework | none |
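The symmetry to be learned can be made explicit on a small 2-D lattice: flipping site variables s ∈ {−1, +1} and transforming every link U(x, μ) → s(x) · U(x, μ) · s(x + μ) leaves all plaquette products unchanged, so the two configurations lie on the same gauge orbit. The numpy sketch below generates such gauge-equivalent pairs; it is an illustration of the symmetry, not the paper's dataset construction.

```python
import numpy as np

rng = np.random.default_rng(0)
L = 4  # lattice size (hypothetical)

# Random Z2 link variables: links[mu, x, y] in {-1, +1}, mu = 0 (x-dir) or 1 (y-dir).
links = rng.choice([-1, 1], size=(2, L, L))

def plaquettes(links):
    """Product of the four links around each elementary plaquette (gauge invariant)."""
    ux, uy = links
    return ux * np.roll(uy, -1, axis=0) * np.roll(ux, -1, axis=1) * uy

def gauge_transform(links, s):
    """Apply a Z2 gauge transformation: U(x, mu) -> s(x) * U(x, mu) * s(x + mu)."""
    ux, uy = links
    new_ux = s * ux * np.roll(s, -1, axis=0)
    new_uy = s * uy * np.roll(s, -1, axis=1)
    return np.stack([new_ux, new_uy])

s = rng.choice([-1, 1], size=(L, L))        # random site signs
transformed = gauge_transform(links, s)

# The two configurations look different but share all plaquette values:
assert np.array_equal(plaquettes(links), plaquettes(transformed))
print(plaquettes(links))
```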
Hardness-Aware Deep Metric Learning
Title | Hardness-Aware Deep Metric Learning |
Authors | Wenzhao Zheng, Zhaodong Chen, Jiwen Lu, Jie Zhou |
Abstract | This paper presents a hardness-aware deep metric learning (HDML) framework. Most previous deep metric learning methods employ the hard negative mining strategy to alleviate the lack of informative samples for training. However, this mining strategy only utilizes a subset of training data, which may not be enough to characterize the global geometry of the embedding space comprehensively. To address this problem, we perform linear interpolation on embeddings to adaptively manipulate their hard levels and generate corresponding label-preserving synthetics for recycled training, so that information buried in all samples can be fully exploited and the metric is always challenged with proper difficulty. Our method achieves very competitive performance on the widely used CUB-200-2011, Cars196, and Stanford Online Products datasets. |
Tasks | Image Retrieval, Metric Learning |
Published | 2019-03-13 |
URL | https://arxiv.org/abs/1903.05503v2 |
PDF | https://arxiv.org/pdf/1903.05503v2.pdf |
PWC | https://paperswithcode.com/paper/hardness-aware-deep-metric-learning |
Repo | https://github.com/neka-nat/pytorch-hdml |
Framework | pytorch |
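The central manipulation, interpolating a negative embedding toward the anchor to raise its hard level, can be sketched as below. The interpolation schedule and the generator that maps synthetics back to label-preserving points are simplified away, so this is a reading of the idea rather than the HDML pipeline.

```python
import torch
import torch.nn.functional as F

def harden_negative(anchor, negative, pull=0.5):
    """Linearly interpolate a negative embedding toward the anchor to raise its
    'hard level' (pull = 0 keeps the original negative; pull -> 1 is hardest).
    Simplified sketch; the full HDML method adds a generator and verification."""
    synthetic = anchor + (1.0 - pull) * (negative - anchor)
    return F.normalize(synthetic, dim=-1)       # keep embeddings on the unit sphere

anchor = F.normalize(torch.randn(8, 128), dim=-1)      # hypothetical batch of anchors
negative = F.normalize(torch.randn(8, 128), dim=-1)

easy = harden_negative(anchor, negative, pull=0.1)
hard = harden_negative(anchor, negative, pull=0.7)

# Harder synthetics sit closer to the anchor, hence larger cosine similarity.
print(F.cosine_similarity(anchor, easy).mean().item(),
      F.cosine_similarity(anchor, hard).mean().item())
```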
Venue Analytics: A Simple Alternative to Citation-Based Metrics
Title | Venue Analytics: A Simple Alternative to Citation-Based Metrics |
Authors | Leonid Keselman |
Abstract | We present a method for automatically organizing and evaluating the quality of different publishing venues in Computer Science. Since this method only requires paper publication data as its input, we can demonstrate our method on a large portion of the DBLP dataset, spanning 50 years, with millions of authors and thousands of publishing venues. By formulating venue authorship as a regression problem and targeting metrics of interest, we obtain venue scores for every conference and journal in our dataset. The obtained scores can also provide a per-year model of conference quality, showing how fields develop and change over time. Additionally, these venue scores can be used to evaluate individual academic authors and academic institutions. We show that using venue scores to evaluate both authors and institutions produces quantitative measures that are comparable to approaches using citations or peer assessment. In contrast to many other existing evaluation metrics, our use of large-scale, openly available data enables this approach to be repeatable and transparent. To help others build upon this work, all of our code and data are available at https://github.com/leonidk/venue_scores |
Tasks | |
Published | 2019-04-29 |
URL | https://arxiv.org/abs/1904.12573v2 |
PDF | https://arxiv.org/pdf/1904.12573v2.pdf |
PWC | https://paperswithcode.com/paper/venue-analytics-a-simple-alternative-to |
Repo | https://github.com/leonidk/venue_scores |
Framework | none |
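The regression formulation can be sketched directly: represent each author as a bag of per-venue publication counts, regress a target metric onto those counts, and read the learned coefficients as venue scores. The data and target below are synthetic placeholders, not the DBLP setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n_authors, n_venues = 500, 20

# Feature matrix: X[a, v] = number of papers author a published at venue v (synthetic).
X = rng.poisson(1.0, size=(n_authors, n_venues))

# Hidden "true" venue quality, used only to simulate a per-author target metric.
true_quality = rng.normal(size=n_venues)
y = X @ true_quality + rng.normal(scale=0.5, size=n_authors)

# Ridge-regularized least squares: the learned coefficients act as venue scores.
lam = 1.0
scores = np.linalg.solve(X.T @ X + lam * np.eye(n_venues), X.T @ y)

ranking = np.argsort(-scores)
print("top venues by learned score:", ranking[:5])
```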
Canonicalizing Knowledge Base Literals
Title | Canonicalizing Knowledge Base Literals |
Authors | Jiaoyan Chen, Ernesto Jimenez-Ruiz, Ian Horrocks |
Abstract | Ontology-based knowledge bases (KBs) like DBpedia are very valuable resources, but their usefulness and usability are limited by various quality issues. One such issue is the use of string literals instead of semantically typed entities. In this paper we study the automated canonicalization of such literals, i.e., replacing the literal with an existing entity from the KB or with a new entity that is typed using classes from the KB. We propose a framework that combines both reasoning and machine learning in order to predict the relevant entities and types, and we evaluate this framework against state-of-the-art baselines for both semantic typing and entity matching. |
Tasks | |
Published | 2019-06-26 |
URL | https://arxiv.org/abs/1906.11180v1 |
PDF | https://arxiv.org/pdf/1906.11180v1.pdf |
PWC | https://paperswithcode.com/paper/canonicalizing-knowledge-base-literals |
Repo | https://github.com/ChenJiaoyan/KG_Curation |
Framework | none |
Augment your batch: better training with larger batches
Title | Augment your batch: better training with larger batches |
Authors | Elad Hoffer, Tal Ben-Nun, Itay Hubara, Niv Giladi, Torsten Hoefler, Daniel Soudry |
Abstract | Large-batch SGD is important for scaling training of deep neural networks. However, without fine-tuning hyperparameter schedules, the generalization of the model may be hampered. We propose to use batch augmentation: replicating instances of samples within the same batch with different data augmentations. Batch augmentation acts as a regularizer and an accelerator, increasing both generalization and performance scaling. We analyze the effect of batch augmentation on gradient variance and show that it empirically improves convergence for a wide variety of deep neural networks and datasets. Our results show that batch augmentation reduces the number of necessary SGD updates to achieve the same accuracy as the state-of-the-art. Overall, this simple yet effective method enables faster training and better generalization by allowing more computational resources to be used concurrently. |
Tasks | |
Published | 2019-01-27 |
URL | http://arxiv.org/abs/1901.09335v1 |
PDF | http://arxiv.org/pdf/1901.09335v1.pdf |
PWC | https://paperswithcode.com/paper/augment-your-batch-better-training-with |
Repo | https://github.com/vaapopescu/gradient-pruning |
Framework | pytorch |
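Batch augmentation is straightforward to express: replicate each sample M times within the batch, apply an independent augmentation to each copy, and take a single optimizer step on the enlarged batch. A minimal PyTorch sketch with a placeholder augmentation (not the authors' training code):

```python
import torch
import torch.nn as nn

def augment(x):
    """Placeholder augmentation: additive Gaussian noise. In practice this would be
    random crops/flips/etc., drawn independently for each copy."""
    return x + 0.1 * torch.randn_like(x)

def batch_augmented_step(model, criterion, optimizer, x, y, m=4):
    """One SGD step on a batch in which every sample appears m times,
    each copy with a different augmentation."""
    x_rep = x.repeat_interleave(m, dim=0)       # (batch*m, ...)
    y_rep = y.repeat_interleave(m, dim=0)
    x_aug = augment(x_rep)
    optimizer.zero_grad()
    loss = criterion(model(x_aug), y_rep)
    loss.backward()
    optimizer.step()
    return loss.item()

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))   # toy classifier
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(32, 1, 28, 28)                                # hypothetical image batch
y = torch.randint(0, 10, (32,))
print(batch_augmented_step(model, nn.CrossEntropyLoss(), opt, x, y))
```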
Encoding Database Schemas with Relation-Aware Self-Attention for Text-to-SQL Parsers
Title | Encoding Database Schemas with Relation-Aware Self-Attention for Text-to-SQL Parsers |
Authors | Richard Shin |
Abstract | When translating natural language questions into SQL queries to answer questions from a database, we would like our methods to generalize to domains and database schemas outside of the training set. To handle complex questions and database schemas with a neural encoder-decoder paradigm, it is critical to properly encode the schema as part of the input with the question. In this paper, we use relation-aware self-attention within the encoder so that it can reason about how the tables and columns in the provided schema relate to each other and use this information in interpreting the question. We achieve significant gains on the recently-released Spider dataset with 42.94% exact match accuracy, compared to the 18.96% reported in published work. |
Tasks | Text-to-SQL |
Published | 2019-06-27 |
URL | https://arxiv.org/abs/1906.11790v1 |
PDF | https://arxiv.org/pdf/1906.11790v1.pdf |
PWC | https://paperswithcode.com/paper/encoding-database-schemas-with-relation-aware |
Repo | https://github.com/rshin/seq2struct |
Framework | tf |
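Relation-aware self-attention (in the style of Shaw et al., on which this kind of encoder builds) adds a learned embedding of the relation between positions i and j to the keys and values of position j. A single-head sketch, with hypothetical relation ids standing in for schema-linking relations such as column-belongs-to-table:

```python
import torch
import torch.nn as nn

class RelationAwareAttention(nn.Module):
    """Single-head self-attention where each pair (i, j) has a relation id whose
    learned embedding is added to the key and value of position j. Minimal sketch."""
    def __init__(self, d_model, n_relations):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.rel_k = nn.Embedding(n_relations, d_model)
        self.rel_v = nn.Embedding(n_relations, d_model)
        self.scale = d_model ** 0.5

    def forward(self, x, relations):
        # x: (seq, d_model); relations: (seq, seq) integer relation ids.
        q, k, v = self.q(x), self.k(x), self.v(x)
        rk, rv = self.rel_k(relations), self.rel_v(relations)     # (seq, seq, d_model)
        # e_ij = q_i . (k_j + r^K_ij) / sqrt(d)
        scores = (q.unsqueeze(1) * (k.unsqueeze(0) + rk)).sum(-1) / self.scale
        alpha = scores.softmax(dim=-1)                            # (seq, seq)
        # z_i = sum_j alpha_ij (v_j + r^V_ij)
        return (alpha.unsqueeze(-1) * (v.unsqueeze(0) + rv)).sum(dim=1)

attn = RelationAwareAttention(d_model=64, n_relations=8)
x = torch.randn(10, 64)                       # 10 question/schema items (hypothetical)
relations = torch.randint(0, 8, (10, 10))     # pairwise relation ids
print(attn(x, relations).shape)               # torch.Size([10, 64])
```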
SHACL Constraints with Inference Rules
Title | SHACL Constraints with Inference Rules |
Authors | Paolo Pareti, George Konstantinidis, Timothy J. Norman, Murat Şensoy |
Abstract | The Shapes Constraint Language (SHACL) has been recently introduced as a W3C recommendation to define constraints that can be validated against RDF graphs. Interactions of SHACL with other Semantic Web technologies, such as ontologies or reasoners, are a matter of ongoing research. In this paper we study the interaction of a subset of SHACL with inference rules expressed in datalog. On the one hand, SHACL constraints can be used to define a “schema” for graph datasets. On the other hand, inference rules can lead to the discovery of new facts that do not match the original schema. Given a set of SHACL constraints and a set of datalog rules, we present a method to detect which constraints could be violated by the application of the inference rules on some graph instance of the schema, and to update the original schema, i.e., the set of SHACL constraints, in order to capture the new facts that can be inferred. We provide theoretical and experimental results of the various components of our approach. |
Tasks | |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00598v1 |
PDF | https://arxiv.org/pdf/1911.00598v1.pdf |
PWC | https://paperswithcode.com/paper/shacl-constraints-with-inference-rules |
Repo | https://github.com/paolo7/ISWC2019-code |
Framework | none |
Tracing cultural diachronic semantic shifts in Russian using word embeddings: test sets and baselines
Title | Tracing cultural diachronic semantic shifts in Russian using word embeddings: test sets and baselines |
Authors | Vadim Fomin, Daria Bakshandaeva, Julia Rodina, Andrey Kutuzov |
Abstract | The paper introduces manually annotated test sets for the task of tracing diachronic (temporal) semantic shifts in Russian. The two test sets are complementary in that the first one covers comparatively strong semantic changes occurring to nouns and adjectives from pre-Soviet to Soviet times, while the second one covers comparatively subtle socially and culturally determined shifts occurring in years from 2000 to 2014. Additionally, the second test set offers more granular classification of shifts degree, but is limited to only adjectives. The introduction of the test sets allowed us to evaluate several well-established algorithms of semantic shifts detection (posing this as a classification problem), most of which have never been tested on Russian material. All of these algorithms use distributional word embedding models trained on the corresponding in-domain corpora. The resulting scores provide solid comparison baselines for future studies tackling similar tasks. We publish the datasets, code and the trained models in order to facilitate further research in automatically detecting temporal semantic shifts for Russian words, with time periods of different granularities. |
Tasks | Word Embeddings |
Published | 2019-05-16 |
URL | https://arxiv.org/abs/1905.06837v2 |
PDF | https://arxiv.org/pdf/1905.06837v2.pdf |
PWC | https://paperswithcode.com/paper/tracing-cultural-diachronic-semantic-shifts |
Repo | https://github.com/wadimiusz/diachrony_for_russian |
Framework | none |
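A common baseline for this kind of task, and one plausible reading of the "well-established algorithms" mentioned above, compares a word's vectors across two period-specific embedding models after aligning the spaces with orthogonal Procrustes; larger cosine distance after alignment suggests a stronger shift. A numpy sketch on synthetic vectors (not the paper's data or code):

```python
import numpy as np

def procrustes_align(A, B):
    """Find the orthogonal matrix R minimizing ||A R - B||_F (orthogonal Procrustes),
    rotating period-1 embeddings A into the space of period-2 embeddings B."""
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt

def semantic_shift_scores(A, B):
    """Cosine distance per word between aligned period-1 and period-2 vectors."""
    A_aligned = A @ procrustes_align(A, B)
    cos = np.sum(A_aligned * B, axis=1) / (
        np.linalg.norm(A_aligned, axis=1) * np.linalg.norm(B, axis=1))
    return 1.0 - cos

rng = np.random.default_rng(0)
n_words, dim = 200, 100
A = rng.normal(size=(n_words, dim))                     # embeddings from period 1
Q = np.linalg.qr(rng.normal(size=(dim, dim)))[0]
B = A @ Q                                               # period 2: same meanings, rotated space
shifted = [3, 17, 42]                                   # pretend these words changed meaning
B[shifted] += rng.normal(scale=3.0, size=(len(shifted), dim))

scores = semantic_shift_scores(A, B)
print("most shifted word indices:", np.argsort(-scores)[:3])   # should recover 3, 17, 42
```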
Stochastic Beams and Where to Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement
Title | Stochastic Beams and Where to Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement |
Authors | Wouter Kool, Herke van Hoof, Max Welling |
Abstract | The well-known Gumbel-Max trick for sampling from a categorical distribution can be extended to sample $k$ elements without replacement. We show how to implicitly apply this ‘Gumbel-Top-$k$’ trick on a factorized distribution over sequences, allowing us to draw exact samples without replacement using a Stochastic Beam Search. Even for exponentially large domains, the number of model evaluations grows only linearly in $k$ and the maximum sampled sequence length. The algorithm creates a theoretical connection between sampling and (deterministic) beam search and can be used as a principled intermediate alternative. In a translation task, the proposed method compares favourably against alternatives to obtain diverse yet good quality translations. We show that sequences sampled without replacement can be used to construct low-variance estimators for expected sentence-level BLEU score and model entropy. |
Tasks | |
Published | 2019-03-14 |
URL | https://arxiv.org/abs/1903.06059v2 |
PDF | https://arxiv.org/pdf/1903.06059v2.pdf |
PWC | https://paperswithcode.com/paper/stochastic-beams-and-where-to-find-them-the |
Repo | https://github.com/wouterkool/estimating-gradients-without-replacement |
Framework | pytorch |
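The Gumbel-Top-$k$ trick itself is compact: perturb the (unnormalized) log-probabilities with i.i.d. Gumbel noise and keep the indices of the $k$ largest values, which yields an exact sample of $k$ elements without replacement. A numpy sketch of the flat (non-sequence) case; the stochastic beam search extension is not shown:

```python
import numpy as np

def gumbel_top_k(log_probs, k, rng):
    """Sample k distinct indices without replacement from a categorical
    distribution given its (unnormalized) log-probabilities."""
    gumbel = -np.log(-np.log(rng.uniform(size=log_probs.shape)))
    return np.argsort(-(log_probs + gumbel))[:k]    # indices of the k largest perturbed values

rng = np.random.default_rng(0)
logits = np.log(np.array([0.5, 0.3, 0.1, 0.05, 0.05]))

# With k = 1 this reduces to the Gumbel-Max trick: the sampled index follows
# the categorical distribution exactly.
counts = np.zeros(5)
for _ in range(10000):
    counts[gumbel_top_k(logits, k=1, rng=rng)[0]] += 1
print(counts / 10000)                    # roughly [0.5, 0.3, 0.1, 0.05, 0.05]

print(gumbel_top_k(logits, k=3, rng=rng))   # three distinct indices, without replacement
```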
DeepNovoV2: Better de novo peptide sequencing with deep learning
Title | DeepNovoV2: Better de novo peptide sequencing with deep learning |
Authors | Rui Qiao, Ngoc Hieu Tran, Lei Xin, Baozhen Shan, Ming Li, Ali Ghodsi |
Abstract | Personalized cancer vaccines are envisioned as the next generation rational cancer immunotherapy. The key step in developing personalized therapeutic cancer vaccines is to identify tumor-specific neoantigens that are on the surface of tumor cells. A promising method for this is through de novo peptide sequencing from mass spectrometry data. In this paper we introduce DeepNovoV2, the state-of-the-art model for peptide sequencing. In DeepNovoV2, a spectrum is directly represented as a set of (m/z, intensity) pairs, therefore it does not suffer from the accuracy-speed/memory trade-off problem. The model combines an order invariant network structure (T-Net) and recurrent neural networks and provides a complete end-to-end training and prediction framework to sequence patterns of peptides. Our experiments on a wide variety of data from different species show that DeepNovoV2 outperforms previous state-of-the-art methods, achieving 13.01-23.95% higher accuracy at the peptide level. |
Tasks | |
Published | 2019-04-17 |
URL | https://arxiv.org/abs/1904.08514v2 |
PDF | https://arxiv.org/pdf/1904.08514v2.pdf |
PWC | https://paperswithcode.com/paper/deepnovov2-better-de-novo-peptide-sequencing |
Repo | https://github.com/volpato30/DeepNovoV2 |
Framework | pytorch |
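The order-invariant treatment of a spectrum as a set of (m/z, intensity) pairs can be sketched with a PointNet-style set encoder: a shared per-peak MLP followed by symmetric max pooling. This is an analogue of the order-invariant idea as read from the abstract, with placeholder sizes, not the DeepNovoV2 architecture.

```python
import torch
import torch.nn as nn

class SpectrumSetEncoder(nn.Module):
    """Order-invariant encoder for a spectrum given as a set of (m/z, intensity)
    pairs: a shared per-peak MLP followed by max pooling. Sizes are placeholders."""
    def __init__(self, hidden=64, out=128):
        super().__init__()
        self.peak_mlp = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(),
            nn.Linear(hidden, out), nn.ReLU(),
        )

    def forward(self, peaks):                  # peaks: (batch, n_peaks, 2)
        per_peak = self.peak_mlp(peaks)        # (batch, n_peaks, out)
        return per_peak.max(dim=1).values      # symmetric pooling -> (batch, out)

enc = SpectrumSetEncoder()
spectrum = torch.rand(4, 300, 2)               # 4 spectra, 300 (m/z, intensity) peaks each
emb = enc(spectrum)

# Permuting the peaks leaves the embedding unchanged (order invariance).
perm = torch.randperm(300)
assert torch.allclose(emb, enc(spectrum[:, perm, :]), atol=1e-6)
print(emb.shape)                               # torch.Size([4, 128])
```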