Paper Group ANR 1075
Mixture-Model-based Bounding Box Density Estimation for Object Detection. Auditing ImageNet: Towards a Model-driven Framework for Annotating Demographic Attributes of Large-Scale Image Datasets. Is a Single Vector Enough? Exploring Node Polysemy for Network Embedding. Knowledge Graph Embedding Bi-Vector Models for Symmetric Relation. MinWikiSplit: …
Mixture-Model-based Bounding Box Density Estimation for Object Detection
Title | Mixture-Model-based Bounding Box Density Estimation for Object Detection |
Authors | Jaeyoung Yoo, Geonseok Seo, Inseop Chung, Nojun Kwak |
Abstract | In this paper, we reformulate the multi-object detection task as density estimation of bounding boxes. We propose a new object detection network, Mixture-Model-based Object Detector (MMOD), that performs multi-object detection through density estimation using a mixture model. MMOD captures the conditional distribution of bounding boxes for a given input image using a mixture model consisting of Gaussian and categorical distributions. In doing so, we also propose a new network structure and objective function for MMOD. MMOD is not trained by assigning a ground truth bounding box to specific locations of the network’s output. Instead, the mixture components are automatically learned to represent the distribution of the bounding boxes through density estimation. In this way, MMOD is not only trained without ground truth assignment but also does not suffer from the foreground-background imbalance problem, since background bounding boxes are stochastically sampled from the mixture model that estimates the ground-truth bounding box distribution. We applied MMOD to the MS COCO and Pascal VOC datasets, and observed that MMOD outperforms other detection methods in terms of the trade-off between speed and performance. Code will be available. |
Tasks | Density Estimation, Object Detection, Real-Time Object Detection |
Published | 2019-11-28 |
URL | https://arxiv.org/abs/1911.12721v2 |
PDF | https://arxiv.org/pdf/1911.12721v2.pdf |
PWC | https://paperswithcode.com/paper/mixture-model-based-bounding-box-density |
Repo | |
Framework | |
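The abstract above frames detection as estimating a mixture density over bounding boxes rather than assigning ground truth to specific output locations. Below is a minimal NumPy sketch of such a mixture likelihood, as an illustration of the general idea only; it is not the authors' MMOD network or objective, and in the real model the mixture parameters would be predicted per image by the detection network.

```python
# Sketch: ground-truth boxes treated as samples from a mixture whose components
# combine a Gaussian over box coordinates with a categorical over class labels.
# A detector in this spirit would be trained by minimising this NLL.
import numpy as np

def mixture_box_nll(boxes, labels, pi, mu, sigma, cls_probs):
    """Negative log-likelihood of ground-truth (box, label) pairs under a mixture.

    boxes:     (N, 4) ground-truth boxes (cx, cy, w, h)
    labels:    (N,)   integer class labels
    pi:        (K,)   mixture weights, sums to 1
    mu:        (K, 4) Gaussian means per component
    sigma:     (K, 4) Gaussian std-devs per component (diagonal covariance)
    cls_probs: (K, C) categorical class probabilities per component
    """
    diff = boxes[:, None, :] - mu[None, :, :]                       # (N, K, 4)
    log_gauss = -0.5 * np.sum((diff / sigma) ** 2 + np.log(2 * np.pi * sigma ** 2), axis=-1)
    log_cat = np.log(cls_probs[:, labels]).T                        # (N, K)
    log_joint = np.log(pi)[None, :] + log_gauss + log_cat           # (N, K)
    # log-sum-exp over components, then average over ground-truth samples
    m = log_joint.max(axis=1, keepdims=True)
    log_lik = m.squeeze(1) + np.log(np.exp(log_joint - m).sum(axis=1))
    return -log_lik.mean()

# Toy usage with random parameters (hypothetical; a real network predicts them per image).
rng = np.random.default_rng(0)
K, C = 8, 3
pi = np.full(K, 1.0 / K)
mu = rng.uniform(0, 1, (K, 4))
sigma = np.full((K, 4), 0.1)
cls_probs = rng.dirichlet(np.ones(C), size=K)
print(mixture_box_nll(rng.uniform(0, 1, (5, 4)), rng.integers(0, C, 5), pi, mu, sigma, cls_probs))
```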
Auditing ImageNet: Towards a Model-driven Framework for Annotating Demographic Attributes of Large-Scale Image Datasets
Title | Auditing ImageNet: Towards a Model-driven Framework for Annotating Demographic Attributes of Large-Scale Image Datasets |
Authors | Chris Dulhanty, Alexander Wong |
Abstract | The ImageNet dataset ushered in a flood of academic and industry interest in deep learning for computer vision applications. Despite its significant impact, there has not been a comprehensive investigation into the demographic attributes of images contained within the dataset. Such a study could lead to new insights on inherent biases within ImageNet, which is particularly important given that it is frequently used to pretrain models for a wide variety of computer vision tasks. In this work, we introduce a model-driven framework for the automatic annotation of apparent age and gender attributes in large-scale image datasets. Using this framework, we conduct the first demographic audit of the 2012 ImageNet Large Scale Visual Recognition Challenge (ILSVRC) subset of ImageNet and the “person” hierarchical category of ImageNet. We find that 41.62% of faces in ILSVRC appear as female, 1.71% appear as individuals above the age of 60, and males aged 15 to 29 account for the largest subgroup with 27.11%. We note that the presented model-driven framework is not fair for all intersectional groups, so annotations are subject to bias. We present this work as the starting point for future development of unbiased annotation models and for the study of downstream effects of imbalances in the demographics of ImageNet. Code and annotations are available at: http://bit.ly/ImageNetDemoAudit |
Tasks | Object Recognition |
Published | 2019-05-03 |
URL | https://arxiv.org/abs/1905.01347v2 |
PDF | https://arxiv.org/pdf/1905.01347v2.pdf |
PWC | https://paperswithcode.com/paper/auditing-imagenet-towards-a-model-driven |
Repo | |
Framework | |
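The framework described above plugs face detection and apparent age/gender models into an annotation pipeline and then aggregates per-face predictions into dataset-level statistics. Only the aggregation step is sketched below; the `predictions` table, its column names, and the age buckets are hypothetical stand-ins, and no real model outputs or ImageNet data are used.

```python
# Sketch of aggregating per-face demographic predictions into audit statistics.
# The face detector and attribute classifiers are assumed to exist upstream.
import pandas as pd

# Hypothetical per-face annotations: one row per detected face.
predictions = pd.DataFrame({
    "image_id":        ["n0001_1", "n0001_2", "n0002_1", "n0003_1"],
    "apparent_gender": ["female", "male", "male", "female"],
    "apparent_age":    [34, 22, 67, 19],
})

# Share of faces that appear female.
female_share = (predictions["apparent_gender"] == "female").mean()

# Age-bucketed breakdown (e.g. the 15-29 male subgroup highlighted in the abstract).
bins = [0, 14, 29, 44, 59, 120]
age_labels = ["0-14", "15-29", "30-44", "45-59", "60+"]
predictions["age_group"] = pd.cut(predictions["apparent_age"], bins=bins, labels=age_labels)
subgroup_share = (
    predictions.groupby(["apparent_gender", "age_group"], observed=True).size()
    / len(predictions)
)

print(f"female share: {female_share:.2%}")
print(subgroup_share.sort_values(ascending=False))
```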
Is a Single Vector Enough? Exploring Node Polysemy for Network Embedding
Title | Is a Single Vector Enough? Exploring Node Polysemy for Network Embedding |
Authors | Ninghao Liu, Qiaoyu Tan, Yuening Li, Hongxia Yang, Jingren Zhou, Xia Hu |
Abstract | Networks have been widely used as the data structure for abstracting real-world systems as well as organizing the relations among entities. Network embedding models are powerful tools for mapping nodes in a network into continuous vector-space representations in order to facilitate subsequent tasks such as classification and link prediction. Existing network embedding models comprehensively integrate all information of each node, such as links and attributes, into a single embedding vector to represent the node’s general role in the network. However, a real-world entity could be multifaceted, connecting to different neighborhoods due to different motives or self-characteristics that are not necessarily correlated. For example, in a movie recommender system, a user may love comedies and horror movies simultaneously, but it is not likely that these two types of movies are mutually close in the embedding space, nor could the user embedding vector be sufficiently close to both at the same time. In this paper, we propose a polysemous embedding approach for modeling multiple facets of nodes, as motivated by the phenomenon of word polysemy in language modeling. Each facet of a node is mapped to an embedding vector, and we also maintain the degree of association between each node and each of its facets. The proposed method is adaptive to various existing embedding models, without significantly complicating the optimization process. We also discuss how to engage embedding vectors of different facets for inference tasks including classification and link prediction. Experiments on real-world datasets help comprehensively evaluate the performance of the proposed method. |
Tasks | Language Modelling, Link Prediction, Network Embedding, Recommendation Systems |
Published | 2019-05-25 |
URL | https://arxiv.org/abs/1905.10668v1 |
PDF | https://arxiv.org/pdf/1905.10668v1.pdf |
PWC | https://paperswithcode.com/paper/is-a-single-vector-enough-exploring-node |
Repo | |
Framework | |
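A small NumPy sketch of how the multi-facet representation described above could be stored and scored: each node holds several facet embeddings plus node-facet association degrees, and a candidate link is scored by association-weighted facet-pair similarity. This is one reading of the general idea, not the paper's optimization procedure; all sizes and the scoring form are illustrative assumptions.

```python
# Sketch: polysemous node representations with per-facet embeddings and
# node-facet association weights, used for link scoring.
import numpy as np

rng = np.random.default_rng(1)
n_nodes, n_facets, dim = 100, 4, 16

facet_emb = rng.normal(size=(n_nodes, n_facets, dim))       # one vector per (node, facet)
assoc = rng.dirichlet(np.ones(n_facets), size=n_nodes)      # node-facet association degrees

def link_score(u, v):
    """Association-weighted similarity over all facet pairs of nodes u and v."""
    sims = facet_emb[u] @ facet_emb[v].T                     # (n_facets, n_facets) dot products
    weights = np.outer(assoc[u], assoc[v])                   # joint facet-pair weights
    return float(np.sum(weights * sims))

def best_facet_pair(u, v):
    """Which facets of u and v would be most responsible for a potential link."""
    sims = facet_emb[u] @ facet_emb[v].T
    return np.unravel_index(np.argmax(np.outer(assoc[u], assoc[v]) * sims), sims.shape)

print(link_score(3, 7), best_facet_pair(3, 7))
```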
Knowledge Graph Embedding Bi-Vector Models for Symmetric Relation
Title | Knowledge Graph Embedding Bi-Vector Models for Symmetric Relation |
Authors | Jinkui Yao, Lianghua Xu |
Abstract | Knowledge graph embedding (KGE) models have been proposed to improve the performance of knowledge graph reasoning. However, most KGEs exhibit a general phenomenon: as training progresses, the embeddings of symmetric relations tend toward the zero vector if the ratio of symmetric triples in the dataset is high enough. This phenomenon causes subsequent tasks on symmetric relations, e.g. link prediction, to fail. The root cause of the problem is that KGEs do not utilize the semantic information of symmetric relations. We propose KGE bi-vector models, which represent a symmetric relation as a pair of vectors, significantly increasing the capability to process symmetric relations. We generate benchmark datasets based on FB15k and WN18 by completing the symmetric relation triples in order to verify our models. The experimental results clearly affirm the effectiveness and superiority of our models over the baselines. |
Tasks | Graph Embedding, Knowledge Graph Embedding, Link Prediction |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.09557v1 |
PDF | https://arxiv.org/pdf/1905.09557v1.pdf |
PWC | https://paperswithcode.com/paper/knowledge-graph-embedding-bi-vector-models |
Repo | |
Framework | |
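The abstract says symmetric relations are represented as a vector pair so that they no longer collapse to the zero vector. One hedged way to illustrate why a pair helps (an assumption for illustration; the paper's exact scoring function may differ) is a TransE-style score that accepts whichever member of the pair fits a triple better:

```python
# Sketch: a symmetric relation stored as a pair (r1, r2) with r2 = -r1, scored
# under TransE with the better-fitting member, so both triple directions can fit.
import numpy as np

def transe_score(h, r, t):
    """Standard TransE plausibility: smaller ||h + r - t|| means more plausible."""
    return -np.linalg.norm(h + r - t)

def bivector_score(h, r_pair, t):
    """Score a triple against both relation vectors and keep the better fit."""
    r1, r2 = r_pair
    return max(transe_score(h, r1, t), transe_score(h, r2, t))

rng = np.random.default_rng(0)
h, t = rng.normal(size=8), rng.normal(size=8)
r1 = t - h                      # explains (h, r, t) exactly
r_pair = (r1, -r1)              # the paired vector explains the reversed triple

print(bivector_score(h, r_pair, t))   # ~0 (maximal): forward direction fits r1
print(bivector_score(t, r_pair, h))   # ~0 as well: reverse direction fits -r1
# With a single relation vector, forcing both directions to fit pushes r toward zero.
```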
MinWikiSplit: A Sentence Splitting Corpus with Minimal Propositions
Title | MinWikiSplit: A Sentence Splitting Corpus with Minimal Propositions |
Authors | Christina Niklaus, Andre Freitas, Siegfried Handschuh |
Abstract | We compiled a new sentence splitting corpus that is composed of 203K pairs of aligned complex source and simplified target sentences. Contrary to previously proposed text simplification corpora, which contain only a small number of split examples, we present a dataset where each input sentence is broken down into a set of minimal propositions, i.e. a sequence of sound, self-contained utterances with each of them presenting a minimal semantic unit that cannot be further decomposed into meaningful propositions. This corpus is useful for developing sentence splitting approaches that learn how to transform sentences with a complex linguistic structure into a fine-grained representation of short sentences that present a simple and more regular structure which is easier to process for downstream applications and thus facilitates and improves their performance. |
Tasks | Text Simplification |
Published | 2019-09-26 |
URL | https://arxiv.org/abs/1909.12131v1 |
PDF | https://arxiv.org/pdf/1909.12131v1.pdf |
PWC | https://paperswithcode.com/paper/minwikisplit-a-sentence-splitting-corpus-with |
Repo | |
Framework | |
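For concreteness, here is a hypothetical example of what one aligned pair in such a corpus looks like: a complex source sentence paired with its minimal propositions. The field names, the sample sentences, and the `<SPLIT>` separator are illustrative assumptions, not the corpus's actual schema or content.

```python
# Sketch: one aligned complex-source / minimal-propositions pair and how it might
# be serialised as a sequence-to-sequence training target.
from dataclasses import dataclass
from typing import List

@dataclass
class SplitExample:
    complex_source: str                 # the original complex sentence
    minimal_propositions: List[str]     # self-contained, minimal target sentences

example = SplitExample(
    complex_source=(
        "The museum, which opened in 1950, attracts thousands of visitors "
        "who come to see its collection of modern art."
    ),
    minimal_propositions=[
        "The museum opened in 1950.",
        "The museum attracts thousands of visitors.",
        "The visitors come to see the museum's collection of modern art.",
    ],
)

# A sentence-splitting model would learn to map complex_source to the proposition
# list, e.g. as a single target sequence joined by a separator token.
target_sequence = " <SPLIT> ".join(example.minimal_propositions)
print(target_sequence)
```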
SpeechBERT: Cross-Modal Pre-trained Language Model for End-to-end Spoken Question Answering
Title | SpeechBERT: Cross-Modal Pre-trained Language Model for End-to-end Spoken Question Answering |
Authors | Yung-Sung Chuang, Chi-Liang Liu, Hung-Yi Lee |
Abstract | While end-to-end models for spoken language understanding tasks have been explored recently, there is still no end-to-end model for spoken question answering (SQA) tasks, which would be catastrophically influenced by speech recognition errors. Meanwhile, pre-trained language models, such as BERT, have performed successfully in text question answering. To bring this advantage of pre-trained language models into spoken question answering, we propose SpeechBERT, a cross-modal transformer-based pre-trained language model. Our model can outperform conventional approaches on the dataset which contains both correctly recognized answers and incorrectly recognized answers. Our experimental results show the potential of end-to-end SQA models. |
Tasks | Language Modelling, Question Answering, Speech Recognition, Spoken Language Understanding |
Published | 2019-10-25 |
URL | https://arxiv.org/abs/1910.11559v2 |
PDF | https://arxiv.org/pdf/1910.11559v2.pdf |
PWC | https://paperswithcode.com/paper/speechbert-cross-modal-pre-trained-language |
Repo | |
Framework | |
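A simplified structural sketch of the cross-modal setup the abstract describes: question-token embeddings and audio-segment embeddings are encoded jointly by a transformer, and an answer span is predicted over the audio positions. The dimensions, the acoustic-feature projection, and the span head are assumptions for illustration; this is not SpeechBERT's actual architecture or pre-training recipe.

```python
import torch
import torch.nn as nn

class CrossModalSpanQA(nn.Module):
    """Joint encoder over question-token and audio-segment embeddings with a
    start/end span head over the audio positions (a structural sketch only)."""

    def __init__(self, d_model=256, n_heads=4, n_layers=2, vocab_size=1000):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.audio_proj = nn.Linear(80, d_model)     # project e.g. 80-dim acoustic features
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.span_head = nn.Linear(d_model, 2)       # start / end logits

    def forward(self, question_ids, audio_feats):
        # (positional and segment embeddings omitted for brevity)
        q = self.tok_emb(question_ids)               # (B, Lq, d)
        a = self.audio_proj(audio_feats)             # (B, La, d)
        h = self.encoder(torch.cat([q, a], dim=1))   # joint encoding of both modalities
        audio_h = h[:, question_ids.size(1):]        # keep only the audio positions
        start_logits, end_logits = self.span_head(audio_h).unbind(dim=-1)
        return start_logits, end_logits              # each (B, La)

model = CrossModalSpanQA()
start, end = model(torch.randint(0, 1000, (2, 12)), torch.randn(2, 50, 80))
print(start.shape, end.shape)                        # torch.Size([2, 50]) twice
```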
Evaluation Function Approximation for Scrabble
Title | Evaluation Function Approximation for Scrabble |
Authors | Rishabh Agarwal |
Abstract | The current state-of-the-art Scrabble agents are not learning-based but depend on truncated Monte Carlo simulations, and the quality of such agents is contingent upon the time available for running the simulations. This thesis takes steps towards building a learning-based Scrabble agent using self-play. Specifically, we try to find a better function approximation for the static evaluation function used in Scrabble, which determines the goodness of a move at a given board configuration. In this work, we experimented with evolutionary algorithms and Bayesian Optimization to learn the weights for an approximate feature-based evaluation function. However, these optimization methods were not quite effective, which led us to explore the given problem from an Imitation Learning point of view. We also tried to imitate the ranking of moves produced by the Quackle simulation agent using supervised learning with a neural network function approximator which takes the raw representation of the Scrabble board as input instead of using only a fixed number of handcrafted features. |
Tasks | Imitation Learning |
Published | 2019-01-25 |
URL | http://arxiv.org/abs/1901.08728v1 |
PDF | http://arxiv.org/pdf/1901.08728v1.pdf |
PWC | https://paperswithcode.com/paper/evaluation-function-approximation-for |
Repo | |
Framework | |
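The thesis searches for weights of a feature-based static evaluation function with evolutionary methods. A minimal sketch of that loop, a linear evaluation over handcrafted move features tuned by a (1+1) evolutionary strategy, is below; `N_FEATURES`, the synthetic fitness function, and all numbers are hypothetical stand-ins for the thesis' actual features and self-play evaluation.

```python
# Sketch: learn weights of a linear move-evaluation function with a simple
# mutate-and-keep-if-better evolutionary loop.
import numpy as np

rng = np.random.default_rng(0)
N_FEATURES = 5   # e.g. score gained, tiles used, rack-leave quality, board openness, ...

def evaluate_move(weights, move_features):
    """Static evaluation: a weighted sum of handcrafted move features."""
    return float(np.dot(weights, move_features))

def fitness(weights):
    """Stand-in for 'win rate of an agent using these weights in self-play'.
    Here: how often the weighted evaluation ranks a synthetic 'best' move first."""
    wins = 0
    for _ in range(200):
        moves = rng.normal(size=(10, N_FEATURES))
        best = np.argmax(moves.sum(axis=1))          # synthetic ground-truth ranking
        wins += int(np.argmax(moves @ weights) == best)
    return wins / 200

weights = rng.normal(size=N_FEATURES)
for step in range(50):                               # (1+1)-ES: mutate, keep if better
    candidate = weights + rng.normal(scale=0.3, size=N_FEATURES)
    if fitness(candidate) >= fitness(weights):
        weights = candidate

print("learned weights:", np.round(weights, 2), "fitness:", fitness(weights))
```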
An image structure model for exact edge detection
Title | An image structure model for exact edge detection |
Authors | Alessandro Dal Palu’ |
Abstract | The paper presents a new model for the low-level interpretation of single-channel images. The image is decomposed into a graph which captures a complete set of structural features. The description allows every edge location and its correct connectivity to be accurately identified. The key features of the method are: vector description of the edges, subpixel precision, and parallelism of the underlying algorithm. The methodology outperforms classical and state-of-the-art edge detectors at both the conceptual and experimental levels. It also enables graph-based algorithms for higher-level feature extraction. Any image processing pipeline can benefit from such results: e.g., controlled denoising, edge-preserving filtering, upsampling, compression, vector- and graph-based pattern matching, and neural network training. |
Tasks | Denoising, Edge Detection |
Published | 2019-04-21 |
URL | http://arxiv.org/abs/1904.09659v1 |
PDF | http://arxiv.org/pdf/1904.09659v1.pdf |
PWC | https://paperswithcode.com/paper/an-image-structure-model-for-exact-edge |
Repo | |
Framework | |
Clustering Images by Unmasking - A New Baseline
Title | Clustering Images by Unmasking - A New Baseline |
Authors | Mariana-Iuliana Georgescu, Radu Tudor Ionescu |
Abstract | We propose a novel agglomerative clustering method based on unmasking, a technique that was previously used for authorship verification of text documents and for abnormal event detection in videos. In order to join two clusters, we alternate between (i) training a binary classifier to distinguish between the samples from one cluster and the samples from the other cluster, and (ii) removing at each step the most discriminant features. The faster-decreasing accuracy rates of the intermediately-obtained classifiers indicate that the two clusters should be joined. To the best of our knowledge, this is the first work to apply unmasking in order to cluster images. We compare our method with k-means as well as a recent state-of-the-art clustering method. The empirical results indicate that our approach is able to improve performance for various (deep and shallow) feature representations and different tasks, such as handwritten digit recognition, texture classification and fine-grained object recognition. |
Tasks | Handwritten Digit Recognition, Object Recognition, Texture Classification |
Published | 2019-05-02 |
URL | https://arxiv.org/abs/1905.00773v1 |
PDF | https://arxiv.org/pdf/1905.00773v1.pdf |
PWC | https://paperswithcode.com/paper/clustering-images-by-unmasking-a-new-baseline |
Repo | |
Framework | |
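The merge criterion described above is concrete enough to sketch: train a linear classifier separating the two candidate clusters, record its accuracy, remove the most discriminant features, and repeat; a curve that degrades quickly suggests the clusters should be joined. The scikit-learn sketch below illustrates that general unmasking signal, not the paper's exact classifier, feature counts, or thresholds.

```python
# Sketch: the "unmasking" curve used to decide whether two clusters should merge.
import numpy as np
from sklearn.svm import LinearSVC

def unmasking_curve(A, B, n_rounds=5, k_drop=10, seed=0):
    """Accuracy of a linear classifier separating clusters A and B while the
    k most discriminant features are masked out at each round."""
    X = np.vstack([A, B])
    y = np.array([0] * len(A) + [1] * len(B))
    accs = []
    for _ in range(n_rounds):
        clf = LinearSVC(dual=True, max_iter=5000, random_state=seed).fit(X, y)
        accs.append(clf.score(X, y))                      # accuracy at this round
        top = np.argsort(np.abs(clf.coef_[0]))[-k_drop:]  # most discriminant features
        X[:, top] = 0.0                                   # mask them for the next round
    return accs

rng = np.random.default_rng(0)
same = rng.normal(size=(400, 100))
A, B = same[:200], same[200:]                # two halves of one underlying cluster
C = rng.normal(loc=1.5, size=(200, 100))     # a genuinely different cluster

print("same cluster :", np.round(unmasking_curve(A, B), 2))   # typically lower and falling
print("diff clusters:", np.round(unmasking_curve(A, C), 2))   # typically stays near 1.0
```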
PDQ & TMK + PDQF – A Test Drive of Facebook’s Perceptual Hashing Algorithms
Title | PDQ & TMK + PDQF – A Test Drive of Facebook’s Perceptual Hashing Algorithms |
Authors | Janis Dalins, Campbell Wilson, Douglas Boudry |
Abstract | Efficient and reliable automated detection of modified image and multimedia files has long been a challenge for law enforcement, compounded by the harm caused by repeated exposure to psychologically harmful materials. In August 2019 Facebook open-sourced their PDQ and TMK + PDQF algorithms for image and video similarity measurement, respectively. In this report, we review the algorithms’ performance on detecting commonly encountered transformations on real-world case data, sourced from contemporary investigations. We also provide a reference implementation to demonstrate the potential application and integration of such algorithms within existing law enforcement systems. |
Tasks | Video Similarity |
Published | 2019-12-16 |
URL | https://arxiv.org/abs/1912.07745v1 |
PDF | https://arxiv.org/pdf/1912.07745v1.pdf |
PWC | https://paperswithcode.com/paper/pdq-tmk-pdqf-a-test-drive-of-facebooks |
Repo | |
Framework | |
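PDQ produces a fixed-length 256-bit hash per image, and candidate matches are typically found by thresholding the Hamming distance between hashes. The sketch below shows only that comparison step; the hex digests are made up and the threshold is illustrative rather than an official operating point.

```python
# Sketch: comparing fixed-length perceptual hashes by Hamming distance.
def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Number of differing bits between two equal-length hex digests."""
    a, b = int(hash_a, 16), int(hash_b, 16)
    return bin(a ^ b).count("1")

def is_match(hash_a: str, hash_b: str, threshold: int = 31) -> bool:
    """Flag a likely match when the hashes differ in at most `threshold` bits."""
    return hamming_distance(hash_a, hash_b) <= threshold

original   = "f" * 64                    # stand-in 256-bit digest (64 hex chars)
re_encoded = "f" * 63 + "e"              # nearly identical content: 1 bit flipped
unrelated  = "0f" * 32                   # very different content

print(is_match(original, re_encoded))    # True  - small Hamming distance
print(is_match(original, unrelated))     # False - roughly half the bits differ
```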
Learning Likelihoods with Conditional Normalizing Flows
Title | Learning Likelihoods with Conditional Normalizing Flows |
Authors | Christina Winkler, Daniel Worrall, Emiel Hoogeboom, Max Welling |
Abstract | Normalizing Flows (NFs) are able to model complicated distributions p(y) with strong inter-dimensional correlations and high multimodality by transforming a simple base density p(z) through an invertible neural network under the change of variables formula. Such behavior is desirable in multivariate structured prediction tasks, where handcrafted per-pixel loss-based methods inadequately capture strong correlations between output dimensions. We present a study of conditional normalizing flows (CNFs), a class of NFs where the base density to output space mapping is conditioned on an input x, to model conditional densities p(y|x). CNFs are efficient in sampling and inference; they can be trained with a likelihood-based objective; and, being generative flows, they do not suffer from mode collapse or training instabilities. We provide an effective method to train continuous CNFs for binary problems and, in particular, we apply these CNFs to super-resolution and vessel segmentation tasks, demonstrating competitive performance on standard benchmark datasets in terms of likelihood and conventional metrics. |
Tasks | Structured Prediction, Super-Resolution |
Published | 2019-11-29 |
URL | https://arxiv.org/abs/1912.00042v1 |
PDF | https://arxiv.org/pdf/1912.00042v1.pdf |
PWC | https://paperswithcode.com/paper/learning-likelihoods-with-conditional-1 |
Repo | |
Framework | |
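A hedged PyTorch sketch of the kind of building block the abstract describes: an affine coupling layer whose scale and shift also depend on the conditioning input x, trained by maximizing log p(y|x) through the change-of-variables formula. This is a generic conditional coupling flow, not the paper's architecture; all dimensions are arbitrary.

```python
import torch
import torch.nn as nn

class ConditionalAffineCoupling(nn.Module):
    """One affine coupling layer whose scale/shift also depend on the input x."""
    def __init__(self, y_dim, x_dim, hidden=64):
        super().__init__()
        self.half = y_dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half + x_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (y_dim - self.half)),
        )

    def forward(self, y, x):
        y1, y2 = y[:, :self.half], y[:, self.half:]
        s, t = self.net(torch.cat([y1, x], dim=1)).chunk(2, dim=1)
        s = torch.tanh(s)                        # keep scales well-behaved
        z2 = y2 * torch.exp(s) + t               # invertible affine transform of y2
        log_det = s.sum(dim=1)                   # log|det J| of the coupling
        return torch.cat([y1, z2], dim=1), log_det

def conditional_nll(y, x, layers):
    """-log p(y|x) under a stack of couplings with a standard normal base."""
    z, log_det = y, 0.0
    for layer in layers:
        z, ld = layer(z, x)
        log_det = log_det + ld
    log_base = -0.5 * (z ** 2 + torch.log(torch.tensor(2 * torch.pi))).sum(dim=1)
    return -(log_base + log_det).mean()

y_dim, x_dim = 4, 3
layers = nn.ModuleList([ConditionalAffineCoupling(y_dim, x_dim) for _ in range(2)])
y, x = torch.randn(16, y_dim), torch.randn(16, x_dim)
loss = conditional_nll(y, x, layers)
loss.backward()                                  # trainable with a plain likelihood objective
print(float(loss))
```

A practical conditional flow would also interleave permutations (or swap the split) between couplings so that every dimension of y gets transformed.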
Detect Toxic Content to Improve Online Conversations
Title | Detect Toxic Content to Improve Online Conversations |
Authors | Deepshi Mediratta, Nikhil Oswal |
Abstract | Social media is filled with toxic content. The aim of this paper is to build a model that can detect insincere questions. We use the ‘Quora Insincere Questions Classification’ dataset for our analysis. The dataset is composed of sincere and insincere questions, with sincere questions forming the majority. The dataset is processed and analyzed using Python and libraries such as sklearn, numpy, pandas and keras. The dataset is converted to vector form using word embeddings such as GloVe, Wiki-news and TF-IDF. The imbalance in the dataset is handled by resampling techniques. We train and compare various machine learning and deep learning models to come up with the best results. Models discussed include SVM, Naive Bayes, GRU and LSTM. |
Tasks | Word Embeddings |
Published | 2019-10-29 |
URL | https://arxiv.org/abs/1911.01217v1 |
PDF | https://arxiv.org/pdf/1911.01217v1.pdf |
PWC | https://paperswithcode.com/paper/detect-toxic-content-to-improve-online |
Repo | |
Framework | |
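A compact scikit-learn sketch of the TF-IDF branch of such a pipeline: vectorize questions and train a linear SVM. The toy questions are invented, and class imbalance is handled here with class weights rather than the resampling techniques the paper discusses.

```python
# Sketch: TF-IDF features + linear SVM for insincere-question detection.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

questions = [
    "How do I learn Python quickly?",            # sincere
    "What are good books on linear algebra?",    # sincere
    "Why are people from X so stupid?",          # insincere
    "How can I improve my resume?",              # sincere
]
labels = [0, 0, 1, 0]                            # 1 = insincere

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),
    LinearSVC(class_weight="balanced", max_iter=5000),  # compensate for class imbalance
)
model.fit(questions, labels)
print(model.predict(["Why is group X so dumb?", "How do I start running?"]))
```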
A Bayesian Approach to Recurrence in Neural Networks
Title | A Bayesian Approach to Recurrence in Neural Networks |
Authors | Philip N. Garner, Sibo Tong |
Abstract | We begin by reiterating that common neural network activation functions have simple Bayesian origins. In this spirit, we go on to show that Bayes’s theorem also implies a simple recurrence relation; this leads to a Bayesian recurrent unit with a prescribed feedback formulation. We show that introduction of a context indicator leads to a variable feedback that is similar to the forget mechanism in conventional recurrent units. A similar approach leads to a probabilistic input gate. The Bayesian formulation leads naturally to the two pass algorithm of the Kalman smoother or forward-backward algorithm, meaning that inference naturally depends upon future inputs as well as past ones. Experiments on speech recognition confirm that the resulting architecture can perform as well as a bidirectional recurrent network with the same number of parameters as a unidirectional one. Further, when configured explicitly bidirectionally, the architecture can exceed the performance of a conventional bidirectional recurrence. |
Tasks | Speech Recognition |
Published | 2019-10-24 |
URL | https://arxiv.org/abs/1910.11247v2 |
PDF | https://arxiv.org/pdf/1910.11247v2.pdf |
PWC | https://paperswithcode.com/paper/a-bayesian-approach-to-recurrence-in-neural |
Repo | |
Framework | |
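The abstract points to the two-pass Kalman smoother / forward-backward algorithm as the natural inference scheme, meaning the posterior at each step depends on future inputs as well as past ones. Below is a minimal NumPy implementation of standard discrete forward-backward smoothing that illustrates exactly that two-pass structure; it is the textbook algorithm, not the paper's Bayesian recurrent unit.

```python
# Sketch: discrete forward-backward smoothing (two-pass inference).
import numpy as np

def forward_backward(trans, emit_lik, prior):
    """Smoothed state posteriors p(state_t | all observations).

    trans:    (S, S) transition matrix, trans[i, j] = p(s_t = j | s_{t-1} = i)
    emit_lik: (T, S) likelihood of each observation under each state
    prior:    (S,)   initial state distribution
    """
    T, S = emit_lik.shape
    alpha = np.zeros((T, S))                      # forward pass: past + present evidence
    alpha[0] = prior * emit_lik[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ trans) * emit_lik[t]
        alpha[t] /= alpha[t].sum()

    beta = np.ones((T, S))                        # backward pass: future evidence
    for t in range(T - 2, -1, -1):
        beta[t] = trans @ (emit_lik[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()

    gamma = alpha * beta                          # combine the two passes
    return gamma / gamma.sum(axis=1, keepdims=True)

trans = np.array([[0.9, 0.1], [0.2, 0.8]])
emit_lik = np.array([[0.9, 0.2], [0.8, 0.3], [0.1, 0.9], [0.2, 0.8]])
prior = np.array([0.5, 0.5])
print(np.round(forward_backward(trans, emit_lik, prior), 3))
```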
Solving dynamic multi-objective optimization problems via support vector machine
Title | Solving dynamic multi-objective optimization problems via support vector machine |
Authors | Min Jiang, Weizhen Hu, Liming Qiu, Minghui Shi, Kay Chen Tan |
Abstract | Dynamic Multi-objective Optimization Problems (DMOPs) refer to optimization problems whose objective functions change with time. Solving DMOPs implies that the Pareto Optimal Set (POS) at different moments can be accurately found, and this is a very difficult job due to the dynamics of the optimization problems. The POSs that have been obtained in the past can help us find the POS at the next moment more quickly and accurately. Therefore, in this paper we present a Support Vector Machine (SVM) based Dynamic Multi-Objective Evolutionary optimization Algorithm, called SVM-DMOEA. The algorithm uses the POSs that have been obtained to train an SVM and then takes the trained SVM to classify the solutions of the dynamic optimization problem at the next moment, and thus it is able to generate an initial population which consists of different individuals recognized by the trained SVM. The initial population can be fed into any population-based optimization algorithm, e.g., the Nondominated Sorting Genetic Algorithm II (NSGA-II), to get the POS at that moment. The experimental results show the validity of our proposed approach. |
Tasks | |
Published | 2019-10-19 |
URL | https://arxiv.org/abs/1910.08747v1 |
PDF | https://arxiv.org/pdf/1910.08747v1.pdf |
PWC | https://paperswithcode.com/paper/solving-dynamic-multi-objective-optimization-1 |
Repo | |
Framework | |
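The core step of SVM-DMOEA as described above is simple to sketch: label past solutions by POS membership, train an SVM, and use it to filter random candidates into an initial population for the next moment. The scikit-learn sketch below shows only that step with toy data; the labelling rule is a placeholder and the downstream NSGA-II stage is omitted.

```python
# Sketch: SVM trained on past solutions (POS vs dominated) used to seed the
# initial population for the next time step of a dynamic problem.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical decision vectors from the previous time step, with POS membership labels.
past_solutions = rng.uniform(-1, 1, size=(200, 5))
past_labels = (np.linalg.norm(past_solutions, axis=1) < 0.8).astype(int)  # toy POS region

classifier = SVC(kernel="rbf", gamma="scale").fit(past_solutions, past_labels)

# Generate candidates for the next moment and keep those the SVM recognises as promising.
candidates = rng.uniform(-1, 1, size=(1000, 5))
initial_population = candidates[classifier.predict(candidates) == 1]

print(f"kept {len(initial_population)} of {len(candidates)} candidates as the initial population")
```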
A Method to Model Conditional Distributions with Normalizing Flows
Title | A Method to Model Conditional Distributions with Normalizing Flows |
Authors | Zhisheng Xiao, Qing Yan, Yali Amit |
Abstract | In this work, we investigate the use of normalizing flows to model conditional distributions. In particular, we use our proposed method to analyze inverse problems with invertible neural networks by maximizing the posterior likelihood. Our method uses only a single loss and is easy to train. This is an improvement on the previous method that solves similar inverse problems with invertible neural networks but involves a combination of several loss terms with ad-hoc weighting. In addition, our method provides a natural framework to incorporate conditioning in normalizing flows, and therefore we can train an invertible network to perform conditional generation. We analyze our method and perform a careful comparison with previous approaches. Simple experiments show the effectiveness of our method, and more comprehensive experimental evaluations are underway. |
Tasks | |
Published | 2019-11-05 |
URL | https://arxiv.org/abs/1911.02052v1 |
PDF | https://arxiv.org/pdf/1911.02052v1.pdf |
PWC | https://paperswithcode.com/paper/a-method-to-model-conditional-distributions |
Repo | |
Framework | |
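The abstract's key claim is that a single likelihood loss suffices. A minimal PyTorch sketch using the simplest possible conditional flow (a conditional affine map, i.e. a conditional Gaussian) shows the one-term objective and how inverting the flow yields several plausible solutions x for a given observation y. The paper's actual flows are more expressive than this, and the toy data here is random.

```python
# Sketch: a conditional flow trained with a single NLL loss, then inverted to
# sample multiple plausible unknowns x for one observation y.
import torch
import torch.nn as nn

class TinyConditionalFlow(nn.Module):
    def __init__(self, x_dim, y_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(y_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 2 * x_dim))

    def forward(self, x, y):                     # x -> z, plus log|det J|
        mu, log_sigma = self.net(y).chunk(2, dim=1)
        z = (x - mu) * torch.exp(-log_sigma)
        return z, -log_sigma.sum(dim=1)

    def sample(self, y, n):                      # invert the flow: z -> x
        mu, log_sigma = self.net(y).chunk(2, dim=1)
        z = torch.randn(n, mu.size(1))
        return mu + torch.exp(log_sigma) * z

flow = TinyConditionalFlow(x_dim=2, y_dim=3)
opt = torch.optim.Adam(flow.parameters(), lr=1e-3)

x, y = torch.randn(256, 2), torch.randn(256, 3)  # toy (unknown, observation) pairs
for _ in range(100):
    z, log_det = flow(x, y)
    nll = (0.5 * (z ** 2).sum(dim=1) - log_det).mean()   # the single loss term
    opt.zero_grad()
    nll.backward()
    opt.step()

print(flow.sample(y[:1], n=5))                   # several plausible x for one observation y
```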