Paper Group ANR 426
Superpixel Segmentation with Fully Convolutional Networks
Title | Superpixel Segmentation with Fully Convolutional Networks |
Authors | Fengting Yang, Qian Sun, Hailin Jin, Zihan Zhou |
Abstract | In computer vision, superpixels have been widely used as an effective way to reduce the number of image primitives for subsequent processing. But only a few attempts have been made to incorporate them into deep neural networks. One main reason is that the standard convolution operation is defined on regular grids and becomes inefficient when applied to superpixels. Inspired by an initialization strategy commonly adopted by traditional superpixel algorithms, we present a novel method that employs a simple fully convolutional network to predict superpixels on a regular image grid. Experimental results on benchmark datasets show that our method achieves state-of-the-art superpixel segmentation performance while running at about 50fps. Based on the predicted superpixels, we further develop a downsampling/upsampling scheme for deep networks with the goal of generating high-resolution outputs for dense prediction tasks. Specifically, we modify a popular network architecture for stereo matching to simultaneously predict superpixels and disparities. We show that improved disparity estimation accuracy can be obtained on public datasets. |
Tasks | Disparity Estimation, Stereo Matching |
Published | 2020-03-29 |
URL | https://arxiv.org/abs/2003.12929v1 |
https://arxiv.org/pdf/2003.12929v1.pdf | |
PWC | https://paperswithcode.com/paper/superpixel-segmentation-with-fully |
Repo | |
Framework | |
A Robust Real-Time Computing-based Environment Sensing System for Intelligent Vehicle
Title | A Robust Real-Time Computing-based Environment Sensing System for Intelligent Vehicle |
Authors | Qiwei Xie, Qian Long, Liming Zhang, Zhao Sun |
Abstract | For intelligent vehicles, sensing the 3D environment is the first and a crucial step. In this paper, we build a real-time advanced driver assistance system on a low-power mobile platform. The system integrates multiple schemes: it combines a stereo matching algorithm with a machine-learning-based obstacle detection approach and takes advantage of the distributed computing capability of a mobile platform with a GPU and CPUs. First, we propose a multi-scale fast MPV (Multi-Path-Viterbi) stereo matching algorithm, which generates robust and accurate disparity maps. Then a machine learning method based on the fusion of monocular and binocular information is applied to detect obstacles. We also propose an automatic fast calibration mechanism based on Zhang’s calibration method. Finally, distributed computing and careful data-flow programming are applied to ensure the operational efficiency of the system. The experimental results show that the system achieves robust and accurate real-time environment perception for intelligent vehicles and can be used directly in commercial real-time intelligent driving applications. |
Tasks | Calibration, Stereo Matching |
Published | 2020-01-27 |
URL | https://arxiv.org/abs/2001.09678v1 |
https://arxiv.org/pdf/2001.09678v1.pdf | |
PWC | https://paperswithcode.com/paper/a-robust-real-time-computing-based |
Repo | |
Framework | |
Three Approaches for Personalization with Applications to Federated Learning
Title | Three Approaches for Personalization with Applications to Federated Learning |
Authors | Yishay Mansour, Mehryar Mohri, Jae Ro, Ananda Theertha Suresh |
Abstract | The standard objective in machine learning is to train a single model for all users. However, in many learning scenarios, such as cloud computing and federated learning, it is possible to learn one personalized model per user. In this work, we present a systematic learning-theoretic study of personalization. We propose and analyze three approaches: user clustering, data interpolation, and model interpolation. For all three approaches, we provide learning-theoretic guarantees and efficient algorithms for which we also demonstrate the performance empirically. All of our algorithms are model agnostic and work for any hypothesis class. |
Tasks | |
Published | 2020-02-25 |
URL | https://arxiv.org/abs/2002.10619v1 |
https://arxiv.org/pdf/2002.10619v1.pdf | |
PWC | https://paperswithcode.com/paper/three-approaches-for-personalization-with |
Repo | |
Framework | |
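Of the three approaches named in the abstract above, model interpolation is the easiest to illustrate. The following is a minimal sketch, assuming the personalized predictor is a convex combination of a shared global model and a per-user local model. The function name `interpolate`, the mixing weight `lam`, and the toy models are hypothetical and are not taken from the paper.

```python
from typing import Callable, Sequence

Model = Callable[[Sequence[float]], float]


def interpolate(global_model: Model, local_model: Model, lam: float) -> Model:
    """Return the predictor x -> lam * local(x) + (1 - lam) * global(x)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")

    def personalized(x: Sequence[float]) -> float:
        return lam * local_model(x) + (1.0 - lam) * global_model(x)

    return personalized


# Toy models (assumptions for illustration only).
def global_model(x: Sequence[float]) -> float:
    return 2.0 * x[0]


def local_model(x: Sequence[float]) -> float:
    return 2.0 * x[0] + 1.0


h = interpolate(global_model, local_model, lam=0.25)
prediction = h([3.0])  # mixes the local output 7.0 with the global output 6.0
```

Setting `lam` to 0 recovers the single shared model and setting it to 1 recovers a purely local model; how the weight is chosen per user is left open here.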
Finnish Language Modeling with Deep Transformer Models
Title | Finnish Language Modeling with Deep Transformer Models |
Authors | Abhilash Jain, Aku Ruohe, Stig-Arne Grönroos, Mikko Kurimo |
Abstract | Transformers have recently taken center stage in language modeling after LSTMs were considered the dominant model architecture for a long time. In this project, we investigate the performance of two Transformer architectures, BERT and Transformer-XL, on the language modeling task. We use a sub-word model setting with the Finnish language and compare it to the previous state-of-the-art (SOTA) LSTM model. BERT achieves a pseudo-perplexity score of 14.5, which is the first such measure achieved as far as we know. Transformer-XL improves the perplexity score to 73.58, which is 27% better than the LSTM model. |
Tasks | Language Modelling |
Published | 2020-03-14 |
URL | https://arxiv.org/abs/2003.11562v2 |
https://arxiv.org/pdf/2003.11562v2.pdf | |
PWC | https://paperswithcode.com/paper/finnish-language-modeling-with-deep |
Repo | |
Framework | |
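The perplexity figures in the abstract above can be related to per-token probabilities with a short calculation. This is a sketch under stated assumptions: perplexity is taken as the exponential of the negative mean natural-log probability per token, and "27% better" is read as a 27% relative reduction in perplexity. The abstract does not state the LSTM score, so `implied_lstm_ppl` is an inference from that reading, not a reported result.

```python
import math


def perplexity(token_log_probs):
    """exp of the negative mean natural-log probability per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def relative_reduction(baseline, new):
    """Fraction by which `new` is lower than `baseline`."""
    return (baseline - new) / baseline


# Toy example: four tokens, each predicted with probability 0.25.
toy_ppl = perplexity([math.log(0.25)] * 4)

# Assumption: "27% better" means a 27% relative reduction in perplexity.
implied_lstm_ppl = 73.58 / (1.0 - 0.27)
```

Applying the same `perplexity` function to log-probabilities obtained by masking one token at a time gives a pseudo-perplexity, which is why the BERT score of 14.5 is not directly comparable with the Transformer-XL score.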
A General Approach for Using Deep Neural Network for Digital Watermarking
Title | A General Approach for Using Deep Neural Network for Digital Watermarking |
Authors | Yurui Ming, Weiping Ding, Zehong Cao, Chin-Teng Lin |
Abstract | Technologies of the Internet of Things (IoT) make it easy to acquire digital content such as images at massive scale. However, privacy and legal considerations still create a need for intellectual content protection. In this paper, we propose a general deep neural network (DNN) based watermarking method to fulfill this goal. Instead of training a neural network to protect a specific image, we train on an image set and use the trained model to protect a distinct test image set in bulk. Evaluations from both subjective and objective aspects confirm the superiority and practicability of our proposed method. To demonstrate the robustness of this general neural watermarking mechanism, commonly used manipulations are applied to the watermarked image to examine the corresponding extracted watermark, which still retains sufficient recognizable traits. To the best of our knowledge, we are the first to propose a general way to perform watermarking using DNN. Considering its performance and economy, we conclude that subsequent studies that generalize our work on utilizing DNN for intellectual content protection are a promising research direction. |
Tasks | |
Published | 2020-03-08 |
URL | https://arxiv.org/abs/2003.12428v1 |
https://arxiv.org/pdf/2003.12428v1.pdf | |
PWC | https://paperswithcode.com/paper/a-general-approach-for-using-deep-neural |
Repo | |
Framework | |
A Financial Service Chatbot based on Deep Bidirectional Transformers
Title | A Financial Service Chatbot based on Deep Bidirectional Transformers |
Authors | Shi Yu, Yuxin Chen, Hussain Zaidi |
Abstract | We develop a chatbot using Deep Bidirectional Transformer models (BERT) to handle client questions in financial investment customer service. The bot can recognize 381 intents, decides when to say “I don’t know”, and escalates irrelevant or uncertain questions to human operators. Our main novel contribution is the discussion of uncertainty measures for BERT, where three different approaches are systematically compared on real problems. We investigated two uncertainty metrics, information entropy and variance of dropout sampling in BERT, followed by mixed-integer programming to optimize decision thresholds. Another novel contribution is the use of BERT as a language model in automatic spelling correction. Inputs with accidental spelling errors can significantly decrease intent classification performance. The proposed approach combines probabilities from a masked language model and word edit distances to find the best corrections for misspelled words. The chatbot and the entire conversational AI system are developed using open-source tools and deployed within our company’s intranet. The proposed approach can be useful for industries seeking similar in-house solutions in their specific business domains. We share all our code and a sample chatbot built on a public dataset on GitHub. |
Tasks | Chatbot, Intent Classification, Language Modelling, Spelling Correction |
Published | 2020-02-17 |
URL | https://arxiv.org/abs/2003.04987v1 |
https://arxiv.org/pdf/2003.04987v1.pdf | |
PWC | https://paperswithcode.com/paper/a-financial-service-chatbot-based-on-deep |
Repo | |
Framework | |
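The entropy-based escalation described in the abstract above can be sketched in a few lines. This is an illustration only, assuming the classifier outputs a probability distribution over intents and that a question is escalated when the Shannon entropy of that distribution exceeds a threshold. The names `route` and `THRESHOLD` and the hand-picked threshold value are hypothetical; the abstract states that thresholds are optimized with mixed-integer programming, which is not reproduced here.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def route(probs, threshold):
    """Return the predicted intent index, or None to escalate to a human."""
    if entropy(probs) > threshold:
        return None
    return max(range(len(probs)), key=probs.__getitem__)


THRESHOLD = 0.8  # hypothetical value, chosen by hand for this toy example

confident = [0.94, 0.03, 0.03]  # peaked distribution, low entropy
uncertain = [0.40, 0.35, 0.25]  # flat distribution, high entropy

answer_confident = route(confident, THRESHOLD)  # index of the top intent
answer_uncertain = route(uncertain, THRESHOLD)  # None, so escalate
```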
What Would You Ask the Machine Learning Model? Identification of User Needs for Model Explanations Based on Human-Model Conversations
Title | What Would You Ask the Machine Learning Model? Identification of User Needs for Model Explanations Based on Human-Model Conversations |
Authors | Michał Kuźba, Przemysław Biecek |
Abstract | Recently we have seen a rising number of methods in the field of eXplainable Artificial Intelligence. To our surprise, their development is driven by model developers rather than by a study of the needs of human end users. To answer the question “What would a human operator like to ask the ML model?” we propose a conversational system explaining decisions of the predictive model. In this experiment, we implement a chatbot called dr_ant and train a model predicting survival odds on Titanic. People can talk to dr_ant about the model to understand the rationale behind its predictions. Having collected a corpus of 1000+ dialogues, we analyse the most common types of questions that users would like to ask. To our knowledge, it is the first study of the needs of human operators in the context of conversations with an ML model. It is also the first study that uses a conversational system for interactive exploration of a predictive model trained on tabular data. |
Tasks | Chatbot |
Published | 2020-02-07 |
URL | https://arxiv.org/abs/2002.05674v1 |
https://arxiv.org/pdf/2002.05674v1.pdf | |
PWC | https://paperswithcode.com/paper/what-would-you-ask-the-machine-learning-model |
Repo | |
Framework | |
Tree Index: A New Cluster Evaluation Technique
Title | Tree Index: A New Cluster Evaluation Technique |
Authors | A. H. Beg, Md Zahidul Islam, Vladimir Estivill-Castro |
Abstract | We introduce a cluster evaluation technique called Tree Index. Tree Index aims to describe the structural information of a clustering rather than follow the quantitative format of cluster-quality indexes (where the representation power of a clustering is some cumulative error similar to vector quantization). Tree Index finds margins among clusters for easy learning without the complications of Minimum Description Length. It produces a decision tree from the clustered data set, using the cluster identifiers as labels, and combines the entropy of each leaf with its depth. Intuitively, a shorter tree with pure leaves generalizes the data well (the clusters are easy to learn because they are well separated), so the labels are meaningful clusters. If the clustering algorithm does not separate the data well, trees learned from its results will be large and too detailed. We show that, on clustering results (obtained by various techniques) on a brain dataset, Tree Index discriminates between reasonable and non-sensible clusters. We confirm the effectiveness of Tree Index through graphical visualizations. Tree Index rates the sensible solutions higher than the non-sensible ones, while existing cluster-quality indexes fail to do so. |
Tasks | Quantization |
Published | 2020-03-24 |
URL | https://arxiv.org/abs/2003.10841v1 |
https://arxiv.org/pdf/2003.10841v1.pdf | |
PWC | https://paperswithcode.com/paper/tree-index-a-new-cluster-evaluation-technique |
Repo | |
Framework | |
Generator From Edges: Reconstruction of Facial Images
Title | Generator From Edges: Reconstruction of Facial Images |
Authors | Nao Takano, Gita Alaghband |
Abstract | Applications that involve supervised training require paired images. Researchers of single image super-resolution (SISR) create such images by artificially generating blurry input images from the corresponding ground truth. Similarly, we can create paired images with the Canny edge detector. We propose Generator From Edges (GFE) [Figure 2]. Our aim is to determine the best architecture for GFE, along with a review of perceptual loss [1, 2]. To this end, we conducted three experiments. First, we explored the effects of the adversarial loss often used in SISR. In particular, we found that it is not an essential component of a perceptual loss. Eliminating the adversarial loss leads to a more effective architecture in terms of hardware resources. It also means that problems pertaining to generative adversarial networks (GANs) [3], such as mode collapse, need not be considered. Second, we reexamined VGG loss and found that the mid-layers yield the best results. By extracting the full potential of VGG loss, the overall performance of perceptual loss improves significantly. Third, based on the findings of the first two experiments, we reevaluated the dense network to construct GFE. Using GFE as an intermediate process, reconstructing a facial image from a pencil sketch can become an easy task. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2020-02-16 |
URL | https://arxiv.org/abs/2002.06682v2 |
https://arxiv.org/pdf/2002.06682v2.pdf | |
PWC | https://paperswithcode.com/paper/generator-from-edges-reconstruction-of-facial |
Repo | |
Framework | |
Automated extraction of mutual independence patterns using Bayesian comparison of partition models
Title | Automated extraction of mutual independence patterns using Bayesian comparison of partition models |
Authors | Guillaume Marrelec, Alain Giron |
Abstract | Mutual independence is a key concept in statistics that characterizes the structural relationships between variables. Existing methods to investigate mutual independence rely on the definition of two competing models, one being nested into the other and used to generate a null distribution for a statistic of interest, usually under the asymptotic assumption of large sample size. As such, these methods have a very restricted scope of application. In the present manuscript, we propose to change the investigation of mutual independence from a hypothesis-driven task that can only be applied in very specific cases to a blind and automated search within patterns of mutual independence. To this end, we treat the issue as one of model comparison that we solve in a Bayesian framework. We show the relationship between such an approach and existing methods in the case of multivariate normal distributions as well as cross-classified multinomial distributions. We propose a general Markov chain Monte Carlo (MCMC) algorithm to numerically approximate the posterior distribution on the space of all patterns of mutual independence. The relevance of the method is demonstrated on synthetic data as well as two real datasets, showing the unique insight provided by this approach. |
Tasks | |
Published | 2020-01-15 |
URL | https://arxiv.org/abs/2001.05407v1 |
https://arxiv.org/pdf/2001.05407v1.pdf | |
PWC | https://paperswithcode.com/paper/automated-extraction-of-mutual-independence |
Repo | |
Framework | |
Virtual KITTI 2
Title | Virtual KITTI 2 |
Authors | Yohann Cabon, Naila Murray, Martin Humenberger |
Abstract | This paper introduces an updated version of the well-known Virtual KITTI dataset which consists of 5 sequence clones from the KITTI tracking benchmark. In addition, the dataset provides different variants of these sequences such as modified weather conditions (e.g. fog, rain) or modified camera configurations (e.g. rotated by 15 degrees). For each sequence, we provide multiple sets of images containing RGB, depth, class segmentation, instance segmentation, flow, and scene flow data. Camera parameters and poses as well as vehicle locations are available as well. In order to showcase some of the dataset’s capabilities, we ran multiple relevant experiments using state-of-the-art algorithms from the field of autonomous driving. The dataset is available for download at https://europe.naverlabs.com/Research/Computer-Vision/Proxy-Virtual-Worlds. |
Tasks | Autonomous Driving, Instance Segmentation, Semantic Segmentation |
Published | 2020-01-29 |
URL | https://arxiv.org/abs/2001.10773v1 |
https://arxiv.org/pdf/2001.10773v1.pdf | |
PWC | https://paperswithcode.com/paper/virtual-kitti-2 |
Repo | |
Framework | |
FQuAD: French Question Answering Dataset
Title | FQuAD: French Question Answering Dataset |
Authors | Martin d’Hoffschmidt, Maxime Vidal, Wacim Belblidia, Tom Brendlé |
Abstract | Recent advances in the field of language modeling have improved state-of-the-art results on many Natural Language Processing tasks. Among them, the Machine Reading Comprehension task has made significant progress. However, most of the results are reported in English since labeled resources available in other languages, such as French, remain scarce. In the present work, we introduce the French Question Answering Dataset (FQuAD). FQuAD is a native French reading comprehension dataset that consists of 25,000+ questions on a set of Wikipedia articles. A baseline model is trained which achieves an F1 score of 88.0% and an exact match ratio of 77.9% on the test set. The dataset is made freely available at https://fquad.illuin.tech. |
Tasks | Language Modelling, Machine Reading Comprehension, Question Answering, Reading Comprehension |
Published | 2020-02-14 |
URL | https://arxiv.org/abs/2002.06071v1 |
https://arxiv.org/pdf/2002.06071v1.pdf | |
PWC | https://paperswithcode.com/paper/fquad-french-question-answering-dataset |
Repo | |
Framework | |
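The F1 and exact match figures quoted in the abstract above are the usual metrics for extractive question answering. The sketch below shows one common way to compute them per question, assuming a simplified normalization (lowercasing, removal of ASCII punctuation, whitespace tokenization). The exact normalization used for FQuAD, for example its handling of French articles, is not given in the abstract, so this is not the official evaluation script.

```python
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop ASCII punctuation, and split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return text.split()


def exact_match(prediction, truth):
    """1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1_score(prediction, truth):
    """Token-level F1 between a predicted and a reference answer."""
    pred, gold = normalize(prediction), normalize(truth)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred) & Counter(gold)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


em = exact_match("Paris.", "paris")
f1 = f1_score("la ville de Paris", "Paris")
```

Dataset-level scores are typically the mean of these per-question values, taking the best score over all reference answers when a question has several.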
A Boolean Task Algebra for Reinforcement Learning
Title | A Boolean Task Algebra for Reinforcement Learning |
Authors | Geraud Nangue Tasse, Steven James, Benjamin Rosman |
Abstract | We propose a framework for defining a Boolean algebra over the space of tasks. This allows us to formulate new tasks in terms of the negation, disjunction and conjunction of a set of base tasks. We then show that by learning goal-oriented value functions and restricting the transition dynamics of the tasks, an agent can solve these new tasks with no further learning. We prove that by composing these value functions in specific ways, we immediately recover the optimal policies for all tasks expressible under the Boolean algebra. We verify our approach in two domains, including a high-dimensional video game environment requiring function approximation, where an agent first learns a set of base skills, and then composes them to solve a super-exponential number of new tasks. |
Tasks | |
Published | 2020-01-06 |
URL | https://arxiv.org/abs/2001.01394v1 |
https://arxiv.org/pdf/2001.01394v1.pdf | |
PWC | https://paperswithcode.com/paper/a-boolean-task-algebra-for-reinforcement-1 |
Repo | |
Framework | |
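The composition described in the abstract above can be illustrated with a toy sketch. The abstract does not state the operators, so the following assumes that disjunction is an element-wise maximum, conjunction an element-wise minimum, and negation a reflection between an upper-bound and a lower-bound value function. The dictionaries stand in for learned goal-oriented value functions over four hypothetical goals; all names and values are illustrative.

```python
def q_or(q1, q2):
    """Disjunction: element-wise maximum (assumed operator)."""
    return {goal: max(q1[goal], q2[goal]) for goal in q1}


def q_and(q1, q2):
    """Conjunction: element-wise minimum (assumed operator)."""
    return {goal: min(q1[goal], q2[goal]) for goal in q1}


def q_not(q, q_max, q_min):
    """Negation: reflect q between the upper and lower bounds (assumed)."""
    return {goal: (q_max[goal] + q_min[goal]) - q[goal] for goal in q}


# Toy values: 1.0 if a task rewards reaching the goal, 0.0 otherwise.
Q_MAX = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0}
Q_MIN = {"A": 0.0, "B": 0.0, "C": 0.0, "D": 0.0}
task1 = {"A": 1.0, "B": 1.0, "C": 0.0, "D": 0.0}
task2 = {"A": 1.0, "B": 0.0, "C": 1.0, "D": 0.0}

# Exclusive-or built from the three base operators, with no further learning.
both = q_and(task1, task2)
xor = q_and(q_or(task1, task2), q_not(both, Q_MAX, Q_MIN))
best_goals = sorted(goal for goal, value in xor.items() if value == 1.0)
```

With these binary toy values the three operators reduce to ordinary Boolean logic, which makes the results easy to check by hand.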
A Corpus of Adpositional Supersenses for Mandarin Chinese
Title | A Corpus of Adpositional Supersenses for Mandarin Chinese |
Authors | Siyao Peng, Yang Liu, Yilun Zhu, Austin Blodgett, Yushi Zhao, Nathan Schneider |
Abstract | Adpositions are frequent markers of semantic relations, but they are highly ambiguous and vary significantly from language to language. Moreover, there is a dearth of annotated corpora for investigating the cross-linguistic variation of adposition semantics, or for building multilingual disambiguation systems. This paper presents a corpus in which all adpositions have been semantically annotated in Mandarin Chinese; to the best of our knowledge, this is the first Chinese corpus to be broadly annotated with adposition semantics. Our approach adapts a framework that defined a general set of supersenses according to ostensibly language-independent semantic criteria, though its development focused primarily on English prepositions (Schneider et al., 2018). We find that the supersense categories are well-suited to Chinese adpositions despite syntactic differences from English. On a Mandarin translation of The Little Prince, we achieve high inter-annotator agreement and analyze semantic correspondences of adposition tokens in bitext. |
Tasks | |
Published | 2020-03-18 |
URL | https://arxiv.org/abs/2003.08437v1 |
https://arxiv.org/pdf/2003.08437v1.pdf | |
PWC | https://paperswithcode.com/paper/a-corpus-of-adpositional-supersenses-for |
Repo | |
Framework | |
Compositional Convolutional Neural Networks: A Deep Architecture with Innate Robustness to Partial Occlusion
Title | Compositional Convolutional Neural Networks: A Deep Architecture with Innate Robustness to Partial Occlusion |
Authors | Adam Kortylewski, Ju He, Qing Liu, Alan Yuille |
Abstract | Recent work has shown that deep convolutional neural networks (DCNNs) do not generalize well under partial occlusion. Inspired by the success of compositional models at classifying partially occluded objects, we propose to integrate compositional models and DCNNs into a unified deep model with innate robustness to partial occlusion. We term this architecture Compositional Convolutional Neural Network. In particular, we propose to replace the fully connected classification head of a DCNN with a differentiable compositional model. The generative nature of the compositional model enables it to localize occluders and subsequently focus on the non-occluded parts of the object. We conduct classification experiments on artificially occluded images as well as real images of partially occluded objects from the MS-COCO dataset. The results show that DCNNs do not classify occluded objects robustly, even when trained with data that is strongly augmented with partial occlusions. Our proposed model outperforms standard DCNNs by a large margin at classifying partially occluded objects, even when it has not been exposed to occluded objects during training. Additional experiments demonstrate that CompositionalNets can also localize the occluders accurately, despite being trained with class labels only. |
Tasks | |
Published | 2020-03-10 |
URL | https://arxiv.org/abs/2003.04490v1 |
https://arxiv.org/pdf/2003.04490v1.pdf | |
PWC | https://paperswithcode.com/paper/compositional-convolutional-neural-networks-a |
Repo | |
Framework | |