Paper Group ANR 121
Submodular Streaming in All its Glory: Tight Approximation, Minimum Memory and Low Adaptive Complexity
Title | Submodular Streaming in All its Glory: Tight Approximation, Minimum Memory and Low Adaptive Complexity |
Authors | Ehsan Kazemi, Marko Mitrovic, Morteza Zadimoghaddam, Silvio Lattanzi, Amin Karbasi |
Abstract | Streaming algorithms are generally judged by the quality of their solution, memory footprint, and computational complexity. In this paper, we study the problem of maximizing a monotone submodular function in the streaming setting with a cardinality constraint $k$. We first propose Sieve-Streaming++, which requires just one pass over the data, keeps only $O(k)$ elements and achieves the tight $(1/2)$-approximation guarantee. The best previously known streaming algorithms either achieve a suboptimal $(1/4)$-approximation with $\Theta(k)$ memory or the optimal $(1/2)$-approximation with $O(k\log k)$ memory. Next, we show that by buffering a small fraction of the stream and applying a careful filtering procedure, one can heavily reduce the number of adaptive computational rounds, thus substantially lowering the computational complexity of Sieve-Streaming++. We then generalize our results to the more challenging multi-source streaming setting. We show how one can achieve the tight $(1/2)$-approximation guarantee with $O(k)$ shared memory while minimizing not only the required rounds of computations but also the total number of communicated bits. Finally, we demonstrate the efficiency of our algorithms on real-world data summarization tasks for multi-source streams of tweets and of YouTube videos. |
Tasks | Data Summarization |
Published | 2019-05-02 |
URL | https://arxiv.org/abs/1905.00948v2 |
https://arxiv.org/pdf/1905.00948v2.pdf | |
PWC | https://paperswithcode.com/paper/submodular-streaming-in-all-its-glory-tight |
Repo | |
Framework | |
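The thresholding idea behind sieve-style streaming algorithms can be illustrated with a minimal sketch. This is not Sieve-Streaming++ itself: it uses a single fixed threshold `tau`, and the function names, the set-coverage objective, and the toy stream are all assumptions made for illustration. An element is kept only if its marginal gain is at least `tau` and fewer than `k` elements are stored, so memory stays at `k` elements and the stream is read once.

```python
from typing import Callable, Hashable, Iterable, List, Set


def coverage(selected: List[Set[Hashable]]) -> int:
    """Toy monotone submodular objective: number of distinct items covered."""
    covered: Set[Hashable] = set()
    for s in selected:
        covered |= s
    return len(covered)


def threshold_stream(
    stream: Iterable[Set[Hashable]],
    f: Callable[[List[Set[Hashable]]], int],
    k: int,
    tau: float,
) -> List[Set[Hashable]]:
    """One pass over the stream; keep an element if its marginal gain >= tau.

    `tau` is a hypothetical fixed threshold chosen by the caller.
    """
    solution: List[Set[Hashable]] = []
    value = f(solution)
    for element in stream:
        if len(solution) >= k:
            break  # cardinality constraint reached
        gain = f(solution + [element]) - value
        if gain >= tau:
            solution.append(element)
            value += gain
    return solution


if __name__ == "__main__":
    stream = [{1, 2, 3}, {3}, {4, 5}, {1}, {6, 7, 8}]
    picked = threshold_stream(stream, coverage, k=2, tau=2)
    print(picked, coverage(picked))  # [{1, 2, 3}, {4, 5}] 5
```

A single threshold gives no approximation guarantee on its own, because a good value of `tau` depends on the unknown optimum; that is why this sketch should be read only as an illustration of the filtering step.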
Naver Labs Europe’s Systems for the Document-Level Generation and Translation Task at WNGT 2019
Title | Naver Labs Europe’s Systems for the Document-Level Generation and Translation Task at WNGT 2019 |
Authors | Fahimeh Saleh, Alexandre Bérard, Ioan Calapodescu, Laurent Besacier |
Abstract | Recently, neural models led to significant improvements in both machine translation (MT) and natural language generation (NLG) tasks. However, generation of long descriptive summaries conditioned on structured data remains an open challenge. Likewise, MT that goes beyond sentence-level context is still an open issue (e.g., document-level MT or MT with metadata). To address these challenges, we propose to leverage data from both tasks and do transfer learning between MT, NLG, and MT with source-side metadata (MT+NLG). First, we train document-based MT systems with large amounts of parallel data. Then, we adapt these models to pure NLG and MT+NLG tasks by fine-tuning with smaller amounts of domain-specific data. This end-to-end NLG approach, without data selection and planning, outperforms the previous state of the art on the Rotowire NLG task. We participated in the “Document Generation and Translation” task at WNGT 2019, and ranked first in all tracks. |
Tasks | Machine Translation, Text Generation, Transfer Learning |
Published | 2019-10-31 |
URL | https://arxiv.org/abs/1910.14539v1 |
https://arxiv.org/pdf/1910.14539v1.pdf | |
PWC | https://paperswithcode.com/paper/naver-labs-europes-systems-for-the-document |
Repo | |
Framework | |
Novel Long Short-Term Memory Cell Architectures: Application to Light Field Face Recognition
Title | Novel Long Short-Term Memory Cell Architectures: Application to Light Field Face Recognition |
Authors | Alireza Sepas-Moghaddam, Fernando Pereira, Paulo Lobato Correia |
Abstract | With the emergence of lenslet light field cameras able to capture rich spatio-angular information from multiple directions, new frontiers in visual recognition performance have been opened. Since multiple 2D viewpoint images can be rendered from a light field, those multiple images, or descriptions extracted from them, can be organized as a pseudo-video sequence so that an LSTM network learns a model describing that sequence. This paper proposes three novel LSTM cell architectures able to create richer and more effective description models for visual recognition tasks, by jointly learning from two simultaneously acquired sequences. The novel key idea is to jointly process two sequences of rendered 2D images or their descriptions, e.g. representing the scene's horizontal and vertical parallaxes, and thus with some specific dependency between them that would not be exploited otherwise. To show the efficiency of the novel LSTM cell architectures, these architectures have been integrated into an end-to-end deep learning face recognition framework, which creates this joint spatio-angular light field description. The LSTM network, using the proposed LSTM cell architectures, receives as input a sequence of VGG-Face descriptions computed for parallax-related horizontal and vertical 2D face viewpoint images, derived from the input light field image. A comprehensive evaluation in terms of recognition accuracy, computational complexity, memory efficiency, and parallelization ability has been performed with the IST EURECOM LFFD database using three new and challenging evaluation protocols. The obtained results show the superior performance of the proposed face recognition solutions adopting the novel LSTM cell architectures over ten state-of-the-art benchmarking recognition solutions. |
Tasks | Face Recognition |
Published | 2019-05-11 |
URL | https://arxiv.org/abs/1905.04421v1 |
https://arxiv.org/pdf/1905.04421v1.pdf | |
PWC | https://paperswithcode.com/paper/novel-long-short-term-memory-cell |
Repo | |
Framework | |
Automated Playtesting of Matching Tile Games
Title | Automated Playtesting of Matching Tile Games |
Authors | Luvneesh Mugrai, Fernando de Mesentier Silva, Christoffer Holmgård, Julian Togelius |
Abstract | Matching tile games are an extremely popular game genre. Arguably the most popular iteration, Match-3 games, are simple-to-understand puzzle games, making them great benchmarks for research. In this paper, we propose developing different procedural personas for Match-3 games in order to approximate different human playstyles to create an automated playtesting system. The procedural personas are realized through evolving the utility function for the Monte Carlo Tree Search agent. We compare the performance and results of the evolution agents with the standard Vanilla Monte Carlo Tree Search implementation as well as with a random move-selection agent. We then observe the impacts on both the game’s design and the game design process. Lastly, a user study is performed to compare the agents to human play traces. |
Tasks | |
Published | 2019-07-15 |
URL | https://arxiv.org/abs/1907.06570v1 |
https://arxiv.org/pdf/1907.06570v1.pdf | |
PWC | https://paperswithcode.com/paper/automated-playtesting-of-matching-tile-games |
Repo | |
Framework | |
3D human action analysis and recognition through GLAC descriptor on 2D motion and static posture images
Title | 3D human action analysis and recognition through GLAC descriptor on 2D motion and static posture images |
Authors | Mohammad Farhad Bulbul, Saiful Islam, Hazrat Ali |
Abstract | In this paper, we present an approach for identification of actions within depth action videos. First, we process the video to get motion history images (MHIs) and static history images (SHIs) corresponding to an action video based on the use of the 3D Motion Trail Model (3DMTM). We then characterize the action video by extracting the Gradient Local Auto-Correlations (GLAC) features from the SHIs and the MHIs. The two sets of features, i.e., GLAC features from MHIs and GLAC features from SHIs, are concatenated to obtain a representation vector for the action. Finally, we perform the classification on all the action samples by using the l2-regularized Collaborative Representation Classifier (l2-CRC) to recognize different human actions in an effective way. We evaluate the proposed method on three action datasets: MSR-Action3D, DHA and UTD-MHAD. Through experimental results, we observe that the proposed method outperforms other approaches. |
Tasks | |
Published | 2019-03-19 |
URL | http://arxiv.org/abs/1904.00764v1 |
http://arxiv.org/pdf/1904.00764v1.pdf | |
PWC | https://paperswithcode.com/paper/3d-human-action-analysis-and-recognition |
Repo | |
Framework | |
Optimal Transport Based Generative Autoencoders
Title | Optimal Transport Based Generative Autoencoders |
Authors | Oliver Zhang, Ruei-Sung Lin, Yuchuan Gou |
Abstract | The field of deep generative modeling is dominated by generative adversarial networks (GANs). However, the training of GANs often lacks stability, fails to converge, and suffers from mode collapse. It takes an assortment of tricks to solve these problems, which may be difficult to understand for those seeking to apply generative modeling. Instead, we propose two novel generative autoencoders, AE-OTtrans and AE-OTgen, which rely on optimal transport instead of adversarial training. AE-OTtrans and AE-OTgen, unlike VAE and WAE, preserve the manifold of the data; they do not force the latent distribution to match a normal distribution, resulting in higher-quality images. AE-OTtrans and AE-OTgen also produce images of higher diversity compared to their predecessor, AE-OT. We show that AE-OTtrans and AE-OTgen surpass GANs on the MNIST and FashionMNIST datasets. Furthermore, we show that AE-OTtrans and AE-OTgen achieve state-of-the-art results on the MNIST, FashionMNIST, and CelebA image sets compared to other non-adversarial generative models. |
Tasks | |
Published | 2019-10-16 |
URL | https://arxiv.org/abs/1910.07636v1 |
https://arxiv.org/pdf/1910.07636v1.pdf | |
PWC | https://paperswithcode.com/paper/optimal-transport-based-generative |
Repo | |
Framework | |
Neural Drum Machine : An Interactive System for Real-time Synthesis of Drum Sounds
Title | Neural Drum Machine : An Interactive System for Real-time Synthesis of Drum Sounds |
Authors | Cyran Aouameur, Philippe Esling, Gaëtan Hadjeres |
Abstract | In this work, we introduce a system for real-time generation of drum sounds. This system is composed of two parts: a generative model for drum sounds together with a Max4Live plugin providing intuitive controls on the generative process. The generative model consists of a Conditional Wasserstein autoencoder (CWAE), which learns to generate Mel-scaled magnitude spectrograms of short percussion samples, coupled with a Multi-Head Convolutional Neural Network (MCNN) which estimates the corresponding audio signal from the magnitude spectrogram. The design of this model makes it lightweight, so that it allows one to perform real-time generation of novel drum sounds on an average CPU, removing the need for the users to possess dedicated hardware in order to use this system. We then present our Max4Live interface designed to interact with this generative model. With this setup, the system can be easily integrated into a studio-production environment and enhance the creative process. Finally, we discuss the advantages of our system and how the interaction of music producers with such tools could change the way drum tracks are composed. |
Tasks | |
Published | 2019-07-04 |
URL | https://arxiv.org/abs/1907.02637v2 |
https://arxiv.org/pdf/1907.02637v2.pdf | |
PWC | https://paperswithcode.com/paper/neural-drum-machine-an-interactive-system-for |
Repo | |
Framework | |
Output-weighted optimal sampling for Bayesian regression and rare event statistics using few samples
Title | Output-weighted optimal sampling for Bayesian regression and rare event statistics using few samples |
Authors | Themistoklis P. Sapsis |
Abstract | For many important problems the quantity of interest is an unknown function of the parameters, which is a random vector with known statistics. Since the dependence of the output on this random vector is unknown, the challenge is to identify its statistics, using the minimum number of function evaluations. This problem can be seen in the context of active learning or optimal experimental design. We employ Bayesian regression to represent the derived model uncertainty due to a finite and small number of input-output pairs. In this context we evaluate existing methods for optimal sample selection, such as model error minimization and mutual information maximization. We show that for the case of known output variance, the commonly employed criteria in the literature do not take into account the output values of the existing input-output pairs, while for the case of unknown output variance this dependence can be very weak. We introduce a criterion that takes into account the values of the output for the existing samples and adaptively selects inputs from regions of the parameter space which have an important contribution to the output. The new method allows for application to high-dimensional inputs, paving the way for optimal experimental design in high dimensions. |
Tasks | Active Learning |
Published | 2019-07-17 |
URL | https://arxiv.org/abs/1907.07552v2 |
https://arxiv.org/pdf/1907.07552v2.pdf | |
PWC | https://paperswithcode.com/paper/output-weighted-optimal-sampling-for-bayesian |
Repo | |
Framework | |
Significance Tests for Neural Networks
Title | Significance Tests for Neural Networks |
Authors | Enguerrand Horel, Kay Giesecke |
Abstract | We develop a pivotal test to assess the statistical significance of the feature variables in a single-layer feedforward neural network regression model. We propose a gradient-based test statistic and study its asymptotics using nonparametric techniques. Under technical conditions, the limiting distribution is given by a mixture of chi-square distributions. The tests enable one to discern the impact of individual variables on the prediction of a neural network. The test statistic can be used to rank variables according to their influence. Simulation results illustrate the computational efficiency and the performance of the test. An empirical application to house price valuation highlights the behavior of the test using actual data. |
Tasks | |
Published | 2019-02-16 |
URL | https://arxiv.org/abs/1902.06021v2 |
https://arxiv.org/pdf/1902.06021v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-explainable-ai-significance-tests-for |
Repo | |
Framework | |
An Evaluation of Feature Matchers for Fundamental Matrix Estimation
Title | An Evaluation of Feature Matchers for Fundamental Matrix Estimation |
Authors | Jia-Wang Bian, Yu-Huan Wu, Ji Zhao, Yun Liu, Le Zhang, Ming-Ming Cheng, Ian Reid |
Abstract | Matching two images while estimating their relative geometry is a key step in many computer vision applications. For decades, a well-established pipeline, consisting of SIFT, RANSAC, and the 8-point algorithm, has been used for this task. Recently, many new approaches were proposed and shown to outperform previous alternatives on standard benchmarks, including learned features, correspondence pruning algorithms, and robust estimators. However, whether it is beneficial to incorporate them into the classic pipeline is less investigated. To this end, we are interested in i) evaluating the performance of these recent algorithms in the context of image matching and epipolar geometry estimation, and ii) leveraging them to design more practical registration systems. The experiments are conducted on four large-scale datasets using strictly defined evaluation metrics, and the promising results provide insight into which algorithms suit which scenarios. Based on these results, we propose three high-quality matching systems and a Coarse-to-Fine RANSAC estimator. They show remarkable performance and have potential for a large range of computer vision tasks. To facilitate future research, the full evaluation pipeline and the proposed methods are made publicly available. |
Tasks | |
Published | 2019-08-26 |
URL | https://arxiv.org/abs/1908.09474v2 |
https://arxiv.org/pdf/1908.09474v2.pdf | |
PWC | https://paperswithcode.com/paper/an-evaluation-of-feature-matchers |
Repo | |
Framework | |
Proceedings 3rd Workshop on formal reasoning about Causation, Responsibility, and Explanations in Science and Technology
Title | Proceedings 3rd Workshop on formal reasoning about Causation, Responsibility, and Explanations in Science and Technology |
Authors | Bernd Finkbeiner, Samantha Kleinberg |
Abstract | The CREST 2018 workshop is the third in a series of workshops addressing formal approaches to reasoning about causation in systems engineering. The topic of formally identifying the cause(s) of specific events (usually some form of failure) and explaining why they occurred is increasingly the focus of several disjoint communities. The main objective of CREST is to bring together researchers and practitioners from industry and academia in order to enable discussions of how explicit and implicit reasoning about causation is performed. A further objective is to link to the foundations of causal reasoning in the philosophy of sciences and to causal reasoning performed in other areas of computer science, engineering, and beyond. |
Tasks | |
Published | 2019-01-01 |
URL | http://arxiv.org/abs/1901.00073v1 |
http://arxiv.org/pdf/1901.00073v1.pdf | |
PWC | https://paperswithcode.com/paper/proceedings-3rd-workshop-on-formal-reasoning |
Repo | |
Framework | |
Weakly-Supervised White and Grey Matter Segmentation in 3D Brain Ultrasound
Title | Weakly-Supervised White and Grey Matter Segmentation in 3D Brain Ultrasound |
Authors | Beatrice Demiray, Julia Rackerseder, Stevica Bozhinoski, Nassir Navab |
Abstract | Although the segmentation of brain structures in ultrasound helps initialize image-based registration, assists brain shift compensation, and provides interventional decision support, the task of segmenting grey and white matter in cranial ultrasound is very challenging and has not been addressed yet. We train a multi-scale fully convolutional neural network simultaneously for two classes in order to segment real clinical 3D ultrasound data. Parallel pathways working at different levels of resolution account for high-frequency speckle noise and global 3D image features. To ensure reproducibility, the publicly available RESECT dataset is utilized for training and cross-validation. Due to the absence of a ground truth, we train with weakly annotated labels. We implement label transfer from MRI to US, which is prone to a residual but inevitable registration error. To further improve results, we perform transfer learning using synthetic US data. The resulting method leads to excellent Dice scores of 0.7080, 0.8402 and 0.9315 for grey matter, white matter and background. Our proposed methodology sets an unparalleled standard for white and grey matter segmentation in 3D intracranial ultrasound. |
Tasks | Transfer Learning |
Published | 2019-04-10 |
URL | http://arxiv.org/abs/1904.05191v3 |
http://arxiv.org/pdf/1904.05191v3.pdf | |
PWC | https://paperswithcode.com/paper/weakly-supervised-white-and-grey-matter |
Repo | |
Framework | |
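For readers unfamiliar with the per-class Dice scores quoted in the abstract above, the following minimal sketch shows the standard Dice coefficient, 2|P ∩ T| / (|P| + |T|), computed for one class on flattened label sequences. The function name, the integer label encoding (0 = background, 1 = grey matter, 2 = white matter), and the toy data are assumptions for illustration; the paper's own evaluation code is not shown in the source.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int], label: int) -> float:
    """Dice coefficient for one class over flattened voxel labels."""
    p = [x == label for x in pred]
    t = [x == label for x in target]
    intersection = sum(a and b for a, b in zip(p, t))
    denom = sum(p) + sum(t)
    # Convention assumed here: a class absent from both volumes scores 1.0.
    return 1.0 if denom == 0 else 2.0 * intersection / denom


if __name__ == "__main__":
    # Hypothetical encoding: 0 = background, 1 = grey matter, 2 = white matter.
    pred = [0, 1, 1, 2]
    target = [0, 1, 2, 2]
    for label in (0, 1, 2):
        print(label, dice_score(pred, target, label))
```

A 3D volume would be flattened to one sequence of voxel labels before calling the function, and the score is reported separately per class, as in the abstract.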
Learning to Manipulate Object Collections Using Grounded State Representations
Title | Learning to Manipulate Object Collections Using Grounded State Representations |
Authors | Matthew Wilson, Tucker Hermans |
Abstract | We propose a method for sim-to-real robot learning which exploits simulator state information in a way that scales to many objects. First, we train a pair of encoders on raw object pose targets to learn representations that accurately capture the state information of a multi-object environment. Second, we use these encoders in a reinforcement learning algorithm to train image-based policies capable of manipulating many objects. Our pair of encoders consists of a convolutional neural network (CNN) which consumes RGB images and is used in our policy network, and a graph neural network (GNN) which directly consumes a set of raw object poses and is used for reward calculation and value estimation. We evaluate our method on the task of pushing a collection of objects to desired tabletop regions. Compared to methods which rely only on images or use fixed-length state encodings, our method achieves higher success rates, performs well in the real world without fine tuning, and generalizes to different numbers and types of objects not seen during training. |
Tasks | |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.07876v2 |
https://arxiv.org/pdf/1909.07876v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-manipulate-object-collections |
Repo | |
Framework | |
AutoSeM: Automatic Task Selection and Mixing in Multi-Task Learning
Title | AutoSeM: Automatic Task Selection and Mixing in Multi-Task Learning |
Authors | Han Guo, Ramakanth Pasunuru, Mohit Bansal |
Abstract | Multi-task learning (MTL) has achieved success over a wide range of problems, where the goal is to improve the performance of a primary task using a set of relevant auxiliary tasks. However, when the usefulness of the auxiliary tasks w.r.t. the primary task is not known a priori, the success of MTL models depends on the correct choice of these auxiliary tasks and also a balanced mixing ratio of these tasks during alternate training. These two problems could be resolved via manual intuition or hyper-parameter tuning over all combinatorial task choices, but this introduces inductive bias or is not scalable when the number of candidate auxiliary tasks is very large. To address these issues, we present AutoSeM, a two-stage MTL pipeline, where the first stage automatically selects the most useful auxiliary tasks via a Beta-Bernoulli multi-armed bandit with Thompson Sampling, and the second stage learns the training mixing ratio of these selected auxiliary tasks via a Gaussian Process based Bayesian optimization framework. We conduct several MTL experiments on the GLUE language understanding tasks, and show that our AutoSeM framework can successfully find relevant auxiliary tasks and automatically learn their mixing ratio, achieving significant performance boosts on several primary tasks. Finally, we present ablations for each stage of AutoSeM and analyze the learned auxiliary task choices. |
Tasks | Multi-Task Learning |
Published | 2019-04-08 |
URL | http://arxiv.org/abs/1904.04153v1 |
http://arxiv.org/pdf/1904.04153v1.pdf | |
PWC | https://paperswithcode.com/paper/autosem-automatic-task-selection-and-mixing |
Repo | |
Framework | |
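The first AutoSeM stage relies on a Beta-Bernoulli multi-armed bandit with Thompson Sampling. The sketch below shows that generic technique only, using Python's standard library; it is not the authors' implementation. Each arm stands in for a candidate auxiliary task, and the reward is a simulated Bernoulli draw with made-up success rates, since the abstract does not say how the reward is defined. All names here are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def thompson_select(successes: Sequence[int], failures: Sequence[int],
                    rng: random.Random) -> int:
    """Sample from each arm's Beta posterior and pick the largest sample."""
    # Beta(1, 1) uniform prior: posterior is Beta(successes + 1, failures + 1).
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_rates: Sequence[float], rounds: int,
               seed: int = 0) -> Tuple[List[int], List[int]]:
    """Simulate the bandit; `true_rates` are assumed, illustrative reward rates."""
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    for _ in range(rounds):
        arm = thompson_select(successes, failures, rng)
        if rng.random() < true_rates[arm]:  # simulated Bernoulli reward
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


if __name__ == "__main__":
    s, f = run_bandit([0.1, 0.8], rounds=500)
    pulls = [a + b for a, b in zip(s, f)]
    print("pulls per arm:", pulls)
```

After enough rounds, arms with higher posterior mean are pulled far more often, which is what makes the posterior usable for ranking candidate tasks.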
Towards Automated Machine Learning: Evaluation and Comparison of AutoML Approaches and Tools
Title | Towards Automated Machine Learning: Evaluation and Comparison of AutoML Approaches and Tools |
Authors | Anh Truong, Austin Walters, Jeremy Goodsitt, Keegan Hines, C. Bayan Bruss, Reza Farivar |
Abstract | There has been considerable growth and interest in industrial applications of machine learning (ML) in recent years. ML engineers, as a consequence, are in high demand across the industry, yet improving the efficiency of ML engineers remains a fundamental challenge. Automated machine learning (AutoML) has emerged as a way to save time and effort on repetitive tasks in ML pipelines, such as data pre-processing, feature engineering, model selection, hyperparameter optimization, and prediction result analysis. In this paper, we investigate the current state of AutoML tools aiming to automate these tasks. We conduct various evaluations of the tools on many datasets, in different data segments, to examine their performance, and compare their advantages and disadvantages on different test cases. |
Tasks | AutoML, Feature Engineering, Hyperparameter Optimization, Model Selection |
Published | 2019-08-15 |
URL | https://arxiv.org/abs/1908.05557v2 |
https://arxiv.org/pdf/1908.05557v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-automated-machine-learning-evaluation |
Repo | |
Framework | |