January 30, 2020

Paper Group ANR 266

Graph Mining Meets Crowdsourcing: Extracting Experts for Answer Aggregation

Title Graph Mining Meets Crowdsourcing: Extracting Experts for Answer Aggregation
Authors Yasushi Kawase, Yuko Kuroki, Atsushi Miyauchi
Abstract Aggregating responses from crowd workers is a fundamental task in the process of crowdsourcing. In cases where a few experts are overwhelmed by a large number of non-experts, most answer aggregation algorithms such as the majority voting fail to identify the correct answers. Therefore, it is crucial to extract reliable experts from the crowd workers. In this study, we introduce the notion of “expert core”, which is a set of workers that is very unlikely to contain a non-expert. We design a graph-mining-based efficient algorithm that exactly computes the expert core. To answer the aggregation task, we propose two types of algorithms. The first one incorporates the expert core into existing answer aggregation algorithms such as the majority voting, whereas the second one utilizes information provided by the expert core extraction algorithm pertaining to the reliability of workers. We then give a theoretical justification for the first type of algorithm. Computational experiments using synthetic and real-world datasets demonstrate that our proposed answer aggregation algorithms outperform state-of-the-art algorithms.
Tasks
Published 2019-05-17
URL https://arxiv.org/abs/1905.08088v1
PDF https://arxiv.org/pdf/1905.08088v1.pdf
PWC https://paperswithcode.com/paper/graph-mining-meets-crowdsourcing-extracting
Repo
Framework
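
The first type of algorithm described in the abstract above restricts an existing aggregator, such as majority voting, to a set of extracted experts. The following is a minimal sketch of that idea only: the worker names, the `answers` layout, and `hypothetical_core` are illustrative assumptions, and the expert set is supplied by hand rather than computed by the paper's graph-mining algorithm.

```python
from collections import Counter


def majority_vote(answers, workers=None):
    """Return the most common label per task.

    answers: dict mapping worker -> dict mapping task -> label.
    workers: optional iterable of workers to restrict the vote to.
    """
    selected = answers if workers is None else {w: answers[w] for w in workers}
    tallies = {}
    for labels in selected.values():
        for task, label in labels.items():
            tallies.setdefault(task, Counter())[label] += 1
    return {task: tally.most_common(1)[0][0] for task, tally in tallies.items()}


# Toy data (assumption): two experts are outnumbered by three non-experts.
answers = {
    "expert_1": {"q1": "A", "q2": "C"},
    "expert_2": {"q1": "A", "q2": "C"},
    "novice_1": {"q1": "B", "q2": "D"},
    "novice_2": {"q1": "B", "q2": "D"},
    "novice_3": {"q1": "B", "q2": "D"},
}

# Hypothetical expert set, given by hand for illustration.
hypothetical_core = ["expert_1", "expert_2"]

all_votes = majority_vote(answers)                      # vote over everyone
core_votes = majority_vote(answers, hypothetical_core)  # vote over the core only
```

In this toy data the unrestricted vote follows the non-expert majority, while the vote restricted to the hand-picked core follows the two experts.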

The Daunting Task of Real-World Textual Style Transfer Auto-Evaluation

Title The Daunting Task of Real-World Textual Style Transfer Auto-Evaluation
Authors Richard Yuanzhe Pang
Abstract The difficulty of textual style transfer lies in the lack of parallel corpora. Numerous advances have been proposed for unsupervised generation. However, significant problems remain with the auto-evaluation of style transfer tasks. Based on the summary of Pang and Gimpel (2018) and Mir et al. (2019), style transfer evaluations rely on three criteria: style accuracy of transferred sentences, content similarity between original and transferred sentences, and fluency of transferred sentences. We elucidate the problematic current state of style transfer research. Given that current tasks do not represent real use cases of style transfer, the current auto-evaluation approach is flawed. This discussion aims to encourage researchers to think about the future of style transfer and style transfer evaluation research.
Tasks Style Transfer
Published 2019-10-09
URL https://arxiv.org/abs/1910.03747v2
PDF https://arxiv.org/pdf/1910.03747v2.pdf
PWC https://paperswithcode.com/paper/the-daunting-task-of-real-world-textual-style
Repo
Framework

Approximating probabilistic models as weighted finite automata

Title Approximating probabilistic models as weighted finite automata
Authors Ananda Theertha Suresh, Brian Roark, Michael Riley, Vlad Schogol
Abstract Weighted finite automata (WFA) are often used to represent probabilistic models, such as $n$-gram language models, since they are efficient for recognition tasks in time and space. The probabilistic source to be represented as a WFA, however, may come in many forms. Given a generic probabilistic model over sequences, we propose an algorithm to approximate it as a weighted finite automaton such that the Kullback-Leibler divergence between the source model and the WFA target model is minimized. The proposed algorithm involves a counting step and a difference of convex optimization step, both of which can be performed efficiently. We demonstrate the usefulness of our approach on various tasks, including distilling $n$-gram models from neural models, building compact language models, and building open-vocabulary character models. The algorithms used for these experiments are available in an open-source software library.
Tasks
Published 2019-05-21
URL https://arxiv.org/abs/1905.08701v2
PDF https://arxiv.org/pdf/1905.08701v2.pdf
PWC https://paperswithcode.com/paper/approximating-probabilistic-models-as
Repo
Framework
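
The objective named in the abstract above is the Kullback-Leibler divergence between a source model and its approximation. As a small illustration of that quantity only (not of the paper's counting or difference-of-convex steps), the sketch below computes KL(p || q) for two discrete distributions; the function name and the toy distributions are assumptions.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for distributions given as dicts outcome -> probability."""
    total = 0.0
    for outcome, p_x in p.items():
        if p_x == 0.0:
            continue  # terms with p(x) = 0 contribute nothing
        q_x = q.get(outcome, 0.0)
        if q_x == 0.0:
            return math.inf  # q assigns zero mass where p does not
        total += p_x * math.log(p_x / q_x)
    return total


# Toy distributions (assumption): a source and a coarser approximation.
source = {"a": 0.5, "b": 0.25, "c": 0.25}
approx = {"a": 0.4, "b": 0.4, "c": 0.2}

divergence = kl_divergence(source, approx)
```

The divergence is zero when the two distributions coincide and infinite when the approximation gives zero probability to an outcome the source can produce, which is one reason approximations of this kind need to cover the support of the source.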

CAESAR source finder: recent developments and testing

Title CAESAR source finder: recent developments and testing
Authors S. Riggi, F. Vitello, U. Becciani, C. Buemi, F. Bufano, A. Calanducci, F. Cavallaro, A. Costa, A. Ingallinera, P. Leto, S. Loru, R. P. Norris, F. Schillirò, E. Sciacca, C. Trigilio, G. Umana
Abstract A new era in radioastronomy will begin with the upcoming large-scale surveys planned at the Australian Square Kilometre Array Pathfinder (ASKAP). ASKAP started its Early Science program in October 2017 and several target fields were observed during the array commissioning phase. The SCORPIO field was the first observed in the Galactic Plane in Band 1 (792-1032 MHz) using 15 commissioned antennas. The achieved sensitivity and large field of view already allow the discovery of new sources and the survey of thousands of existing ones with improved precision with respect to previous surveys. Data analysis is currently ongoing to deliver the first source catalogue. Given the increased scale of the data, source extraction and characterization, even in this Early Science phase, have to be carried out in a mostly automated way. This process presents significant challenges due to the presence of extended objects and diffuse emission close to the Galactic Plane. In this context we have extended and optimized a novel source finding tool, named CAESAR, to allow extraction of both compact and extended sources from radio maps. A number of developments have been made, driven by the analysis of the SCORPIO map and in view of the future ASKAP Galactic Plane survey. The main goals are the improvement of algorithm performance and scalability as well as of software maintainability and usability within the radio community. In this paper we present the current status of CAESAR and report a first systematic characterization of its performance for both compact and extended sources using simulated maps. Future prospects are discussed in light of the obtained results.
Tasks
Published 2019-09-13
URL https://arxiv.org/abs/1909.06116v1
PDF https://arxiv.org/pdf/1909.06116v1.pdf
PWC https://paperswithcode.com/paper/caesar-source-finder-recent-developments-and
Repo
Framework

Analysis of “User-Specific Effect” and Impact of Operator Skills on Fingerprint PAD Systems

Title Analysis of “User-Specific Effect” and Impact of Operator Skills on Fingerprint PAD Systems
Authors Giulia Orrù, Pierluigi Tuveri, Luca Ghiani, Gian Luca Marcialis
Abstract Fingerprint liveness detection, or presentation attack detection (PAD), that is, the ability to detect whether a fingerprint submitted to an electronic capture device is authentic or made of artificial materials, has attracted the attention of the scientific community, and recently machine learning approaches based on deep networks have opened novel scenarios. A significant step forward came from the public availability of large data sets, in particular those released during the International Fingerprint Liveness Detection Competition (LivDet). Among others, the fifth edition, carried out in 2017, posed two additional challenges to the participants which were not detailed in the official report. In this paper, we extend that report by focusing on them: the first explored the case in which the PAD is integrated into a fingerprint verification system, where user templates are also available and the designer is not constrained to refer only to a generic user population for the PAD settings. The second addresses the attackers' ability to exploit the provided fakes, and how this ability impacts the final performance. Together, these two challenges may establish to what extent fingerprint presentation attacks are an actual threat and how to exploit additional information to make the PAD more effective.
Tasks
Published 2019-07-18
URL https://arxiv.org/abs/1907.08068v1
PDF https://arxiv.org/pdf/1907.08068v1.pdf
PWC https://paperswithcode.com/paper/analysis-of-user-specific-effect-and-impact
Repo
Framework

Real-Time Highly Accurate Dense Depth on a Power Budget using an FPGA-CPU Hybrid SoC

Title Real-Time Highly Accurate Dense Depth on a Power Budget using an FPGA-CPU Hybrid SoC
Authors Oscar Rahnama, Tommaso Cavallari, Stuart Golodetz, Alessio Tonioni, Thomas Joy, Luigi Di Stefano, Simon Walker, Philip H. S. Torr
Abstract Obtaining highly accurate depth from stereo images in real time has many applications across computer vision and robotics, but in some contexts, upper bounds on power consumption constrain the feasible hardware to embedded platforms such as FPGAs. Whilst various stereo algorithms have been deployed on these platforms, usually cut down to better match the embedded architecture, certain key parts of the more advanced algorithms, e.g. those that rely on unpredictable access to memory or are highly iterative in nature, are difficult to deploy efficiently on FPGAs, and thus the depth quality that can be achieved is limited. In this paper, we leverage an FPGA-CPU chip to propose a novel, sophisticated, stereo approach that combines the best features of SGM and ELAS-based methods to compute highly accurate dense depth in real time. Our approach achieves an 8.7% error rate on the challenging KITTI 2015 dataset at over 50 FPS, with a power consumption of only 5W.
Tasks
Published 2019-07-17
URL https://arxiv.org/abs/1907.07745v1
PDF https://arxiv.org/pdf/1907.07745v1.pdf
PWC https://paperswithcode.com/paper/real-time-highly-accurate-dense-depth-on-a
Repo
Framework

Machine Learning for Precipitation Nowcasting from Radar Images

Title Machine Learning for Precipitation Nowcasting from Radar Images
Authors Shreya Agrawal, Luke Barrington, Carla Bromberg, John Burge, Cenk Gazen, Jason Hickey
Abstract High-resolution nowcasting is an essential tool needed for effective adaptation to climate change, particularly for extreme weather. As Deep Learning (DL) techniques have shown dramatic promise in many domains, including the geosciences, we present an application of DL to the problem of precipitation nowcasting, i.e., high-resolution (1 km x 1 km) short-term (1 hour) predictions of precipitation. We treat forecasting as an image-to-image translation problem and leverage the power of the ubiquitous UNET convolutional neural network. We find this performs favorably when compared to three commonly used models: optical flow, persistence and NOAA’s numerical one-hour HRRR nowcasting prediction.
Tasks Image-to-Image Translation, Optical Flow Estimation
Published 2019-12-11
URL https://arxiv.org/abs/1912.12132v1
PDF https://arxiv.org/pdf/1912.12132v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-for-precipitation-nowcasting
Repo
Framework
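
One of the comparison models named in the abstract above is persistence, which is commonly understood as forecasting that the next frame equals the most recent observation. The sketch below illustrates that baseline on a toy grid; the frame values, the nested-list layout, and the use of mean squared error are assumptions for illustration and are not taken from the paper.

```python
def persistence_forecast(frames):
    """Forecast the next frame as a copy of the most recent observed frame."""
    return [row[:] for row in frames[-1]]


def mean_squared_error(prediction, target):
    """Mean squared error between two equally shaped 2D grids."""
    cell_count = sum(len(row) for row in target)
    squared = sum(
        (p - t) ** 2
        for pred_row, target_row in zip(prediction, target)
        for p, t in zip(pred_row, target_row)
    )
    return squared / cell_count


# Toy 2x2 precipitation grids (assumption), oldest frame first.
frames = [
    [[0, 1], [0, 0]],
    [[0, 2], [1, 0]],
]
observed_next = [[0, 2], [1, 1]]

forecast = persistence_forecast(frames)
baseline_error = mean_squared_error(forecast, observed_next)
```

Any learned model can then be scored with the same error function on the same frames to see whether it improves on this trivial baseline.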

Representation Learning with Multisets

Title Representation Learning with Multisets
Authors Vasco Portilheiro
Abstract We study the problem of learning permutation invariant representations that can capture “flexible” notions of containment. We formalize this problem via a measure theoretic definition of multisets, and obtain a theoretically-motivated learning model. We propose training this model on a novel task: predicting the size of the symmetric difference (or intersection) between pairs of multisets. We demonstrate that our model not only performs very well on predicting containment relations (and more effectively predicts the sizes of symmetric differences and intersections than DeepSets-based approaches with unconstrained object representations), but that it also learns meaningful representations.
Tasks Representation Learning
Published 2019-11-19
URL https://arxiv.org/abs/1911.08577v1
PDF https://arxiv.org/pdf/1911.08577v1.pdf
PWC https://paperswithcode.com/paper/representation-learning-with-multisets-1
Repo
Framework
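
The training targets mentioned in the abstract above, the sizes of the symmetric difference and intersection of two multisets, can be computed exactly for small examples. The sketch below does so with Python's `collections.Counter`; it illustrates the quantities being predicted, not the learned model, and the helper names and toy multisets are assumptions.

```python
from collections import Counter


def intersection_size(a, b):
    """Number of elements, with multiplicity, shared by multisets a and b."""
    return sum((a & b).values())


def symmetric_difference_size(a, b):
    """Number of elements, with multiplicity, in exactly one of a and b."""
    return sum((a - b).values()) + sum((b - a).values())


def is_contained(a, b):
    """True if every element of a appears in b at least as many times."""
    return not (a - b)


# Toy multisets (assumption): {a, a, b, c} and {a, b, b, d}.
left = Counter("aabc")
right = Counter("abbd")

shared = intersection_size(left, right)
differing = symmetric_difference_size(left, right)
```

For these two multisets the sizes satisfy |A| + |B| - 2|A ∩ B| = |A Δ B|, which gives a quick consistency check on any pair of predicted values.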

Cross-modal supervised learning for better acoustic representations

Title Cross-modal supervised learning for better acoustic representations
Authors Shaoyong Jia, Xin Shu, Yang Yang, Dawei Liang, Qiyue Liu, Junhui Liu
Abstract Obtaining large-scale human-labeled datasets to train acoustic representation models is a very challenging task. In contrast, we can easily collect data with machine-generated labels. In this work, we propose to exploit machine-generated labels to learn better acoustic representations, based on the synchronization between vision and audio. Firstly, we collect a large-scale video dataset with 15 million samples, totaling 16,320 hours. Each video is 3 to 5 seconds in length and annotated automatically by publicly available visual and audio classification models. Secondly, we train various classical convolutional neural networks (CNNs) including VGGish, ResNet 50 and Mobilenet v2. We also make several improvements to VGGish and achieve better results. Finally, we transfer our models to three external standard benchmarks for the audio classification task, and achieve a significant performance boost over the state-of-the-art results. Models and codes are available at: https://github.com/Deeperjia/vgg-like-audio-models.
Tasks Audio Classification
Published 2019-11-15
URL https://arxiv.org/abs/1911.07917v2
PDF https://arxiv.org/pdf/1911.07917v2.pdf
PWC https://paperswithcode.com/paper/cross-modal-supervised-learning-for-better
Repo
Framework

Model-Agnostic Counterfactual Explanations for Consequential Decisions

Title Model-Agnostic Counterfactual Explanations for Consequential Decisions
Authors Amir-Hossein Karimi, Gilles Barthe, Borja Balle, Isabel Valera
Abstract Predictive models are being increasingly used to support consequential decision making at the individual level in contexts such as pretrial bail and loan approval. As a result, there is increasing social and legal pressure to provide explanations that help the affected individuals not only to understand why a prediction was output, but also how to act to obtain a desired outcome. To this end, several works have proposed optimization-based methods to generate nearest counterfactual explanations. However, these methods are often restricted to a particular subset of models (e.g., decision trees or linear models) and differentiable distance functions. In contrast, we build on standard theory and tools from formal verification and propose a novel algorithm that solves a sequence of satisfiability problems, where both the distance function (objective) and predictive model (constraints) are represented as logic formulae. As shown by our experiments on real-world data, our algorithm is: i) model-agnostic (linear or non-linear, differentiable or non-differentiable, convex or non-convex); ii) data-type-agnostic (heterogeneous features); iii) distance-agnostic ($\ell_0, \ell_1, \ell_\infty$, and combinations thereof); iv) able to generate plausible and diverse counterfactuals for any sample (i.e., 100% coverage); and v) at provably optimal distances.
Tasks Decision Making
Published 2019-05-27
URL https://arxiv.org/abs/1905.11190v5
PDF https://arxiv.org/pdf/1905.11190v5.pdf
PWC https://paperswithcode.com/paper/model-agnostic-counterfactual-explanations
Repo
Framework

Personalized Music Recommendation with Triplet Network

Title Personalized Music Recommendation with Triplet Network
Authors Haoting Liang, Donghuo Zeng, Yi Yu, Keizo Oyama
Abstract Many online music services have emerged in recent years, so effective music recommendation systems are desirable. Common problems in recommendation systems, such as feature representation, distance measurement, and cold start, are also challenges for music recommendation. In this paper, I propose a triplet neural network that exploits both positive and negative samples to learn the representation and distance measure between users and items in order to solve the recommendation task.
Tasks Recommendation Systems
Published 2019-08-10
URL https://arxiv.org/abs/1908.03738v1
PDF https://arxiv.org/pdf/1908.03738v1.pdf
PWC https://paperswithcode.com/paper/personalized-music-recommendation-with
Repo
Framework
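
A triplet network of the kind described in the abstract above is usually trained with a margin-based triplet loss that pulls an anchor toward a positive sample and away from a negative one. The sketch below shows that standard loss on hand-written vectors; the Euclidean distance, the margin of 1.0, and the toy user and item embeddings are assumptions, since the abstract does not specify them.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy embeddings (assumption): a user, a track they liked, a track they skipped.
user = [0.0, 0.0]
liked_track = [0.0, 1.0]    # distance 1 from the user
skipped_track = [3.0, 4.0]  # distance 5 from the user

well_separated = triplet_loss(user, liked_track, skipped_track)
violated = triplet_loss(user, skipped_track, liked_track)
```

The loss is zero once the positive item is closer than the negative item by at least the margin, so only triplets that violate the margin contribute a training signal.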

Qwant Research @DEFT 2019: Document matching and information retrieval using clinical cases

Title Qwant Research @DEFT 2019: Document matching and information retrieval using clinical cases
Authors Estelle Maudet, Oralie Cattan, Maureen de Seyssel, Christophe Servan
Abstract This paper reports on Qwant Research's contribution to tasks 2 and 3 of the DEFT 2019 challenge, focusing on the analysis of French clinical cases. Task 2 is a task on semantic similarity between clinical cases and discussions. For this task, we propose an approach based on language models and evaluate the impact of different preprocessing and matching techniques on the results. For task 3, we have developed an information extraction system yielding very encouraging results accuracy-wise. We have experimented with two different approaches, one based on the exclusive use of neural networks, the other based on a linguistic analysis.
Tasks Information Retrieval, Semantic Similarity, Semantic Textual Similarity
Published 2019-07-06
URL https://arxiv.org/abs/1907.05790v1
PDF https://arxiv.org/pdf/1907.05790v1.pdf
PWC https://paperswithcode.com/paper/qwant-research-deft-2019-document-matching
Repo
Framework

On the Role of Time in Learning

Title On the Role of Time in Learning
Authors Alessandro Betti, Marco Gori
Abstract By and large, the process of learning concepts that are embedded in time is regarded as quite a mature research topic. Hidden Markov models and recurrent neural networks are, amongst others, successful approaches to learning from temporal data. In this paper, we claim that the dominant approach of minimizing appropriate risk functions defined over time by classic stochastic gradient might miss the deep interpretation of time given in other fields like physics. We show that a recent reformulation of learning according to the principle of Least Cognitive Action is better suited whenever time is involved in learning. The principle gives rise to a learning process that is driven by differential equations that can somehow describe the process within the same framework as other laws of nature.
Tasks
Published 2019-07-14
URL https://arxiv.org/abs/1907.06198v1
PDF https://arxiv.org/pdf/1907.06198v1.pdf
PWC https://paperswithcode.com/paper/on-the-role-of-time-in-learning
Repo
Framework

An Efficient 3D CNN for Action/Object Segmentation in Video

Title An Efficient 3D CNN for Action/Object Segmentation in Video
Authors Rui Hou, Chen Chen, Rahul Sukthankar, Mubarak Shah
Abstract Convolutional Neural Network (CNN) based image segmentation has made great progress in recent years. However, video object segmentation remains a challenging task due to its high computational complexity. Most of the previous methods employ a two-stream CNN framework to handle spatial and motion features separately. In this paper, we propose an end-to-end encoder-decoder style 3D CNN to aggregate spatial and temporal information simultaneously for video object segmentation. To efficiently process video, we propose 3D separable convolution for the pyramid pooling module and decoder, which dramatically reduces the number of operations while maintaining the performance. Moreover, we also extend our framework to video action segmentation by adding an extra classifier to predict the action label for actors in videos. Extensive experiments on several video datasets demonstrate the superior performance of the proposed approach for action and object segmentation compared to the state-of-the-art.
Tasks Action Segmentation, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation, Visual Object Tracking
Published 2019-07-21
URL https://arxiv.org/abs/1907.08895v1
PDF https://arxiv.org/pdf/1907.08895v1.pdf
PWC https://paperswithcode.com/paper/an-efficient-3d-cnn-for-actionobject
Repo
Framework
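
The abstract above states that a 3D separable convolution reduces the number of operations. The sketch below gives a rough sense of the saving by counting weights under one common factorization, a depthwise 3D convolution followed by a 1x1x1 pointwise convolution. This factorization, the function names, and the layer sizes are assumptions; the paper's exact definition of separable convolution may differ.

```python
def standard_conv3d_weights(kernel, in_channels, out_channels):
    """Weights in a dense 3D convolution with a cubic kernel (bias ignored)."""
    return kernel ** 3 * in_channels * out_channels


def separable_conv3d_weights(kernel, in_channels, out_channels):
    """Weights in a depthwise 3D convolution plus a 1x1x1 pointwise convolution."""
    depthwise = kernel ** 3 * in_channels    # one cubic filter per input channel
    pointwise = in_channels * out_channels   # 1x1x1 channel mixing
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
kernel, in_channels, out_channels = 3, 64, 128

dense = standard_conv3d_weights(kernel, in_channels, out_channels)
separable = separable_conv3d_weights(kernel, in_channels, out_channels)
reduction = dense / separable
```

Because each weight is used once per output position, the same ratio applies to multiply-accumulate operations for a fixed output size under this factorization.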

V2CNet: A Deep Learning Framework to Translate Videos to Commands for Robotic Manipulation

Title V2CNet: A Deep Learning Framework to Translate Videos to Commands for Robotic Manipulation
Authors Anh Nguyen, Thanh-Toan Do, Ian Reid, Darwin G. Caldwell, Nikos G. Tsagarakis
Abstract We propose V2CNet, a new deep learning framework to automatically translate demonstration videos to commands that can be directly used in robotic applications. Our V2CNet has two branches and aims at understanding the demonstration video in a fine-grained manner. The first branch has an encoder-decoder architecture to encode the visual features and sequentially generate the output words as a command, while the second branch uses a Temporal Convolutional Network (TCN) to learn the fine-grained actions. By jointly training both branches, the network is able to model the sequential information of the command while effectively encoding the fine-grained actions. The experimental results on our new large-scale dataset show that V2CNet outperforms recent state-of-the-art methods by a substantial margin, while its output can be applied in real robotic applications. The source code and trained models will be made available.
Tasks
Published 2019-03-23
URL http://arxiv.org/abs/1903.10869v1
PDF http://arxiv.org/pdf/1903.10869v1.pdf
PWC https://paperswithcode.com/paper/v2cnet-a-deep-learning-framework-to-translate
Repo
Framework