January 30, 2020

Paper Group ANR 266

Graph Mining Meets Crowdsourcing: Extracting Experts for Answer Aggregation


Title	Graph Mining Meets Crowdsourcing: Extracting Experts for Answer Aggregation
Authors	Yasushi Kawase, Yuko Kuroki, Atsushi Miyauchi
Abstract	Aggregating responses from crowd workers is a fundamental task in the process of crowdsourcing. In cases where a few experts are overwhelmed by a large number of non-experts, most answer aggregation algorithms such as the majority voting fail to identify the correct answers. Therefore, it is crucial to extract reliable experts from the crowd workers. In this study, we introduce the notion of “expert core”, which is a set of workers that is very unlikely to contain a non-expert. We design a graph-mining-based efficient algorithm that exactly computes the expert core. To answer the aggregation task, we propose two types of algorithms. The first one incorporates the expert core into existing answer aggregation algorithms such as the majority voting, whereas the second one utilizes information provided by the expert core extraction algorithm pertaining to the reliability of workers. We then give a theoretical justification for the first type of algorithm. Computational experiments using synthetic and real-world datasets demonstrate that our proposed answer aggregation algorithms outperform state-of-the-art algorithms.
Tasks
Published	2019-05-17
URL	https://arxiv.org/abs/1905.08088v1
PDF	https://arxiv.org/pdf/1905.08088v1.pdf
PWC	https://paperswithcode.com/paper/graph-mining-meets-crowdsourcing-extracting
Repo
Framework
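
The abstract above contrasts plain majority voting with voting restricted to an extracted set of experts. The following is a minimal sketch of that contrast only. It assumes the expert set is already given; it does not implement the paper's expert core extraction, and the function name, data layout, and worker ids are hypothetical.

```python
from collections import Counter


def majority_vote(answers, workers=None):
    """Aggregate labels per question by majority vote.

    answers: dict mapping worker id -> dict mapping question id -> label.
    workers: optional iterable of worker ids to restrict the vote to
             (for example, a set of experts obtained elsewhere).
    """
    if workers is None:
        workers = answers.keys()
    tallies = {}
    for worker in workers:
        for question, label in answers[worker].items():
            tallies.setdefault(question, Counter())[label] += 1
    # Pick the most frequent label for each question.
    return {q: counts.most_common(1)[0][0] for q, counts in tallies.items()}


# Toy data (made up for illustration): two experts outnumbered by three non-experts.
answers = {
    "e1": {"q1": "A", "q2": "B"},
    "e2": {"q1": "A", "q2": "B"},
    "n1": {"q1": "B", "q2": "A"},
    "n2": {"q1": "B", "q2": "C"},
    "n3": {"q1": "B", "q2": "B"},
}

all_workers = majority_vote(answers)
experts_only = majority_vote(answers, workers=["e1", "e2"])
```

In this toy data the non-experts agree on a wrong label for `q1`, so the unrestricted vote returns `B` while the expert-only vote returns `A`.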

The Daunting Task of Real-World Textual Style Transfer Auto-Evaluation


Title	The Daunting Task of Real-World Textual Style Transfer Auto-Evaluation
Authors	Richard Yuanzhe Pang
Abstract	The difficulty of textual style transfer lies in the lack of parallel corpora. Numerous advances have been proposed for unsupervised generation. However, significant problems remain with the auto-evaluation of style transfer tasks. Based on the summary of Pang and Gimpel (2018) and Mir et al. (2019), style transfer evaluations rely on three criteria: style accuracy of transferred sentences, content similarity between original and transferred sentences, and fluency of transferred sentences. We elucidate the problematic current state of style transfer research. Given that current tasks do not represent real use cases of style transfer, the current auto-evaluation approach is flawed. This discussion aims to prompt researchers to think about the future of style transfer and style transfer evaluation research.
Tasks	Style Transfer
Published	2019-10-09
URL	https://arxiv.org/abs/1910.03747v2
PDF	https://arxiv.org/pdf/1910.03747v2.pdf
PWC	https://paperswithcode.com/paper/the-daunting-task-of-real-world-textual-style
Repo
Framework

Approximating probabilistic models as weighted finite automata


Title	Approximating probabilistic models as weighted finite automata
Authors	Ananda Theertha Suresh, Brian Roark, Michael Riley, Vlad Schogol
Abstract	Weighted finite automata (WFA) are often used to represent probabilistic models, such as $n$-gram language models, since they are efficient for recognition tasks in time and space. The probabilistic source to be represented as a WFA, however, may come in many forms. Given a generic probabilistic model over sequences, we propose an algorithm to approximate it as a weighted finite automaton such that the Kullback-Leibler divergence between the source model and the WFA target model is minimized. The proposed algorithm involves a counting step and a difference of convex optimization step, both of which can be performed efficiently. We demonstrate the usefulness of our approach on various tasks, including distilling $n$-gram models from neural models, building compact language models, and building open-vocabulary character models. The algorithms used for these experiments are available in an open-source software library.
Tasks
Published	2019-05-21
URL	https://arxiv.org/abs/1905.08701v2
PDF	https://arxiv.org/pdf/1905.08701v2.pdf
PWC	https://paperswithcode.com/paper/approximating-probabilistic-models-as
Repo
Framework
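
The objective named in the abstract above is the Kullback-Leibler divergence between a source model and its approximation. As a small illustration of that quantity only, the sketch below computes it for two distributions over a finite set of outcomes. It is not the paper's algorithm (no counting step, no difference-of-convex optimization, no automata), and the dictionary representation is an assumption made for the example.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two finite discrete distributions.

    p, q: dicts mapping outcome -> probability. Outcomes missing from q
    are treated as having probability zero under q.
    """
    total = 0.0
    for outcome, p_x in p.items():
        if p_x == 0.0:
            continue  # 0 * log(0 / q) is taken to be 0
        q_x = q.get(outcome, 0.0)
        if q_x == 0.0:
            return math.inf  # p puts mass where q has none
        total += p_x * math.log(p_x / q_x)
    return total


# Toy distributions (made up for illustration).
source = {"a": 0.5, "b": 0.5}
approx = {"a": 0.25, "b": 0.75}
divergence = kl_divergence(source, approx)
```

The divergence is zero when the two distributions coincide and infinite when the approximation assigns zero probability to an outcome the source can produce.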

CAESAR source finder: recent developments and testing


Title	CAESAR source finder: recent developments and testing
Authors	S. Riggi, F. Vitello, U. Becciani, C. Buemi, F. Bufano, A. Calanducci, F. Cavallaro, A. Costa, A. Ingallinera, P. Leto, S. Loru, R. P. Norris, F. Schillirò, E. Sciacca, C. Trigilio, G. Umana
Abstract	A new era in radioastronomy will begin with the upcoming large-scale surveys planned at the Australian Square Kilometre Array Pathfinder (ASKAP). ASKAP started its Early Science program in October 2017 and several target fields were observed during the array commissioning phase. The SCORPIO field was the first observed in the Galactic Plane in Band 1 (792-1032 MHz) using 15 commissioned antennas. The achieved sensitivity and large field of view already allow the discovery of new sources and the survey of thousands of existing ones with improved precision with respect to previous surveys. Data analysis is currently ongoing to deliver the first source catalogue. Given the increased scale of the data, source extraction and characterization, even in this Early Science phase, have to be carried out in a mostly automated way. This process presents significant challenges due to the presence of extended objects and diffuse emission close to the Galactic Plane. In this context we have extended and optimized a novel source finding tool, named CAESAR, to allow extraction of both compact and extended sources from radio maps. A number of developments have been made, driven by the analysis of the SCORPIO map and in view of the future ASKAP Galactic Plane survey. The main goals are the improvement of algorithm performance and scalability as well as of software maintainability and usability within the radio community. In this paper we present the current status of CAESAR and report a first systematic characterization of its performance for both compact and extended sources using simulated maps. Future prospects are discussed in light of the obtained results.
Tasks
Published	2019-09-13
URL	https://arxiv.org/abs/1909.06116v1
PDF	https://arxiv.org/pdf/1909.06116v1.pdf
PWC	https://paperswithcode.com/paper/caesar-source-finder-recent-developments-and
Repo
Framework

Analysis of “User-Specific Effect” and Impact of Operator Skills on Fingerprint PAD Systems


Title	Analysis of “User-Specific Effect” and Impact of Operator Skills on Fingerprint PAD Systems
Authors	Giulia Orrù, Pierluigi Tuveri, Luca Ghiani, Gian Luca Marcialis
Abstract	Fingerprint liveness detection, or presentation attack detection (PAD), that is, the ability to detect whether a fingerprint submitted to an electronic capture device is authentic or made of some artificial material, has attracted the attention of the scientific community, and machine learning approaches based on deep networks have recently opened novel scenarios. A significant step forward came from the public availability of large datasets, in particular those released during the International Fingerprint Liveness Detection Competition (LivDet). Among others, the fifth edition, held in 2017, posed two additional challenges to the participants that were not detailed in the official report. In this paper, we extend that report by focusing on them: the first explores the case in which the PAD is integrated into a fingerprint verification system, where user templates are also available and the designer is not constrained to refer only to a generic user population for the PAD settings. The second concerns the attackers' ability to exploit the provided fakes, and how this ability affects the final performance. Together, these two challenges may establish to what extent fingerprint presentation attacks are an actual threat and how additional information can be exploited to make the PAD more effective.
Tasks
Published	2019-07-18
URL	https://arxiv.org/abs/1907.08068v1
PDF	https://arxiv.org/pdf/1907.08068v1.pdf
PWC	https://paperswithcode.com/paper/analysis-of-user-specific-effect-and-impact
Repo
Framework

Real-Time Highly Accurate Dense Depth on a Power Budget using an FPGA-CPU Hybrid SoC


Title	Real-Time Highly Accurate Dense Depth on a Power Budget using an FPGA-CPU Hybrid SoC
Authors	Oscar Rahnama, Tommaso Cavallari, Stuart Golodetz, Alessio Tonioni, Thomas Joy, Luigi Di Stefano, Simon Walker, Philip H. S. Torr
Abstract	Obtaining highly accurate depth from stereo images in real time has many applications across computer vision and robotics, but in some contexts, upper bounds on power consumption constrain the feasible hardware to embedded platforms such as FPGAs. Whilst various stereo algorithms have been deployed on these platforms, usually cut down to better match the embedded architecture, certain key parts of the more advanced algorithms, e.g. those that rely on unpredictable access to memory or are highly iterative in nature, are difficult to deploy efficiently on FPGAs, and thus the depth quality that can be achieved is limited. In this paper, we leverage an FPGA-CPU chip to propose a novel, sophisticated, stereo approach that combines the best features of SGM and ELAS-based methods to compute highly accurate dense depth in real time. Our approach achieves an 8.7% error rate on the challenging KITTI 2015 dataset at over 50 FPS, with a power consumption of only 5W.
Tasks
Published	2019-07-17
URL	https://arxiv.org/abs/1907.07745v1
PDF	https://arxiv.org/pdf/1907.07745v1.pdf
PWC	https://paperswithcode.com/paper/real-time-highly-accurate-dense-depth-on-a
Repo
Framework

Machine Learning for Precipitation Nowcasting from Radar Images


Title	Machine Learning for Precipitation Nowcasting from Radar Images
Authors	Shreya Agrawal, Luke Barrington, Carla Bromberg, John Burge, Cenk Gazen, Jason Hickey
Abstract	High-resolution nowcasting is an essential tool needed for effective adaptation to climate change, particularly for extreme weather. As Deep Learning (DL) techniques have shown dramatic promise in many domains, including the geosciences, we present an application of DL to the problem of precipitation nowcasting, i.e., high-resolution (1 km x 1 km) short-term (1 hour) predictions of precipitation. We treat forecasting as an image-to-image translation problem and leverage the power of the ubiquitous UNET convolutional neural network. We find this performs favorably when compared to three commonly used models: optical flow, persistence and NOAA’s numerical one-hour HRRR nowcasting prediction.
Tasks	Image-to-Image Translation, Optical Flow Estimation
Published	2019-12-11
URL	https://arxiv.org/abs/1912.12132v1
PDF	https://arxiv.org/pdf/1912.12132v1.pdf
PWC	https://paperswithcode.com/paper/machine-learning-for-precipitation-nowcasting
Repo
Framework

Representation Learning with Multisets


Title	Representation Learning with Multisets
Authors	Vasco Portilheiro
Abstract	We study the problem of learning permutation invariant representations that can capture “flexible” notions of containment. We formalize this problem via a measure theoretic definition of multisets, and obtain a theoretically-motivated learning model. We propose training this model on a novel task: predicting the size of the symmetric difference (or intersection) between pairs of multisets. We demonstrate that our model not only performs very well on predicting containment relations (and more effectively predicts the sizes of symmetric differences and intersections than DeepSets-based approaches with unconstrained object representations), but that it also learns meaningful representations.
Tasks	Representation Learning
Published	2019-11-19
URL	https://arxiv.org/abs/1911.08577v1
PDF	https://arxiv.org/pdf/1911.08577v1.pdf
PWC	https://paperswithcode.com/paper/representation-learning-with-multisets-1
Repo
Framework
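
The training task described in the abstract above is predicting the size of the symmetric difference (or intersection) between pairs of multisets. For reference, the sketch below computes those two target quantities exactly for ordinary integer-count multisets using Python's `collections.Counter`. This is the classical definition only; the paper's measure-theoretic formulation and its learned model are not reproduced here, and the function names are hypothetical.

```python
from collections import Counter


def intersection_size(a, b):
    """Size of the multiset intersection: sum of element-wise minimum counts."""
    return sum((a & b).values())


def symmetric_difference_size(a, b):
    """Size of the multiset symmetric difference: sum of |count_a - count_b|."""
    # Counter subtraction keeps only positive counts, so both directions are needed.
    return sum((a - b).values()) + sum((b - a).values())


# Toy multisets (made up for illustration).
left = Counter("aab")    # {a: 2, b: 1}
right = Counter("abbc")  # {a: 1, b: 2, c: 1}

shared = intersection_size(left, right)
differing = symmetric_difference_size(left, right)
```

The two quantities are linked by the identity |A| + |B| = |A symmetric-difference B| + 2 |A intersect B|, which gives a quick consistency check on any pair.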


Cross-modal supervised learning for better acoustic representations


Title	Cross-modal supervised learning for better acoustic representations
Authors	Shaoyong Jia, Xin Shu, Yang Yang, Dawei Liang, Qiyue Liu, Junhui Liu
Abstract	Obtaining large-scale human-labeled datasets to train acoustic representation models is a very challenging task. In contrast, we can easily collect data with machine-generated labels. In this work, we propose to exploit machine-generated labels to learn better acoustic representations, based on the synchronization between vision and audio. Firstly, we collect a large-scale video dataset with 15 million samples, totaling 16,320 hours. Each video is 3 to 5 seconds in length and annotated automatically by publicly available visual and audio classification models. Secondly, we train various classical convolutional neural networks (CNNs) including VGGish, ResNet 50 and Mobilenet v2. We also make several improvements to VGGish and achieve better results. Finally, we transfer our models to three external standard benchmarks for the audio classification task, and achieve a significant performance boost over the state-of-the-art results. Models and code are available at: https://github.com/Deeperjia/vgg-like-audio-models.
Tasks	Audio Classification
Published	2019-11-15
URL	https://arxiv.org/abs/1911.07917v2
PDF	https://arxiv.org/pdf/1911.07917v2.pdf
PWC	https://paperswithcode.com/paper/cross-modal-supervised-learning-for-better
Repo
Framework

Model-Agnostic Counterfactual Explanations for Consequential Decisions


Title	Model-Agnostic Counterfactual Explanations for Consequential Decisions
Authors	Amir-Hossein Karimi, Gilles Barthe, Borja Balle, Isabel Valera
Abstract	Predictive models are being increasingly used to support consequential decision making at the individual level in contexts such as pretrial bail and loan approval. As a result, there is increasing social and legal pressure to provide explanations that help the affected individuals not only to understand why a prediction was output, but also how to act to obtain a desired outcome. To this end, several works have proposed optimization-based methods to generate nearest counterfactual explanations. However, these methods are often restricted to a particular subset of models (e.g., decision trees or linear models) and differentiable distance functions. In contrast, we build on standard theory and tools from formal verification and propose a novel algorithm that solves a sequence of satisfiability problems, where both the distance function (objective) and predictive model (constraints) are represented as logic formulae. As shown by our experiments on real-world data, our algorithm is: i) model-agnostic ((non-)linear, (non-)differentiable, (non-)convex); ii) data-type-agnostic (heterogeneous features); iii) distance-agnostic ($\ell_0, \ell_1, \ell_\infty$, and combinations thereof); iv) able to generate plausible and diverse counterfactuals for any sample (i.e., 100% coverage); and v) at provably optimal distances.
Tasks	Decision Making
Published	2019-05-27
URL	https://arxiv.org/abs/1905.11190v5
PDF	https://arxiv.org/pdf/1905.11190v5.pdf
PWC	https://paperswithcode.com/paper/model-agnostic-counterfactual-explanations
Repo
Framework

Personalized Music Recommendation with Triplet Network


Title	Personalized Music Recommendation with Triplet Network
Authors	Haoting Liang, Donghuo Zeng, Yi Yu, Keizo Oyama
Abstract	Since many online music services have emerged in recent years, effective music recommendation systems are desirable. Some common problems in recommendation systems, such as feature representation, distance measures, and cold start, are also challenges for music recommendation. In this paper, I propose a triplet neural network, exploiting both positive and negative samples to learn the representation and distance measure between users and items, to solve the recommendation task.
Tasks	Recommendation Systems
Published	2019-08-10
URL	https://arxiv.org/abs/1908.03738v1
PDF	https://arxiv.org/pdf/1908.03738v1.pdf
PWC	https://paperswithcode.com/paper/personalized-music-recommendation-with
Repo
Framework
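
The abstract above describes a triplet network that uses positive and negative samples to learn a distance between users and items. It does not state the loss, the distance, or the margin, so the sketch below uses the standard margin-based triplet loss with Euclidean distance as an assumption. The embeddings are fixed toy vectors rather than network outputs, and all names are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss (assumed form, not taken from the paper).

    The loss is zero once the positive is closer to the anchor than the
    negative by at least `margin`.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy embeddings (made up): a user, an item they liked, an item they skipped.
user = (0.0, 0.0)
liked_item = (0.0, 1.0)
skipped_item = (3.0, 4.0)

satisfied = triplet_loss(user, liked_item, skipped_item)
violated = triplet_loss(user, skipped_item, liked_item)
```

With these vectors the liked item is at distance 1 and the skipped item at distance 5, so the first triplet incurs no loss while the swapped triplet is penalized.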

Qwant Research @DEFT 2019: Document matching and information retrieval using clinical cases


Title	Qwant Research @DEFT 2019: Document matching and information retrieval using clinical cases
Authors	Estelle Maudet, Oralie Cattan, Maureen de Seyssel, Christophe Servan
Abstract	This paper reports on Qwant Research's contribution to tasks 2 and 3 of the DEFT 2019 challenge, focusing on the analysis of French clinical cases. Task 2 is a task on semantic similarity between clinical cases and discussions. For this task, we propose an approach based on language models and evaluate the impact on the results of different preprocessing and matching techniques. For task 3, we have developed an information extraction system yielding very encouraging results accuracy-wise. We have experimented with two different approaches, one based on the exclusive use of neural networks, the other based on a linguistic analysis.
Tasks	Information Retrieval, Semantic Similarity, Semantic Textual Similarity
Published	2019-07-06
URL	https://arxiv.org/abs/1907.05790v1
PDF	https://arxiv.org/pdf/1907.05790v1.pdf
PWC	https://paperswithcode.com/paper/qwant-research-deft-2019-document-matching
Repo
Framework

On the Role of Time in Learning


Title	On the Role of Time in Learning
Authors	Alessandro Betti, Marco Gori
Abstract	By and large the process of learning concepts that are embedded in time is regarded as quite a mature research topic. Hidden Markov models and recurrent neural networks are, amongst others, successful approaches to learning from temporal data. In this paper, we claim that the dominant approach minimizing appropriate risk functions defined over time by classic stochastic gradient might miss the deep interpretation of time given in other fields like physics. We show that a recent reformulation of learning according to the principle of Least Cognitive Action is better suited whenever time is involved in learning. The principle gives rise to a learning process that is driven by differential equations that can somehow describe the process within the same framework as other laws of nature.
Tasks
Published	2019-07-14
URL	https://arxiv.org/abs/1907.06198v1
PDF	https://arxiv.org/pdf/1907.06198v1.pdf
PWC	https://paperswithcode.com/paper/on-the-role-of-time-in-learning
Repo
Framework

An Efficient 3D CNN for Action/Object Segmentation in Video


Title	An Efficient 3D CNN for Action/Object Segmentation in Video
Authors	Rui Hou, Chen Chen, Rahul Sukthankar, Mubarak Shah
Abstract	Convolutional Neural Network (CNN) based image segmentation has made great progress in recent years. However, video object segmentation remains a challenging task due to its high computational complexity. Most of the previous methods employ a two-stream CNN framework to handle spatial and motion features separately. In this paper, we propose an end-to-end encoder-decoder style 3D CNN to aggregate spatial and temporal information simultaneously for video object segmentation. To efficiently process video, we propose 3D separable convolution for the pyramid pooling module and decoder, which dramatically reduces the number of operations while maintaining the performance. Moreover, we also extend our framework to video action segmentation by adding an extra classifier to predict the action label for actors in videos. Extensive experiments on several video datasets demonstrate the superior performance of the proposed approach for action and object segmentation compared to the state-of-the-art.
Tasks	action segmentation, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation, Visual Object Tracking
Published	2019-07-21
URL	https://arxiv.org/abs/1907.08895v1
PDF	https://arxiv.org/pdf/1907.08895v1.pdf
PWC	https://paperswithcode.com/paper/an-efficient-3d-cnn-for-actionobject
Repo
Framework

V2CNet: A Deep Learning Framework to Translate Videos to Commands for Robotic Manipulation


Title	V2CNet: A Deep Learning Framework to Translate Videos to Commands for Robotic Manipulation
Authors	Anh Nguyen, Thanh-Toan Do, Ian Reid, Darwin G. Caldwell, Nikos G. Tsagarakis
Abstract	We propose V2CNet, a new deep learning framework to automatically translate demonstration videos to commands that can be directly used in robotic applications. Our V2CNet has two branches and aims at understanding the demonstration video in a fine-grained manner. The first branch has the encoder-decoder architecture to encode the visual features and sequentially generate the output words as a command, while the second branch uses a Temporal Convolutional Network (TCN) to learn the fine-grained actions. By jointly training both branches, the network is able to model the sequential information of the command, while effectively encoding the fine-grained actions. The experimental results on our new large-scale dataset show that V2CNet outperforms recent state-of-the-art methods by a substantial margin, while its output can be applied in real robotic applications. The source code and trained models will be made available.
Tasks
Published	2019-03-23
URL	http://arxiv.org/abs/1903.10869v1
PDF	http://arxiv.org/pdf/1903.10869v1.pdf
PWC	https://paperswithcode.com/paper/v2cnet-a-deep-learning-framework-to-translate
Repo
Framework