Paper Group ANR 266
Graph Mining Meets Crowdsourcing: Extracting Experts for Answer Aggregation
Title | Graph Mining Meets Crowdsourcing: Extracting Experts for Answer Aggregation |
Authors | Yasushi Kawase, Yuko Kuroki, Atsushi Miyauchi |
Abstract | Aggregating responses from crowd workers is a fundamental task in the process of crowdsourcing. In cases where a few experts are overwhelmed by a large number of non-experts, most answer aggregation algorithms, such as majority voting, fail to identify the correct answers. Therefore, it is crucial to extract reliable experts from the crowd workers. In this study, we introduce the notion of “expert core”, which is a set of workers that is very unlikely to contain a non-expert. We design an efficient graph-mining-based algorithm that exactly computes the expert core. For the answer aggregation task, we propose two types of algorithms. The first one incorporates the expert core into existing answer aggregation algorithms such as majority voting, whereas the second one utilizes information provided by the expert core extraction algorithm pertaining to the reliability of workers. We then give a theoretical justification for the first type of algorithm. Computational experiments using synthetic and real-world datasets demonstrate that our proposed answer aggregation algorithms outperform state-of-the-art algorithms. |
Tasks | |
Published | 2019-05-17 |
URL | https://arxiv.org/abs/1905.08088v1 |
https://arxiv.org/pdf/1905.08088v1.pdf | |
PWC | https://paperswithcode.com/paper/graph-mining-meets-crowdsourcing-extracting |
Repo | |
Framework | |
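
The first type of algorithm described in the abstract above restricts an existing aggregator to the expert core. The following is a minimal sketch of that idea only, assuming the expert set is already known; the paper's expert core extraction is not reproduced here, and the function name `majority_vote` and the toy worker data are hypothetical.

```python
from collections import Counter


def majority_vote(answers, workers=None):
    """Aggregate answers per task by majority voting.

    answers: dict mapping worker name -> list of answers, one per task.
    workers: optional set of worker names to restrict the vote to
             (here standing in for an already-extracted expert set).
    """
    selected = [w for w in answers if workers is None or w in workers]
    num_tasks = len(answers[selected[0]])
    result = []
    for task in range(num_tasks):
        counts = Counter(answers[w][task] for w in selected)
        result.append(counts.most_common(1)[0][0])
    return result


# Toy data (hypothetical): two experts are outnumbered by three non-experts.
answers = {
    "expert_1": ["a", "b", "c"],
    "expert_2": ["a", "b", "c"],
    "novice_1": ["b", "a", "c"],
    "novice_2": ["b", "a", "a"],
    "novice_3": ["b", "a", "c"],
}

all_votes = majority_vote(answers)
expert_votes = majority_vote(answers, workers={"expert_1", "expert_2"})
```

In this toy case the unrestricted vote follows the non-experts on the first two tasks, while the vote restricted to the assumed expert set returns the experts' answers.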
The Daunting Task of Real-World Textual Style Transfer Auto-Evaluation
Title | The Daunting Task of Real-World Textual Style Transfer Auto-Evaluation |
Authors | Richard Yuanzhe Pang |
Abstract | The difficulty of textual style transfer lies in the lack of parallel corpora. Numerous advances have been proposed for unsupervised generation. However, significant problems remain with the auto-evaluation of style transfer tasks. Based on the summary of Pang and Gimpel (2018) and Mir et al. (2019), style transfer evaluations rely on three criteria: style accuracy of transferred sentences, content similarity between original and transferred sentences, and fluency of transferred sentences. We elucidate the problematic current state of style transfer research. Given that current tasks do not represent real use cases of style transfer, the current auto-evaluation approach is flawed. This discussion aims to prompt researchers to think about the future of style transfer and style transfer evaluation research. |
Tasks | Style Transfer |
Published | 2019-10-09 |
URL | https://arxiv.org/abs/1910.03747v2 |
https://arxiv.org/pdf/1910.03747v2.pdf | |
PWC | https://paperswithcode.com/paper/the-daunting-task-of-real-world-textual-style |
Repo | |
Framework | |
Approximating probabilistic models as weighted finite automata
Title | Approximating probabilistic models as weighted finite automata |
Authors | Ananda Theertha Suresh, Brian Roark, Michael Riley, Vlad Schogol |
Abstract | Weighted finite automata (WFA) are often used to represent probabilistic models, such as $n$-gram language models, since they are efficient for recognition tasks in time and space. The probabilistic source to be represented as a WFA, however, may come in many forms. Given a generic probabilistic model over sequences, we propose an algorithm to approximate it as a weighted finite automaton such that the Kullback-Leibler divergence between the source model and the WFA target model is minimized. The proposed algorithm involves a counting step and a difference of convex optimization step, both of which can be performed efficiently. We demonstrate the usefulness of our approach on various tasks, including distilling $n$-gram models from neural models, building compact language models, and building open-vocabulary character models. The algorithms used for these experiments are available in an open-source software library. |
Tasks | |
Published | 2019-05-21 |
URL | https://arxiv.org/abs/1905.08701v2 |
https://arxiv.org/pdf/1905.08701v2.pdf | |
PWC | https://paperswithcode.com/paper/approximating-probabilistic-models-as |
Repo | |
Framework | |
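
The objective named in the abstract above is the Kullback-Leibler divergence between a source model and its approximation. As a small illustration of that quantity only, the sketch below computes it for two toy distributions over a finite set of outcomes; the paper works with distributions over sequences, and the function name `kl_divergence` and the example probabilities are assumptions made here.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two distributions given as dicts outcome -> probability.

    Assumes q assigns non-zero probability wherever p does.
    """
    return sum(pi * math.log(pi / q[x]) for x, pi in p.items() if pi > 0)


# Toy distributions (hypothetical): a "source" model and a coarser approximation.
source = {"a": 0.5, "b": 0.5}
approx = {"a": 0.25, "b": 0.75}

divergence = kl_divergence(source, approx)
self_divergence = kl_divergence(source, source)
```

The divergence is zero when the approximation matches the source exactly and positive otherwise, which is why it serves as a minimization target.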
CAESAR source finder: recent developments and testing
Title | CAESAR source finder: recent developments and testing |
Authors | S. Riggi, F. Vitello, U. Becciani, C. Buemi, F. Bufano, A. Calanducci, F. Cavallaro, A. Costa, A. Ingallinera, P. Leto, S. Loru, R. P. Norris, F. Schillirò, E. Sciacca, C. Trigilio, G. Umana |
Abstract | A new era in radioastronomy will begin with the upcoming large-scale surveys planned at the Australian Square Kilometre Array Pathfinder (ASKAP). ASKAP started its Early Science program in October 2017 and several target fields were observed during the array commissioning phase. The SCORPIO field was the first observed in the Galactic Plane in Band 1 (792-1032 MHz) using 15 commissioned antennas. The achieved sensitivity and large field of view already allow the discovery of new sources and the survey of thousands of existing ones with improved precision with respect to previous surveys. Data analysis is currently ongoing to deliver the first source catalogue. Given the increased scale of the data, source extraction and characterization, even in this Early Science phase, have to be carried out in a mostly automated way. This process presents significant challenges due to the presence of extended objects and diffuse emission close to the Galactic Plane. In this context we have extended and optimized a novel source finding tool, named CAESAR, to allow extraction of both compact and extended sources from radio maps. A number of developments have been made, driven by the analysis of the SCORPIO map and in view of the future ASKAP Galactic Plane survey. The main goals are the improvement of algorithm performance and scalability as well as of software maintainability and usability within the radio community. In this paper we present the current status of CAESAR and report a first systematic characterization of its performance for both compact and extended sources using simulated maps. Future prospects are discussed in light of the obtained results. |
Tasks | |
Published | 2019-09-13 |
URL | https://arxiv.org/abs/1909.06116v1 |
https://arxiv.org/pdf/1909.06116v1.pdf | |
PWC | https://paperswithcode.com/paper/caesar-source-finder-recent-developments-and |
Repo | |
Framework | |
Analysis of “User-Specific Effect” and Impact of Operator Skills on Fingerprint PAD Systems
Title | Analysis of “User-Specific Effect” and Impact of Operator Skills on Fingerprint PAD Systems |
Authors | Giulia Orrù, Pierluigi Tuveri, Luca Ghiani, Gian Luca Marcialis |
Abstract | Fingerprint liveness detection, or presentation attack detection (PAD), that is, the ability to detect whether a fingerprint submitted to an electronic capture device is authentic or made of some artificial material, has attracted the attention of the scientific community, and machine learning approaches based on deep networks have recently opened novel scenarios. A significant step forward was made thanks to the public availability of large sets of data; in particular, the ones released during the International Fingerprint Liveness Detection Competition (LivDet). Among others, the fifth edition, held in 2017, posed two additional challenges to the participants that were not detailed in the official report. In this paper, we want to extend that report by focusing on them: the first one was aimed at exploring the case in which the PAD is integrated into a fingerprint verification system, where templates of users are available too and the designer is not constrained to refer only to a generic user population for the PAD settings. The second one deals with the attackers' ability to exploit the provided fakes, and how this ability affects the final performance. These two challenges together may establish to what extent fingerprint presentation attacks are an actual threat and how to exploit additional information to make the PAD more effective. |
Tasks | |
Published | 2019-07-18 |
URL | https://arxiv.org/abs/1907.08068v1 |
https://arxiv.org/pdf/1907.08068v1.pdf | |
PWC | https://paperswithcode.com/paper/analysis-of-user-specific-effect-and-impact |
Repo | |
Framework | |
Real-Time Highly Accurate Dense Depth on a Power Budget using an FPGA-CPU Hybrid SoC
Title | Real-Time Highly Accurate Dense Depth on a Power Budget using an FPGA-CPU Hybrid SoC |
Authors | Oscar Rahnama, Tommaso Cavallari, Stuart Golodetz, Alessio Tonioni, Thomas Joy, Luigi Di Stefano, Simon Walker, Philip H. S. Torr |
Abstract | Obtaining highly accurate depth from stereo images in real time has many applications across computer vision and robotics, but in some contexts, upper bounds on power consumption constrain the feasible hardware to embedded platforms such as FPGAs. Whilst various stereo algorithms have been deployed on these platforms, usually cut down to better match the embedded architecture, certain key parts of the more advanced algorithms, e.g. those that rely on unpredictable access to memory or are highly iterative in nature, are difficult to deploy efficiently on FPGAs, and thus the depth quality that can be achieved is limited. In this paper, we leverage an FPGA-CPU chip to propose a novel, sophisticated, stereo approach that combines the best features of SGM and ELAS-based methods to compute highly accurate dense depth in real time. Our approach achieves an 8.7% error rate on the challenging KITTI 2015 dataset at over 50 FPS, with a power consumption of only 5W. |
Tasks | |
Published | 2019-07-17 |
URL | https://arxiv.org/abs/1907.07745v1 |
https://arxiv.org/pdf/1907.07745v1.pdf | |
PWC | https://paperswithcode.com/paper/real-time-highly-accurate-dense-depth-on-a |
Repo | |
Framework | |
Machine Learning for Precipitation Nowcasting from Radar Images
Title | Machine Learning for Precipitation Nowcasting from Radar Images |
Authors | Shreya Agrawal, Luke Barrington, Carla Bromberg, John Burge, Cenk Gazen, Jason Hickey |
Abstract | High-resolution nowcasting is an essential tool needed for effective adaptation to climate change, particularly for extreme weather. As Deep Learning (DL) techniques have shown dramatic promise in many domains, including the geosciences, we present an application of DL to the problem of precipitation nowcasting, i.e., high-resolution (1 km x 1 km) short-term (1 hour) predictions of precipitation. We treat forecasting as an image-to-image translation problem and leverage the power of the ubiquitous UNET convolutional neural network. We find this performs favorably when compared to three commonly used models: optical flow, persistence and NOAA’s numerical one-hour HRRR nowcasting prediction. |
Tasks | Image-to-Image Translation, Optical Flow Estimation |
Published | 2019-12-11 |
URL | https://arxiv.org/abs/1912.12132v1 |
https://arxiv.org/pdf/1912.12132v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-for-precipitation-nowcasting |
Repo | |
Framework | |
Representation Learning with Multisets
Title | Representation Learning with Multisets |
Authors | Vasco Portilheiro |
Abstract | We study the problem of learning permutation invariant representations that can capture “flexible” notions of containment. We formalize this problem via a measure theoretic definition of multisets, and obtain a theoretically-motivated learning model. We propose training this model on a novel task: predicting the size of the symmetric difference (or intersection) between pairs of multisets. We demonstrate that our model not only performs very well on predicting containment relations (and more effectively predicts the sizes of symmetric differences and intersections than DeepSets-based approaches with unconstrained object representations), but that it also learns meaningful representations. |
Tasks | Representation Learning |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08577v1 |
https://arxiv.org/pdf/1911.08577v1.pdf | |
PWC | https://paperswithcode.com/paper/representation-learning-with-multisets-1 |
Repo | |
Framework | |
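
The training task described in the abstract above is predicting the size of the symmetric difference (or intersection) between pairs of multisets. The sketch below only shows how those target quantities can be computed exactly for small multisets with Python's `collections.Counter`; it is not the paper's learning model, and the helper names and example multisets are hypothetical.

```python
from collections import Counter


def symmetric_difference_size(a, b):
    """Number of elements, counted with multiplicity, in exactly one multiset."""
    # Counter subtraction keeps only positive counts, so each side is one-directional.
    return sum(((a - b) + (b - a)).values())


def intersection_size(a, b):
    """Number of elements, counted with multiplicity, shared by both multisets."""
    return sum((a & b).values())


# Toy multisets (hypothetical): {a, a, b, c} and {a, b, b, d}.
first = Counter("aabc")
second = Counter("abbd")

sym_diff = symmetric_difference_size(first, second)
shared = intersection_size(first, second)
```

For these two multisets the symmetric difference has four elements (one extra `a`, one `c`, one extra `b`, one `d`) and the intersection has two (one `a`, one `b`).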
Cross-modal supervised learning for better acoustic representations
Title | Cross-modal supervised learning for better acoustic representations |
Authors | Shaoyong Jia, Xin Shu, Yang Yang, Dawei Liang, Qiyue Liu, Junhui Liu |
Abstract | Obtaining large-scale human-labeled datasets to train acoustic representation models is a very challenging task. In contrast, we can easily collect data with machine-generated labels. In this work, we propose to exploit machine-generated labels to learn better acoustic representations, based on the synchronization between vision and audio. Firstly, we collect a large-scale video dataset with 15 million samples, totaling 16,320 hours. Each video is 3 to 5 seconds in length and annotated automatically by publicly available visual and audio classification models. Secondly, we train various classical convolutional neural networks (CNNs) including VGGish, ResNet 50 and Mobilenet v2. We also make several improvements to VGGish and achieve better results. Finally, we transfer our models to three external standard benchmarks for the audio classification task, and achieve a significant performance boost over the state-of-the-art results. Models and code are available at: https://github.com/Deeperjia/vgg-like-audio-models. |
Tasks | Audio Classification |
Published | 2019-11-15 |
URL | https://arxiv.org/abs/1911.07917v2 |
https://arxiv.org/pdf/1911.07917v2.pdf | |
PWC | https://paperswithcode.com/paper/cross-modal-supervised-learning-for-better |
Repo | |
Framework | |
Model-Agnostic Counterfactual Explanations for Consequential Decisions
Title | Model-Agnostic Counterfactual Explanations for Consequential Decisions |
Authors | Amir-Hossein Karimi, Gilles Barthe, Borja Balle, Isabel Valera |
Abstract | Predictive models are being increasingly used to support consequential decision making at the individual level in contexts such as pretrial bail and loan approval. As a result, there is increasing social and legal pressure to provide explanations that help the affected individuals not only to understand why a prediction was output, but also how to act to obtain a desired outcome. To this end, several works have proposed optimization-based methods to generate nearest counterfactual explanations. However, these methods are often restricted to a particular subset of models (e.g., decision trees or linear models) and differentiable distance functions. In contrast, we build on standard theory and tools from formal verification and propose a novel algorithm that solves a sequence of satisfiability problems, where both the distance function (objective) and predictive model (constraints) are represented as logic formulae. As shown by our experiments on real-world data, our algorithm is: i) model-agnostic ((non-)linear, (non-)differentiable, (non-)convex); ii) data-type-agnostic (heterogeneous features); iii) distance-agnostic ($\ell_0, \ell_1, \ell_\infty$, and combinations thereof); iv) able to generate plausible and diverse counterfactuals for any sample (i.e., 100% coverage); and v) at provably optimal distances. |
Tasks | Decision Making |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.11190v5 |
https://arxiv.org/pdf/1905.11190v5.pdf | |
PWC | https://paperswithcode.com/paper/model-agnostic-counterfactual-explanations |
Repo | |
Framework | |
Personalized Music Recommendation with Triplet Network
Title | Personalized Music Recommendation with Triplet Network |
Authors | Haoting Liang, Donghuo Zeng, Yi Yu, Keizo Oyama |
Abstract | Many online music services have emerged in recent years, making effective music recommendation systems desirable. Common problems in recommendation systems, such as feature representation, distance measures and cold start, are also challenges for music recommendation. In this paper, we propose a triplet neural network, exploiting both positive and negative samples to learn the representation and distance measure between users and items, to solve the recommendation task. |
Tasks | Recommendation Systems |
Published | 2019-08-10 |
URL | https://arxiv.org/abs/1908.03738v1 |
https://arxiv.org/pdf/1908.03738v1.pdf | |
PWC | https://paperswithcode.com/paper/personalized-music-recommendation-with |
Repo | |
Framework | |
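
The abstract above describes a triplet network that uses positive and negative samples to learn a distance between users and items. The abstract does not state the loss, so the sketch below uses the standard hinge-style triplet loss as an assumption; the function names, the margin value, and the toy embeddings are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard hinge triplet loss (assumed form, not confirmed by the source).

    The loss is zero once the positive item is closer to the anchor than
    the negative item by at least `margin`.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy embeddings (hypothetical): a user, a liked track, and a disliked track.
user = [0.0, 0.0]
liked = [1.0, 0.0]
disliked = [3.0, 4.0]

well_separated = triplet_loss(user, liked, disliked)
violated = triplet_loss(user, disliked, liked)
```

With these vectors the liked track is at distance 1 and the disliked track at distance 5, so the first call incurs no loss and the swapped call is penalized.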
Qwant Research @DEFT 2019: Document matching and information retrieval using clinical cases
Title | Qwant Research @DEFT 2019: Document matching and information retrieval using clinical cases |
Authors | Estelle Maudet, Oralie Cattan, Maureen de Seyssel, Christophe Servan |
Abstract | This paper reports on Qwant Research's contribution to tasks 2 and 3 of the DEFT 2019 challenge, focusing on the analysis of French clinical cases. Task 2 is a task on semantic similarity between clinical cases and discussions. For this task, we propose an approach based on language models and evaluate the impact of different preprocessing and matching techniques on the results. For task 3, we have developed an information extraction system yielding very encouraging results accuracy-wise. We experimented with two different approaches, one based on the exclusive use of neural networks, the other based on a linguistic analysis. |
Tasks | Information Retrieval, Semantic Similarity, Semantic Textual Similarity |
Published | 2019-07-06 |
URL | https://arxiv.org/abs/1907.05790v1 |
https://arxiv.org/pdf/1907.05790v1.pdf | |
PWC | https://paperswithcode.com/paper/qwant-research-deft-2019-document-matching |
Repo | |
Framework | |
On the Role of Time in Learning
Title | On the Role of Time in Learning |
Authors | Alessandro Betti, Marco Gori |
Abstract | By and large, the process of learning concepts that are embedded in time is regarded as quite a mature research topic. Hidden Markov models and recurrent neural networks are, amongst others, successful approaches to learning from temporal data. In this paper, we claim that the dominant approach, minimizing appropriate risk functions defined over time by classic stochastic gradient, might miss the deep interpretation of time given in other fields like physics. We show that a recent reformulation of learning according to the principle of Least Cognitive Action is better suited whenever time is involved in learning. The principle gives rise to a learning process that is driven by differential equations, which can describe the process within the same framework as other laws of nature. |
Tasks | |
Published | 2019-07-14 |
URL | https://arxiv.org/abs/1907.06198v1 |
https://arxiv.org/pdf/1907.06198v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-role-of-time-in-learning |
Repo | |
Framework | |
An Efficient 3D CNN for Action/Object Segmentation in Video
Title | An Efficient 3D CNN for Action/Object Segmentation in Video |
Authors | Rui Hou, Chen Chen, Rahul Sukthankar, Mubarak Shah |
Abstract | Convolutional Neural Network (CNN) based image segmentation has made great progress in recent years. However, video object segmentation remains a challenging task due to its high computational complexity. Most of the previous methods employ a two-stream CNN framework to handle spatial and motion features separately. In this paper, we propose an end-to-end encoder-decoder style 3D CNN to aggregate spatial and temporal information simultaneously for video object segmentation. To efficiently process video, we propose 3D separable convolution for the pyramid pooling module and decoder, which dramatically reduces the number of operations while maintaining the performance. Moreover, we also extend our framework to video action segmentation by adding an extra classifier to predict the action label for actors in videos. Extensive experiments on several video datasets demonstrate the superior performance of the proposed approach for action and object segmentation compared to the state-of-the-art. |
Tasks | action segmentation, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation, Visual Object Tracking |
Published | 2019-07-21 |
URL | https://arxiv.org/abs/1907.08895v1 |
https://arxiv.org/pdf/1907.08895v1.pdf | |
PWC | https://paperswithcode.com/paper/an-efficient-3d-cnn-for-actionobject |
Repo | |
Framework | |
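
The abstract above states that 3D separable convolution dramatically reduces the number of operations. The abstract does not say which factorization is used, so the arithmetic sketch below assumes a depthwise-separable form (a per-channel k×k×k convolution followed by a 1×1×1 pointwise convolution) and compares weight counts only; the function names and layer sizes are hypothetical.

```python
def standard_conv3d_weights(kernel, in_channels, out_channels):
    """Weights in a dense 3D convolution with a cubic kernel (bias ignored)."""
    return kernel ** 3 * in_channels * out_channels


def depthwise_separable_conv3d_weights(kernel, in_channels, out_channels):
    """Weights in an assumed depthwise + pointwise factorization (bias ignored)."""
    depthwise = kernel ** 3 * in_channels      # one k*k*k filter per input channel
    pointwise = in_channels * out_channels     # 1x1x1 convolution mixing channels
    return depthwise + pointwise


# Hypothetical layer: 3x3x3 kernel, 64 input channels, 128 output channels.
standard = standard_conv3d_weights(3, 64, 128)
separable = depthwise_separable_conv3d_weights(3, 64, 128)
reduction = standard / separable
```

Under these assumptions the dense layer has 221,184 weights and the separable one 9,920, roughly a 22-fold reduction; the multiply-accumulate count per output position scales the same way.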
V2CNet: A Deep Learning Framework to Translate Videos to Commands for Robotic Manipulation
Title | V2CNet: A Deep Learning Framework to Translate Videos to Commands for Robotic Manipulation |
Authors | Anh Nguyen, Thanh-Toan Do, Ian Reid, Darwin G. Caldwell, Nikos G. Tsagarakis |
Abstract | We propose V2CNet, a new deep learning framework to automatically translate demonstration videos to commands that can be directly used in robotic applications. Our V2CNet has two branches and aims at understanding the demonstration video in a fine-grained manner. The first branch has an encoder-decoder architecture to encode the visual features and sequentially generate the output words as a command, while the second branch uses a Temporal Convolutional Network (TCN) to learn the fine-grained actions. By jointly training both branches, the network is able to model the sequential information of the command, while effectively encoding the fine-grained actions. The experimental results on our new large-scale dataset show that V2CNet outperforms recent state-of-the-art methods by a substantial margin, while its output can be applied in real robotic applications. The source code and trained models will be made available. |
Tasks | |
Published | 2019-03-23 |
URL | http://arxiv.org/abs/1903.10869v1 |
http://arxiv.org/pdf/1903.10869v1.pdf | |
PWC | https://paperswithcode.com/paper/v2cnet-a-deep-learning-framework-to-translate |
Repo | |
Framework | |