January 29, 2020

3021 words 15 mins read

Paper Group ANR 643

Handling Divergent Reference Texts when Evaluating Table-to-Text Generation

Title Handling Divergent Reference Texts when Evaluating Table-to-Text Generation
Authors Bhuwan Dhingra, Manaal Faruqui, Ankur Parikh, Ming-Wei Chang, Dipanjan Das, William W. Cohen
Abstract Automatically constructed datasets for generating text from semi-structured data (tables), such as WikiBio, often contain reference texts that diverge from the information in the corresponding semi-structured data. We show that metrics which rely solely on the reference texts, such as BLEU and ROUGE, show poor correlation with human judgments when those references diverge. We propose a new metric, PARENT, which aligns n-grams from the reference and generated texts to the semi-structured data before computing their precision and recall. Through a large-scale human evaluation study of table-to-text models for WikiBio, we show that PARENT correlates with human judgments better than existing text generation metrics. We also adapt and evaluate the information extraction based evaluation proposed by Wiseman et al. (2017), and show that PARENT has comparable correlation to it, while being easier to use. We show that PARENT is also applicable when the reference texts are elicited from humans using the data from the WebNLG challenge.
Tasks Table-to-Text Generation, Text Generation
Published 2019-06-03
URL https://arxiv.org/abs/1906.01081v1
PDF https://arxiv.org/pdf/1906.01081v1.pdf
PWC https://paperswithcode.com/paper/handling-divergent-reference-texts-when
Repo
Framework
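
PARENT's key move is to credit n-grams that are supported by the table even when they are missing from the reference. Below is a much-simplified, unigram-level sketch of that entailed-precision idea, using plain word overlap as the entailment test (the simplest of the variants the paper considers); the table, reference, and generated strings are invented, and this is not the authors' implementation.

```python
from collections import Counter

def entailed(word, table_values):
    # Simplest entailment test: a word is supported by the table
    # if it occurs in any attribute value.
    return any(word in value.split() for value in table_values)

def parent_precision(generated, reference, table_values):
    """Unigram sketch of PARENT's entailed precision: a generated token
    counts as correct if it appears in the reference OR is supported by
    the table, so table-faithful text is not punished for divergence."""
    tokens = generated.split()
    ref_counts = Counter(reference.split())
    correct = sum(1 for w in tokens if ref_counts[w] > 0 or entailed(w, table_values))
    return correct / len(tokens) if tokens else 0.0

table = ["michael dahlquist", "december 22 1965", "drummer"]
reference = "michael dahlquist was a drummer in the band silkworm"
generated = "michael dahlquist born december 22 1965 was a drummer"
print(parent_precision(generated, reference, table))  # the dates get credit via the table
```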

Characterizing Activity on the Deep and Dark Web

Title Characterizing Activity on the Deep and Dark Web
Authors Nazgol Tavabi, Nathan Bartley, Andrés Abeliuk, Sandeep Soni, Emilio Ferrara, Kristina Lerman
Abstract The deep and dark web (d2web) refers to limited-access web sites that require registration, authentication, or more complex encryption protocols to access them. These web sites serve as hubs for a variety of illicit activities: trading drugs, stolen user credentials, and hacking tools, and coordinating attacks and manipulation campaigns. Despite its importance to cyber crime, the d2web has not been systematically investigated. In this paper, we study a large corpus of messages posted to 80 d2web forums over a period of more than a year. We identify topics of discussion using LDA and use a non-parametric HMM to model the evolution of topics across forums. Then, we examine the dynamic patterns of discussion and identify forums with similar patterns. We show that our approach surfaces hidden similarities across different forums and can help identify anomalous events in this rich, heterogeneous data.
Tasks
Published 2019-03-01
URL http://arxiv.org/abs/1903.00156v1
PDF http://arxiv.org/pdf/1903.00156v1.pdf
PWC https://paperswithcode.com/paper/characterizing-activity-on-the-deep-and-dark
Repo
Framework
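
As a rough illustration of the first stage of this pipeline, the sketch below fits LDA topics over a handful of invented forum posts with scikit-learn. The per-document topic mixtures it prints are the kind of signal the paper then tracks over time with a non-parametric HMM, which is not shown here.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical placeholder posts standing in for d2web forum messages.
posts = [
    "selling fresh credentials and dumps",
    "new exploit kit release discussion",
    "vendor review for shipped product",
]
vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(posts)

lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
# Per-document topic mixtures; tracking these over time per forum is the
# input to the paper's HMM stage.
print(lda.transform(X))
```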

CUNI System for the WMT19 Robustness Task

Title CUNI System for the WMT19 Robustness Task
Authors Jindřich Helcl, Jindřich Libovický, Martin Popel
Abstract We present our submission to the WMT19 Robustness Task. Our baseline system is the Charles University (CUNI) Transformer system trained for the WMT18 shared task on News Translation. Quantitative results show that the CUNI Transformer system is already far more robust to noisy input than the LSTM-based baseline provided by the task organizers. We further improved the performance of our model by fine-tuning on the in-domain noisy data without influencing the translation quality on the news domain.
Tasks
Published 2019-06-21
URL https://arxiv.org/abs/1906.09246v1
PDF https://arxiv.org/pdf/1906.09246v1.pdf
PWC https://paperswithcode.com/paper/cuni-system-for-the-wmt19-robustness-task
Repo
Framework

Literature Review of Action Recognition in the Wild

Title Literature Review of Action Recognition in the Wild
Authors Asket Kaur, Navya Rao, Tanya Joon
Abstract This literature review on action recognition in the wild is an in-depth study of research papers. Action recognition in untrimmed videos is a challenging task, and most papers have tackled it using either hand-crafted features with shallow learning techniques or sophisticated end-to-end deep learning techniques.
Tasks
Published 2019-11-27
URL https://arxiv.org/abs/1911.12249v1
PDF https://arxiv.org/pdf/1911.12249v1.pdf
PWC https://paperswithcode.com/paper/literature-review-of-action-recognition-in
Repo
Framework

Learned Video Compression via Joint Spatial-Temporal Correlation Exploration

Title Learned Video Compression via Joint Spatial-Temporal Correlation Exploration
Authors Haojie Liu, Han Shen, Lichao Huang, Ming Lu, Tong Chen, Zhan Ma
Abstract Traditional video compression technologies have been developed over decades in pursuit of higher coding efficiency. Efficient temporal information representation plays a key role in video coding. Thus, in this paper, we propose to exploit the temporal correlation using both first-order optical flow and second-order flow prediction. We suggest a one-stage learning approach to encapsulate flow as quantized features from consecutive frames, which are then entropy coded with adaptive contexts conditioned on joint spatial-temporal priors to exploit second-order correlations. Joint priors are embedded in autoregressive spatial neighbors, co-located hyper elements, and temporal neighbors using a ConvLSTM recurrently. We evaluate our approach in the low-delay scenario against High-Efficiency Video Coding (H.265/HEVC), H.264/AVC, and another learned video compression method, following the common test settings. Our work offers state-of-the-art performance, with consistent gains across all popular test sequences.
Tasks Optical Flow Estimation, Video Compression
Published 2019-12-13
URL https://arxiv.org/abs/1912.06348v1
PDF https://arxiv.org/pdf/1912.06348v1.pdf
PWC https://paperswithcode.com/paper/learned-video-compression-via-joint-spatial
Repo
Framework
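
For readers unfamiliar with flow-based temporal prediction, the sketch below shows the generic backward-warping step such codecs build on: the previous frame is resampled along a dense motion field. This is standard machinery, not the paper's learned flow or entropy model.

```python
import torch
import torch.nn.functional as F

def warp(frame, flow):
    """Backward-warp a frame with a dense flow field (pixels): the standard
    temporal-prediction primitive in flow-based video codecs."""
    n, _, h, w = frame.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=-1).float().unsqueeze(0) + flow.permute(0, 2, 3, 1)
    # Normalize pixel coordinates to [-1, 1] for grid_sample.
    grid[..., 0] = 2 * grid[..., 0] / (w - 1) - 1
    grid[..., 1] = 2 * grid[..., 1] / (h - 1) - 1
    return F.grid_sample(frame, grid, align_corners=True)

prev = torch.rand(1, 3, 64, 64)
flow = torch.zeros(1, 2, 64, 64)  # zero motion should reproduce the frame
print(torch.allclose(warp(prev, flow), prev, atol=1e-5))
```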

Online PCB Defect Detector On A New PCB Defect Dataset

Title Online PCB Defect Detector On A New PCB Defect Dataset
Authors Sanli Tang, Fan He, Xiaolin Huang, Jie Yang
Abstract Previous works on PCB defect detection based on image difference and image processing techniques have already achieved promising performance. However, they sometimes fall short because of unaccounted-for defect patterns or over-sensitivity to some hyper-parameters. In this work, we design a deep model that accurately detects PCB defects from an input pair of a defect-free template and a defective tested image. A novel group pyramid pooling module is proposed to efficiently extract features at a large range of resolutions, which are merged by group to predict PCB defects at corresponding scales. To train the deep model, a dataset is established, namely DeepPCB, which contains 1,500 image pairs with annotations including the positions of 6 common types of PCB defects. Experiment results validate the effectiveness and efficiency of the proposed model, which achieves 98.6% mAP @ 62 FPS on the DeepPCB dataset. This dataset is now available at: https://github.com/tangsanli5201/DeepPCB.
Tasks
Published 2019-02-17
URL http://arxiv.org/abs/1902.06197v1
PDF http://arxiv.org/pdf/1902.06197v1.pdf
PWC https://paperswithcode.com/paper/online-pcb-defect-detector-on-a-new-pcb
Repo
Framework
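
The sketch below is a loose guess at what a group pyramid pooling module could look like in PyTorch: the input is average-pooled at several output resolutions, each channel group summarizes one scale, and the groups are upsampled and concatenated. The scales and the 1x1 fusion convolutions are assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GroupPyramidPooling(nn.Module):
    """Loose sketch of group pyramid pooling: pool at several resolutions,
    project each pooled map to a channel group, upsample, and concatenate."""
    def __init__(self, in_channels, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        self.convs = nn.ModuleList(
            nn.Conv2d(in_channels, in_channels // len(scales), 1) for _ in scales
        )

    def forward(self, x):
        h, w = x.shape[2:]
        groups = []
        for scale, conv in zip(self.scales, self.convs):
            pooled = F.adaptive_avg_pool2d(x, scale)  # coarse summary at this scale
            groups.append(F.interpolate(conv(pooled), size=(h, w),
                                        mode="bilinear", align_corners=False))
        return torch.cat(groups, dim=1)

feat = torch.randn(1, 96, 32, 32)
print(GroupPyramidPooling(96)(feat).shape)  # torch.Size([1, 96, 32, 32])
```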

Variable Rate Deep Image Compression with Modulated Autoencoder

Title Variable Rate Deep Image Compression with Modulated Autoencoder
Authors Fei Yang, Luis Herranz, Joost van de Weijer, José A. Iglesias Guitián, Antonio López, Mikhail Mozerov
Abstract Variable rate is a requirement for flexible and adaptable image and video compression. However, deep image compression methods are optimized for a single fixed rate-distortion tradeoff. While this can be addressed by training multiple models for different tradeoffs, the memory requirements increase proportionally to the number of models. Scaling the bottleneck representation of a shared autoencoder can provide variable rate compression with a single shared autoencoder. However, the R-D performance of this simple mechanism degrades at low bit rates, and the effective range of bit rates shrinks. Addressing these limitations, we formulate the problem of variable rate-distortion optimization for deep image compression, and propose modulated autoencoders (MAEs), where the representations of a shared autoencoder are adapted to the specific rate-distortion tradeoff via a modulation network. Jointly training this modulated autoencoder and modulation network provides an effective way to navigate the R-D operational curve. Our experiments show that the proposed method can achieve almost the same R-D performance as independent models, with significantly fewer parameters.
Tasks Image Compression, Video Compression
Published 2019-12-11
URL https://arxiv.org/abs/1912.05526v1
PDF https://arxiv.org/pdf/1912.05526v1.pdf
PWC https://paperswithcode.com/paper/variable-rate-deep-image-compression-with
Repo
Framework
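
A minimal sketch of the modulation idea, under the assumption that the modulation network maps a scalar rate-distortion tradeoff to per-channel scaling factors applied to the shared bottleneck; the layer sizes and the Softplus choice are illustrative, not the paper's specification.

```python
import torch
import torch.nn as nn

class ModulationNet(nn.Module):
    """Sketch: map a scalar R-D tradeoff lambda to positive per-channel
    scales for the shared encoder's bottleneck representation."""
    def __init__(self, num_channels, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(),
            nn.Linear(hidden, num_channels), nn.Softplus(),  # positive scales
        )

    def forward(self, bottleneck, lam):
        # bottleneck: (N, C, H, W); lam: (N, 1)
        scale = self.net(lam).unsqueeze(-1).unsqueeze(-1)
        return bottleneck * scale  # modulated representation for this tradeoff

z = torch.randn(2, 128, 8, 8)
lam = torch.tensor([[0.01], [0.1]])  # two different R-D operating points
print(ModulationNet(128)(z, lam).shape)  # torch.Size([2, 128, 8, 8])
```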

Hangul Fonts Dataset: a Hierarchical and Compositional Dataset for Interrogating Learned Representations

Title Hangul Fonts Dataset: a Hierarchical and Compositional Dataset for Interrogating Learned Representations
Authors Jesse A. Livezey, Ahyeon Hwang, Kristofer E. Bouchard
Abstract Interpretable representations of data are useful for testing a hypothesis or to distinguish between multiple potential hypotheses about the data. In contrast, applied machine learning, and specifically deep learning (DL), is often used in contexts where performance is valued over interpretability. Indeed, deep networks (DNs) are often treated as “black boxes”, and it is not well understood what and how they learn from a given dataset. This lack of understanding seriously hinders adoption of DNs as data analysis tools in science and poses numerous research questions. One problem is that current deep learning research datasets either have very little hierarchical structure or are too complex for their structure to be analyzed, impeding precise predictions of hierarchical representations. To address this gap, we present a benchmark dataset with known hierarchical and compositional structure and a set of methods for performing hypothesis-driven data analysis using DNs. The Hangul Fonts Dataset is composed of 35 fonts, each with 11,172 written syllables consisting of 19 initial consonants, 21 medial vowels, and 28 final consonants. The rules for combining and modifying individual Hangul characters into blocks can be encoded, with translation, scaling, and style variation that depend on precise block content, as well as naturalistic variation across fonts. Thus, the Hangul Fonts Dataset will provide an intermediate complexity dataset with well-defined, hierarchical features to interrogate learned representations. We first present a summary of the structure of the dataset. Using a set of unsupervised and supervised methods, we find that deep network representations contain structure related to the geometrical hierarchy of the characters. Our results lay the foundation for a better understanding of what deep networks learn from complex, structured datasets.
Tasks
Published 2019-05-23
URL https://arxiv.org/abs/1905.13308v1
PDF https://arxiv.org/pdf/1905.13308v1.pdf
PWC https://paperswithcode.com/paper/190513308
Repo
Framework
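
The combination rule the abstract refers to matches the standard Unicode Hangul syllable formula, which is a convenient way to see where the 11,172 = 19 x 21 x 28 count comes from. The snippet below composes and decomposes syllable blocks with that formula; this is general Unicode arithmetic, not code from the dataset.

```python
# 19 initial consonants x 21 medial vowels x 28 final slots = 11,172 blocks
# (final index 0 means no final consonant).
NUM_INITIAL, NUM_MEDIAL, NUM_FINAL = 19, 21, 28
assert NUM_INITIAL * NUM_MEDIAL * NUM_FINAL == 11172

def compose(initial, medial, final=0):
    """Compose a Hangul syllable block from its component indices."""
    return chr(0xAC00 + (initial * NUM_MEDIAL + medial) * NUM_FINAL + final)

def decompose(syllable):
    """Recover the (initial, medial, final) indices of a syllable block."""
    offset = ord(syllable) - 0xAC00
    return (offset // (NUM_MEDIAL * NUM_FINAL),
            (offset // NUM_FINAL) % NUM_MEDIAL,
            offset % NUM_FINAL)

print(compose(18, 0, 4))  # '한'
print(decompose("글"))    # (0, 18, 8)
```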

Constrained Multi-Objective Optimization for Automated Machine Learning

Title Constrained Multi-Objective Optimization for Automated Machine Learning
Authors Steven Gardner, Oleg Golovidov, Joshua Griffin, Patrick Koch, Wayne Thompson, Brett Wujek, Yan Xu
Abstract Automated machine learning has gained a lot of attention recently. Building and selecting the right machine learning models is often a multi-objective optimization problem. General purpose machine learning software that simultaneously supports multiple objectives and constraints is scant, though the potential benefits are great. In this work, we present a framework called Autotune that effectively handles multiple objectives and constraints that arise in machine learning problems. Autotune is built on a suite of derivative-free optimization methods, and utilizes multi-level parallelism in a distributed computing environment for automatically training, scoring, and selecting good models. Incorporation of multiple objectives and constraints in the model exploration and selection process provides the flexibility needed to satisfy trade-offs necessary in practical machine learning applications. Experimental results from standard multi-objective optimization benchmark problems show that Autotune is very efficient in capturing Pareto fronts. These benchmark results also show how adding constraints can guide the search to more promising regions of the solution space, ultimately producing more desirable Pareto fronts. Results from two real-world case studies demonstrate the effectiveness of the constrained multi-objective optimization capability offered by Autotune.
Tasks
Published 2019-08-14
URL https://arxiv.org/abs/1908.04909v1
PDF https://arxiv.org/pdf/1908.04909v1.pdf
PWC https://paperswithcode.com/paper/constrained-multi-objective-optimization-for
Repo
Framework
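
To make the selection step concrete, here is a small sketch of constrained Pareto filtering over candidate models with two objectives to minimize. The candidate tuples and the latency constraint are invented, and Autotune's actual internals are not described at this level in the abstract.

```python
def dominates(q, p):
    """q dominates p if it is no worse in both objectives and strictly
    better in at least one (both objectives minimized)."""
    return q[0] <= p[0] and q[1] <= p[1] and (q[0] < p[0] or q[1] < p[1])

def constrained_pareto_front(points, feasible=lambda p: True):
    # Drop infeasible points first, then keep the non-dominated ones.
    candidates = [p for p in points if feasible(p)]
    return [p for p in candidates
            if not any(dominates(q, p) for q in candidates)]

# (error, inference_ms) for hypothetical candidate models, constrained to
# models that run in under 50 ms.
models = [(0.10, 40.0), (0.08, 60.0), (0.12, 20.0), (0.10, 30.0)]
print(constrained_pareto_front(models, feasible=lambda p: p[1] < 50.0))
# -> [(0.12, 20.0), (0.10, 30.0)]
```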

Deep Network Embedding for Graph Representation Learning in Signed Networks

Title Deep Network Embedding for Graph Representation Learning in Signed Networks
Authors Xiao Shen, Fu-Lai Chung
Abstract Network embedding has attracted increasing attention over the past few years. As an effective approach to solving graph mining problems, network embedding aims to learn a low-dimensional feature vector representation for each node of a given network. The vast majority of existing network embedding algorithms, however, are designed only for unsigned networks, while signed networks, which contain both positive and negative links, have quite distinct properties from their unsigned counterparts. In this paper, we propose a deep network embedding model to learn low-dimensional node vector representations with structural balance preservation for signed networks. The model employs a semi-supervised stacked auto-encoder to reconstruct the adjacency connections of a given signed network. As the adjacency connections are overwhelmingly positive in real-world signed networks, we impose a larger penalty to make the auto-encoder focus more on reconstructing the scarce negative links than the abundant positive links. In addition, to preserve the structural balance property of signed networks, we design pairwise constraints to make positively connected nodes much closer than negatively connected nodes in the embedding space. Based on the network representations learned by the proposed model, we conduct link sign prediction and community detection in signed networks. Extensive experimental results on real-world datasets demonstrate the superiority of the proposed model over state-of-the-art network embedding algorithms for graph representation learning in signed networks.
Tasks Community Detection, Graph Representation Learning, Link Sign Prediction, Network Embedding, Representation Learning
Published 2019-01-07
URL http://arxiv.org/abs/1901.01718v1
PDF http://arxiv.org/pdf/1901.01718v1.pdf
PWC https://paperswithcode.com/paper/deep-network-embedding-for-graph
Repo
Framework
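
Two of the ingredients described above are easy to sketch: a reconstruction loss that reweights the scarce negative links, and a margin constraint pushing positively linked neighbors closer than negatively linked ones. The weight and margin values below are illustrative assumptions, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def weighted_reconstruction_loss(pred_adj, true_adj, neg_weight=10.0):
    """Reweighted reconstruction sketch: errors on the scarce negative
    links (entries equal to -1) are penalized more heavily than errors
    on the abundant positive links."""
    weights = torch.where(true_adj < 0,
                          torch.full_like(true_adj, neg_weight),
                          torch.ones_like(true_adj))
    return (weights * (pred_adj - true_adj) ** 2).mean()

def balance_constraint_loss(z, z_pos, z_neg, margin=1.0):
    """Structural-balance sketch: each node's embedding should sit closer
    to a positively linked neighbor than to a negatively linked one."""
    d_pos = F.pairwise_distance(z, z_pos)
    d_neg = F.pairwise_distance(z, z_neg)
    return F.relu(d_pos - d_neg + margin).mean()

A = torch.tensor([[0., 1., -1.], [1., 0., 0.], [-1., 0., 0.]])
print(weighted_reconstruction_loss(torch.rand(3, 3), A))
```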

Question Answering based Clinical Text Structuring Using Pre-trained Language Model

Title Question Answering based Clinical Text Structuring Using Pre-trained Language Model
Authors Jiahui Qiu, Yangming Zhou, Zhiyuan Ma, Tong Ruan, Jinlin Liu, Jing Sun
Abstract Clinical text structuring is a critical and fundamental task for clinical research. Traditional methods, such as task-specific end-to-end models and pipeline models, usually suffer from a lack of data and from error propagation. In this paper, we present a question answering based clinical text structuring (QA-CTS) task to unify different specific tasks and make datasets shareable. A novel model that introduces domain-specific features (e.g., clinical named entity information) into a pre-trained language model is also proposed for the QA-CTS task. Experimental results on Chinese pathology reports collected from Ruijing Hospital demonstrate that the presented QA-CTS task is effective in improving performance on specific tasks. Our proposed model also competes favorably with strong baseline models on specific tasks.
Tasks Language Modelling, Question Answering
Published 2019-08-19
URL https://arxiv.org/abs/1908.06606v2
PDF https://arxiv.org/pdf/1908.06606v2.pdf
PWC https://paperswithcode.com/paper/question-answering-based-clinical-text
Repo
Framework
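
The reformulation itself is simple to illustrate: each structuring target becomes a question over the report text, with the answer as a character span. The example report, question, and helper below are invented placeholders; the authors' model, which injects clinical named-entity features into a pre-trained language model, is not reproduced here.

```python
def to_qa_example(report, question, answer_text):
    """Turn one structuring target into a QA-style training example
    with the answer marked as a character span in the report."""
    start = report.find(answer_text)
    return {"context": report,
            "question": question,
            "answer_start": start,
            "answer_end": start + len(answer_text)}

report = "Tumor size approximately 3.5 cm; vascular invasion present."
print(to_qa_example(report, "What is the tumor size?", "3.5 cm"))
```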

Impact of ASR on Alzheimer’s Disease Detection: All Errors are Equal, but Deletions are More Equal than Others

Title Impact of ASR on Alzheimer’s Disease Detection: All Errors are Equal, but Deletions are More Equal than Others
Authors Aparna Balagopalan, Ksenia Shkaruta, Jekaterina Novikova
Abstract Automatic Speech Recognition (ASR) is a critical component of any fully-automated speech-based Alzheimer’s disease (AD) detection model. However, despite years of speech recognition research, little is known about the impact of ASR performance on AD detection. In this paper, we experiment with controlled amounts of artificially generated ASR errors and investigate their influence on AD detection. We find that deletion errors affect AD detection performance the most, due to their impact on the features of syntactic complexity and discourse representation in speech. We show the trend to be generalisable across two different datasets and two different speech-related tasks. As a conclusion, we propose changing the ASR optimization functions to reflect a higher penalty for deletion errors when using ASR for AD detection.
Tasks Speech Recognition
Published 2019-04-02
URL http://arxiv.org/abs/1904.01684v2
PDF http://arxiv.org/pdf/1904.01684v2.pdf
PWC https://paperswithcode.com/paper/impact-of-asr-on-alzheimers-disease-detection
Repo
Framework
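
A sketch of the controlled-error setup: delete a fixed fraction of words from a transcript to simulate ASR deletion errors. The sample utterance is a picture-description style sentence of the kind used in AD detection tasks, and the paper's exact error-generation procedure may differ (it also studies insertions and substitutions).

```python
import random

def inject_deletions(transcript, deletion_rate, seed=0):
    """Randomly drop roughly deletion_rate of the words to simulate
    ASR deletion errors at a controlled level."""
    rng = random.Random(seed)
    kept = [w for w in transcript.split() if rng.random() >= deletion_rate]
    return " ".join(kept)

utterance = "the boy is reaching up to take a cookie from the jar"
for rate in (0.1, 0.3, 0.5):
    print(rate, "->", inject_deletions(utterance, rate))
```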

Custom Extended Sobel Filters

Title Custom Extended Sobel Filters
Authors Victor Bogdan, Cosmin Bonchiş, Ciprian Orhei
Abstract Edge detection is a fundamental operation widely used in computer vision to determine the edges in an image, which are further used by various algorithms, from line detection to machine learning methods that recognize objects based on their contours. Inspired by new convolution techniques in machine learning, we discuss here the idea of extending the standard Sobel kernels, which are used to compute the gradient of an image in order to find its edges. We compare the results of our custom extended filters with the results of the standard Sobel filter and other edge detection filters using different image sets and algorithms. We present statistical results regarding the improvements achieved by the custom extended Sobel filters.
Tasks Edge Detection
Published 2019-09-30
URL https://arxiv.org/abs/1910.00138v1
PDF https://arxiv.org/pdf/1910.00138v1.pdf
PWC https://paperswithcode.com/paper/custom-extended-sobel-filters
Repo
Framework
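
As a sketch of the extension idea, the snippet below builds the standard 3x3 Sobel kernel and one common separable 5x5 extension (a wider smoothing vector times a wider derivative vector), then compares gradient magnitudes on a synthetic step edge. The paper's exact extended kernels may differ from this construction.

```python
import numpy as np
from scipy.signal import convolve2d

# Standard 3x3 Sobel kernel for horizontal gradients.
sobel3 = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]])

# One common separable 5x5 extension: binomial smoothing [1,4,6,4,1]
# crossed with a wider central-difference derivative.
sobel5 = np.outer([1, 4, 6, 4, 1], [-1, -2, 0, 2, 1])

def gradient_magnitude(image, kx):
    """Gradient magnitude from a horizontal kernel and its transpose."""
    gx = convolve2d(image, kx, mode="same", boundary="symm")
    gy = convolve2d(image, kx.T, mode="same", boundary="symm")
    return np.hypot(gx, gy)

img = np.zeros((16, 16))
img[:, 8:] = 1.0  # vertical step edge
print(gradient_magnitude(img, sobel3).max(), gradient_magnitude(img, sobel5).max())
```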

A Survey and Taxonomy of Adversarial Neural Networks for Text-to-Image Synthesis

Title A Survey and Taxonomy of Adversarial Neural Networks for Text-to-Image Synthesis
Authors Jorge Agnese, Jonathan Herrera, Haicheng Tao, Xingquan Zhu
Abstract Text-to-image synthesis refers to computational methods which translate human-written textual descriptions, in the form of keywords or sentences, into images with similar semantic meaning to the text. In earlier research, image synthesis relied mainly on word-to-image correlation analysis combined with supervised methods to find the best alignment of the visual content with the text. Recent progress in deep learning (DL) has brought a new set of unsupervised deep learning methods, particularly deep generative models which are able to generate realistic visual images using suitably trained neural network models. In this paper, we review the most recent developments in the text-to-image synthesis research domain. Our survey first introduces image synthesis and its challenges, and then reviews key concepts such as generative adversarial networks (GANs) and deep convolutional encoder-decoder neural networks (DCNNs). After that, we propose a taxonomy to summarize GAN-based text-to-image synthesis into four major categories: Semantic Enhancement GANs, Resolution Enhancement GANs, Diversity Enhancement GANs, and Motion Enhancement GANs. We elaborate on the main objective of each group, and further review typical GAN architectures in each group. The taxonomy and the review outline the techniques and the evolution of different approaches, and eventually provide a clear roadmap to summarize the list of contemporaneous solutions that utilize GANs and DCNNs to generate enthralling results in categories such as human faces, birds, flowers, room interiors, object reconstruction from edge maps (games), etc. The survey concludes with a comparison of the proposed solutions, challenges that remain unresolved, and future developments in the text-to-image synthesis domain.
Tasks Image Generation, Object Reconstruction
Published 2019-10-21
URL https://arxiv.org/abs/1910.09399v1
PDF https://arxiv.org/pdf/1910.09399v1.pdf
PWC https://paperswithcode.com/paper/a-survey-and-taxonomy-of-adversarial-neural
Repo
Framework

A Correctness Result for Synthesizing Plans With Loops in Stochastic Domains

Title A Correctness Result for Synthesizing Plans With Loops in Stochastic Domains
Authors Laszlo Treszkai, Vaishak Belle
Abstract Finite-state controllers (FSCs), such as plans with loops, are powerful and compact representations of action selection widely used in robotics, video games and logistics. There has been steady progress on synthesizing FSCs in deterministic environments, but the algorithmic machinery needed for lifting such techniques to stochastic environments is not yet fully understood. While the derivation of FSCs has received some attention in the context of discounted expected reward measures, they are often solved approximately and/or without correctness guarantees. In essence, that makes it difficult to analyze fundamental concerns such as: do all paths terminate, and do the majority of paths reach a goal state? In this paper, we present new theoretical results on a generic technique for synthesizing FSCs in stochastic environments, allowing for highly granular specifications on termination and goal satisfaction.
Tasks
Published 2019-05-16
URL https://arxiv.org/abs/1905.07028v1
PDF https://arxiv.org/pdf/1905.07028v1.pdf
PWC https://paperswithcode.com/paper/a-correctness-result-for-synthesizing-plans
Repo
Framework