October 16, 2019


Paper Group ANR 1117



BPE and CharCNNs for Translation of Morphology: A Cross-Lingual Comparison and Analysis

Title BPE and CharCNNs for Translation of Morphology: A Cross-Lingual Comparison and Analysis
Authors Pamela Shapiro, Kevin Duh
Abstract Neural Machine Translation (NMT) in low-resource settings and of morphologically rich languages is made difficult in part by data sparsity of vocabulary words. Several methods have been used to help reduce this sparsity, notably Byte-Pair Encoding (BPE) and a character-based CNN layer (charCNN). However, the charCNN has largely been neglected, possibly because it has only been compared to BPE rather than combined with it. We argue for a reconsideration of the charCNN, based on cross-lingual improvements on low-resource data. We translate from 8 languages into English, using a multi-way parallel collection of TED transcripts. We find that in most cases, using both BPE and a charCNN performs best, while in Hebrew, using a charCNN over words is best.
Tasks Machine Translation
Published 2018-09-05
URL http://arxiv.org/abs/1809.01301v2
PDF http://arxiv.org/pdf/1809.01301v2.pdf
PWC https://paperswithcode.com/paper/bpe-and-charcnns-for-translation-of
Repo
Framework
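
The abstract above refers to Byte-Pair Encoding without describing it. As a rough illustration only, the following sketch learns BPE merges from a word-frequency table in the commonly described way: repeatedly merge the most frequent adjacent symbol pair. The function name `learn_bpe`, the `</w>` end-of-word marker, and the tie-breaking behavior are assumptions of this sketch, not details taken from the paper.

```python
from collections import Counter


def learn_bpe(word_counts, num_merges):
    """Learn up to `num_merges` BPE merges from a {word: frequency} dict.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    # Each word starts as a tuple of characters plus an end-of-word marker.
    vocab = {tuple(word) + ("</w>",): count for word, count in word_counts.items()}
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pair_counts = Counter()
        for symbols, count in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += count
        if not pair_counts:
            break
        # Ties are broken by first occurrence (an arbitrary choice here).
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        # Rewrite every word, fusing each occurrence of the best pair.
        new_vocab = {}
        for symbols, count in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            key = tuple(merged)
            new_vocab[key] = new_vocab.get(key, 0) + count
        vocab = new_vocab
    return merges, vocab
```

Calling `learn_bpe({"abab": 3}, 1)` merges the pair `("a", "b")`, leaving the word segmented as `("ab", "ab", "</w>")`. Rare words end up split into more, shorter subword units, which is how BPE reduces vocabulary sparsity.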

PROPEL: Probabilistic Parametric Regression Loss for Convolutional Neural Networks

Title PROPEL: Probabilistic Parametric Regression Loss for Convolutional Neural Networks
Authors Muhammad Asad, Rilwan Basaru, S M Masudur Rahman Al Arif, Greg Slabaugh
Abstract Recently, Convolutional Neural Networks (CNNs) have dominated the field of computer vision. Their widespread success has been attributed to their representation learning capabilities. For classification tasks, CNNs have widely employed probabilistic output and have shown the significance of providing additional confidence for predictions. However, such probabilistic methodologies are not widely applicable for addressing regression problems using CNNs, as regression involves learning unconstrained continuous and, in many cases, multi-variate target variables. We propose a PRObabilistic Parametric rEgression Loss (PROPEL) that enables probabilistic regression using CNNs. PROPEL is fully differentiable and, hence, can be easily incorporated for end-to-end training of existing regressive CNN architectures. The proposed method is flexible as it learns complex unconstrained probabilities while being generalizable to higher dimensional multi-variate regression problems. We utilize a PROPEL-based CNN to address the problem of learning hand and head orientation from uncalibrated color images. Comprehensive experimental validation and comparisons with existing CNN regression loss functions are provided. Our experimental results indicate that PROPEL significantly improves the performance of a CNN, while reducing model parameters by 10x as compared to the existing state-of-the-art.
Tasks Representation Learning
Published 2018-07-28
URL http://arxiv.org/abs/1807.10937v1
PDF http://arxiv.org/pdf/1807.10937v1.pdf
PWC https://paperswithcode.com/paper/propel-probabilistic-parametric-regression
Repo
Framework

OptStream: Releasing Time Series Privately

Title OptStream: Releasing Time Series Privately
Authors Ferdinando Fioretto, Pascal Van Hentenryck
Abstract Many applications of machine learning and optimization operate on data streams. While these datasets are fundamental to fuel decision-making algorithms, often they contain sensitive information about individuals and their usage poses significant privacy risks. Motivated by an application in energy systems, this paper presents OPTSTREAM, a novel algorithm for releasing differentially private data streams under the w-event model of privacy. OPTSTREAM is a 4-step procedure consisting of sampling, perturbation, reconstruction, and post-processing modules. First, the sampling module selects a small set of points to access in each period of interest. Then, the perturbation module adds noise to the sampled data points to guarantee privacy. Next, the reconstruction module reassembles non-sampled data points from the perturbed sample points. Finally, the post-processing module uses convex optimization over the private output of the previous modules, as well as the private answers of additional queries on the data stream, to improve accuracy by redistributing the added noise. OPTSTREAM is evaluated on a test case involving the release of a real data stream from the largest European transmission operator. Experimental results show that OPTSTREAM may not only improve the accuracy of state-of-the-art methods by at least one order of magnitude but also supports accurate load forecasting on the private data.
Tasks Decision Making, Load Forecasting, Time Series
Published 2018-08-06
URL http://arxiv.org/abs/1808.01949v2
PDF http://arxiv.org/pdf/1808.01949v2.pdf
PWC https://paperswithcode.com/paper/180801949
Repo
Framework
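
To make the shape of the first three modules concrete, here is a minimal sketch of a sample, perturb, and reconstruct pipeline for one period of a stream. It is not OptStream itself: the fixed-stride sampling, the Laplace noise with a caller-supplied `scale`, and the linear interpolation are assumptions chosen for brevity, the convex-optimization post-processing step is omitted, and no privacy accounting under the w-event model is performed.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. unit exponentials is Laplace-distributed.
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def release_period(values, step, scale, rng):
    """Hypothetical sample -> perturb -> reconstruct pass over one period."""
    if not values:
        return []
    # 1. Sampling: keep every `step`-th index, plus the final index so that
    #    every remaining point lies between two sampled points.
    idx = list(range(0, len(values), step))
    if idx[-1] != len(values) - 1:
        idx.append(len(values) - 1)
    # 2. Perturbation: add noise only to the sampled points.
    noisy = {i: values[i] + laplace_noise(scale, rng) for i in idx}
    # 3. Reconstruction: linearly interpolate the non-sampled points
    #    from the noisy samples on either side.
    out = []
    for left, right in zip(idx, idx[1:]):
        for i in range(left, right):
            t = (i - left) / (right - left)
            out.append((1 - t) * noisy[left] + t * noisy[right])
    out.append(noisy[idx[-1]])
    return out
```

A call such as `release_period(load, step=4, scale=2.0, rng=random.Random(0))` touches the true data at only a quarter of the time steps. Choosing `scale` so that the release actually satisfies a privacy guarantee is outside the scope of this sketch.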

Image Captioning based on Deep Reinforcement Learning

Title Image Captioning based on Deep Reinforcement Learning
Authors Haichao Shi, Peng Li, Bo Wang, Zhenyu Wang
Abstract Recently it has been shown that policy-gradient methods for reinforcement learning have been used to train deep end-to-end systems on natural language processing tasks. Moreover, given the complexity of understanding image content and the diverse ways of describing it in natural language, image captioning remains a challenging problem. To the best of our knowledge, most state-of-the-art methods follow a sequential-model pattern, such as recurrent neural networks (RNNs). In this paper, however, we propose a novel architecture for image captioning that uses deep reinforcement learning to optimize the captioning task. We utilize two networks, called the “policy network” and the “value network”, to collaboratively generate image captions. The experiments are conducted on the Microsoft COCO dataset, and the experimental results verify the effectiveness of the proposed method.
Tasks Image Captioning, Policy Gradient Methods
Published 2018-09-13
URL http://arxiv.org/abs/1809.04835v1
PDF http://arxiv.org/pdf/1809.04835v1.pdf
PWC https://paperswithcode.com/paper/image-captioning-based-on-deep-reinforcement
Repo
Framework

Towards End-to-End Code-Switching Speech Recognition

Title Towards End-to-End Code-Switching Speech Recognition
Authors Ne Luo, Dongwei Jiang, Shuaijiang Zhao, Caixia Gong, Wei Zou, Xiangang Li
Abstract Code-switching speech recognition has attracted increasing interest recently, but the need for expert linguistic knowledge has always been a major obstacle. End-to-end automatic speech recognition (ASR) simplifies the building of ASR systems considerably by predicting graphemes or characters directly from acoustic input. At the same time, it eliminates the need for expert linguistic knowledge, which makes it an attractive choice for code-switching ASR. This paper presents a hybrid CTC-Attention based end-to-end Mandarin-English code-switching (CS) speech recognition system and studies the effect of hybrid CTC-Attention based models, different modeling units, the inclusion of language identification, and different decoding strategies on the task of code-switching ASR. On the SEAME corpus, our system achieves a mixed error rate (MER) of 34.24%.
Tasks Language Identification, Speech Recognition
Published 2018-10-31
URL http://arxiv.org/abs/1810.13091v2
PDF http://arxiv.org/pdf/1810.13091v2.pdf
PWC https://paperswithcode.com/paper/towards-end-to-end-code-switching-speech
Repo
Framework
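
The abstract reports a mixed error rate without defining it. A common convention for Mandarin-English code-switching, assumed here rather than taken from the paper, is to count each Mandarin character and each English word as one token and divide the token-level edit distance by the reference length. The sketch below follows that convention; the function names and the CJK range check are illustrative choices.

```python
def mixed_tokens(text):
    """Split text into Mandarin characters and whitespace-delimited words."""
    tokens, word = [], []
    for ch in text:
        if "\u4e00" <= ch <= "\u9fff":  # CJK Unified Ideographs (basic block)
            if word:
                tokens.append("".join(word))
                word = []
            tokens.append(ch)
        elif ch.isspace():
            if word:
                tokens.append("".join(word))
                word = []
        else:
            word.append(ch)
    if word:
        tokens.append("".join(word))
    return tokens


def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def mixed_error_rate(reference, hypothesis):
    """Edit distance over mixed tokens, normalized by reference length."""
    ref = mixed_tokens(reference)
    return edit_distance(ref, mixed_tokens(hypothesis)) / len(ref)
```

Under this definition, a hypothesis that gets one of two Mandarin characters and one of two English words wrong scores an MER of 0.5. The function assumes a non-empty reference.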

A Progressively-trained Scale-invariant and Boundary-aware Deep Neural Network for the Automatic 3D Segmentation of Lung Lesions

Title A Progressively-trained Scale-invariant and Boundary-aware Deep Neural Network for the Automatic 3D Segmentation of Lung Lesions
Authors Bo Zhou, Randolph Crawford, Belma Dogdas, Gregory Goldmacher, Antong Chen
Abstract Volumetric segmentation of lesions on CT scans is important for many types of analysis, including lesion growth kinetic modeling in clinical trials and machine learning of radiomic features. Manual segmentation is laborious, and impractical for large-scale use. For routine clinical use, and in clinical trials that apply the Response Evaluation Criteria In Solid Tumors (RECIST), clinicians typically outline the boundaries of a lesion on a single slice to extract diameter measurements. In this work, we have collected a large-scale database, named LesionVis, with pixel-wise manual 2D lesion delineations on the RECIST-slices. To extend the 2D segmentations to 3D, we propose a volumetric progressive lesion segmentation (PLS) algorithm to automatically segment the 3D lesion volume from 2D delineations using a scale-invariant and boundary-aware deep convolutional network (SIBA-Net). The SIBA-Net copes with the size transition of a lesion when the PLS progresses from the RECIST-slice to the edge-slices, as well as when performing longitudinal assessment of lesions whose size change over multiple time points. The proposed PLS-SiBA-Net (P-SiBA) approach is assessed on the lung lesion cases from LesionVis. Our experimental results demonstrate that the P-SiBA approach achieves mean Dice similarity coefficients (DSC) of 0.81, which significantly improves 3D segmentation accuracy compared with the approaches proposed previously (highest mean DSC at 0.78 on LesionVis). In summary, by leveraging the limited 2D delineations on the RECIST-slices, P-SiBA is an effective semi-supervised approach to produce accurate lesion segmentations in 3D.
Tasks Lesion Segmentation
Published 2018-11-11
URL http://arxiv.org/abs/1811.04437v1
PDF http://arxiv.org/pdf/1811.04437v1.pdf
PWC https://paperswithcode.com/paper/a-progressively-trained-scale-invariant-and
Repo
Framework
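
For readers unfamiliar with the metric quoted above, the Dice similarity coefficient between a predicted and a reference segmentation is 2|A ∩ B| / (|A| + |B|). The short sketch below computes it over sets of voxel coordinates; representing masks as coordinate sets and returning 1.0 when both masks are empty are conventions assumed for this example.

```python
def dice_coefficient(pred_voxels, truth_voxels):
    """Dice similarity between two collections of (z, y, x) voxel coordinates."""
    pred, truth = set(pred_voxels), set(truth_voxels)
    if not pred and not truth:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2 * len(pred & truth) / (len(pred) + len(truth))
```

Two three-voxel masks that share two voxels score 2 * 2 / 6, or about 0.67. A score of 1.0 means the masks coincide exactly and 0.0 means they do not overlap.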

Mining on Manifolds: Metric Learning without Labels

Title Mining on Manifolds: Metric Learning without Labels
Authors Ahmet Iscen, Giorgos Tolias, Yannis Avrithis, Ondrej Chum
Abstract In this work we present a novel unsupervised framework for hard training example mining. The only input to the method is a collection of images relevant to the target application and a meaningful initial representation, provided e.g. by pre-trained CNN. Positive examples are distant points on a single manifold, while negative examples are nearby points on different manifolds. Both types of examples are revealed by disagreements between Euclidean and manifold similarities. The discovered examples can be used in training with any discriminative loss. The method is applied to unsupervised fine-tuning of pre-trained networks for fine-grained classification and particular object retrieval. Our models are on par or are outperforming prior models that are fully or partially supervised.
Tasks Metric Learning
Published 2018-03-29
URL http://arxiv.org/abs/1803.11095v1
PDF http://arxiv.org/pdf/1803.11095v1.pdf
PWC https://paperswithcode.com/paper/mining-on-manifolds-metric-learning-without
Repo
Framework

Towards Robot-Centric Conceptual Knowledge Acquisition

Title Towards Robot-Centric Conceptual Knowledge Acquisition
Authors Georg Jäger, Christian A. Mueller, Madhura Thosar, Sebastian Zug, Andreas Birk
Abstract Robots require knowledge about objects in order to efficiently perform various household tasks involving objects. The existing knowledge bases for robots acquire symbolic knowledge about objects from manually-coded external common sense knowledge bases such as ConceptNet, WordNet, etc. The problem with such approaches is the discrepancy between human-centric symbolic knowledge and robot-centric object perception, which stems from the robot’s limited perception capabilities. Ultimately, a significant portion of the knowledge in the knowledge base remains ungrounded in the robot’s perception. To overcome this discrepancy, we propose an approach that enables robots to generate robot-centric symbolic knowledge about objects from their own sensory data, thus allowing them to assemble their own conceptual understanding of objects. With this goal in mind, the presented paper elaborates on the work-in-progress of the proposed approach, followed by the preliminary results.
Tasks Common Sense Reasoning
Published 2018-10-08
URL http://arxiv.org/abs/1810.03583v1
PDF http://arxiv.org/pdf/1810.03583v1.pdf
PWC https://paperswithcode.com/paper/towards-robot-centric-conceptual-knowledge
Repo
Framework

Super-resolution Ultrasound Localization Microscopy through Deep Learning

Title Super-resolution Ultrasound Localization Microscopy through Deep Learning
Authors Ruud J. G. van Sloun, Oren Solomon, Matthew Bruce, Zin Z. Khaing, Hessel Wijkstra, Yonina C. Eldar, Massimo Mischi
Abstract Ultrasound localization microscopy has enabled super-resolution vascular imaging through precise localization of individual ultrasound contrast agents (microbubbles) across numerous imaging frames. However, analysis of high-density regions with significant overlaps among the microbubble point spread responses yields high localization errors, constraining the technique to low-concentration conditions. As such, long acquisition times are required to sufficiently cover the vascular bed. In this work, we present a fast and precise method for obtaining super-resolution vascular images from high-density contrast-enhanced ultrasound imaging data. This method, which we term Deep Ultrasound Localization Microscopy (Deep-ULM), exploits modern deep learning strategies and employs a convolutional neural network to perform localization microscopy in dense scenarios. This end-to-end fully convolutional neural network architecture is trained effectively using on-line synthesized data, enabling robust inference in-vivo under a wide variety of imaging conditions. We show that deep learning attains super-resolution with challenging contrast-agent densities, both in-silico as well as in-vivo. Deep-ULM is suitable for real-time applications, resolving about 70 high-resolution patches (128x128 pixels) per second on a standard PC. Exploiting GPU computation, this number increases to 1250 patches per second.
Tasks Super-Resolution
Published 2018-04-20
URL http://arxiv.org/abs/1804.07661v2
PDF http://arxiv.org/pdf/1804.07661v2.pdf
PWC https://paperswithcode.com/paper/super-resolution-ultrasound-localization
Repo
Framework

Product Title Refinement via Multi-Modal Generative Adversarial Learning

Title Product Title Refinement via Multi-Modal Generative Adversarial Learning
Authors Jianguo Zhang, Pengcheng Zou, Zhao Li, Yao Wan, Ye Liu, Xiuming Pan, Yu Gong, Philip S. Yu
Abstract Nowadays, an increasing number of customers favor using E-commerce Apps to browse and purchase products. Since merchants are usually inclined to employ redundant and over-informative product titles to attract customers’ attention, it is of great importance to concisely display short product titles on the limited screens of cell phones. Previous researchers mainly consider the textual information of long product titles and lack a human-like view during the training and evaluation procedures. In this paper, we propose a Multi-Modal Generative Adversarial Network (MM-GAN) for short product title generation, which innovatively incorporates image information, attribute tags from the product, and the textual information from the original long titles. MM-GAN treats short title generation as a reinforcement learning process, where the generated titles are evaluated by the discriminator in a human-like view.
Tasks
Published 2018-11-11
URL http://arxiv.org/abs/1811.04498v1
PDF http://arxiv.org/pdf/1811.04498v1.pdf
PWC https://paperswithcode.com/paper/product-title-refinement-via-multi-modal
Repo
Framework

Defending Against Adversarial Attacks by Leveraging an Entire GAN

Title Defending Against Adversarial Attacks by Leveraging an Entire GAN
Authors Gokula Krishnan Santhanam, Paulina Grnarova
Abstract Recent work has shown that state-of-the-art models are highly vulnerable to adversarial perturbations of the input. We propose cowboy, an approach to detecting and defending against adversarial attacks by using both the discriminator and generator of a GAN trained on the same dataset. We show that the discriminator consistently scores the adversarial samples lower than the real samples across multiple attacks and datasets. We provide empirical evidence that adversarial samples lie outside of the data manifold learned by the GAN. Based on this, we propose a cleaning method which uses both the discriminator and generator of the GAN to project the samples back onto the data manifold. This cleaning procedure is independent of the classifier and type of attack and thus can be deployed in existing systems.
Tasks
Published 2018-05-27
URL http://arxiv.org/abs/1805.10652v1
PDF http://arxiv.org/pdf/1805.10652v1.pdf
PWC https://paperswithcode.com/paper/defending-against-adversarial-attacks-by
Repo
Framework

Conditional Image-to-Image Translation

Title Conditional Image-to-Image Translation
Authors Jianxin Lin, Yingce Xia, Tao Qin, Zhibo Chen, Tie-Yan Liu
Abstract Image-to-image translation tasks have been widely investigated with Generative Adversarial Networks (GANs) and dual learning. However, existing models lack the ability to control the translated results in the target domain, and their results usually lack diversity in the sense that a fixed image usually leads to an (almost) deterministic translation result. In this paper, we study a new problem, conditional image-to-image translation, which is to translate an image from the source domain to the target domain conditioned on a given image in the target domain. It requires that the generated image inherit some domain-specific features of the conditional image from the target domain. Changing the conditional image in the target domain therefore leads to diverse translation results for a fixed input image from the source domain, so the conditional input image helps to control the translation results. We tackle this problem with unpaired data based on GANs and dual learning. We twist two conditional translation models (one translating from domain A to domain B, and the other from domain B to domain A) together for input combination and reconstruction while preserving domain-independent features. We carry out experiments on translation between men’s faces and women’s faces and on edges to shoes&bags translation. The results demonstrate the effectiveness of our proposed method.
Tasks Image-to-Image Translation
Published 2018-05-01
URL http://arxiv.org/abs/1805.00251v1
PDF http://arxiv.org/pdf/1805.00251v1.pdf
PWC https://paperswithcode.com/paper/conditional-image-to-image-translation
Repo
Framework

Semantic Adversarial Deep Learning

Title Semantic Adversarial Deep Learning
Authors Tommaso Dreossi, Somesh Jha, Sanjit A. Seshia
Abstract Fueled by massive amounts of data, models produced by machine-learning (ML) algorithms, especially deep neural networks, are being used in diverse domains where trustworthiness is a concern, including automotive systems, finance, health care, natural language processing, and malware detection. Of particular concern is the use of ML algorithms in cyber-physical systems (CPS), such as self-driving cars and aviation, where an adversary can cause serious consequences. However, existing approaches to generating adversarial examples and devising robust ML algorithms mostly ignore the semantics and context of the overall system containing the ML component. For example, in an autonomous vehicle using deep learning for perception, not every adversarial example for the neural network might lead to a harmful consequence. Moreover, one may want to prioritize the search for adversarial examples towards those that significantly modify the desired semantics of the overall system. Along the same lines, existing algorithms for constructing robust ML algorithms ignore the specification of the overall system. In this paper, we argue that the semantics and specification of the overall system has a crucial role to play in this line of research. We present preliminary research results that support this claim.
Tasks Malware Detection, Self-Driving Cars
Published 2018-04-19
URL http://arxiv.org/abs/1804.07045v2
PDF http://arxiv.org/pdf/1804.07045v2.pdf
PWC https://paperswithcode.com/paper/semantic-adversarial-deep-learning
Repo
Framework

WebEye - Automated Collection of Malicious HTTP Traffic

Title WebEye - Automated Collection of Malicious HTTP Traffic
Authors Johann Vierthaler, Roman Kruszelnicki, Julian Schütte
Abstract With malware detection techniques increasingly adopting machine learning approaches, the creation of precise training sets becomes more and more important. Large data sets of realistic web traffic, correctly classified as benign or malicious are needed, not only to train classic and deep learning algorithms, but also to serve as evaluation benchmarks for existing malware detection products. Interestingly, despite the vast number and versatility of threats a user may encounter when browsing the web, actual malicious content is often hard to come by, since prerequisites such as browser and operating system type and version must be met in order to receive the payload from a malware distributing server. In combination with privacy constraints on data sets of actual user traffic, it is difficult for researchers and product developers to evaluate anti-malware solutions against large-scale data sets of realistic web traffic. In this paper we present WebEye, a framework that autonomously creates realistic HTTP traffic, enriches recorded traffic with additional information, and classifies records as malicious or benign, using different classifiers. We are using WebEye to collect malicious HTML and JavaScript and show how datasets created with WebEye can be used to train machine learning based malware detection algorithms. We regard WebEye and the data sets it creates as a tool for researchers and product developers to evaluate and improve their AI-based anti-malware solutions against large-scale benchmarks.
Tasks Malware Detection
Published 2018-02-16
URL http://arxiv.org/abs/1802.06012v1
PDF http://arxiv.org/pdf/1802.06012v1.pdf
PWC https://paperswithcode.com/paper/webeye-automated-collection-of-malicious-http
Repo
Framework

NtMalDetect: A Machine Learning Approach to Malware Detection Using Native API System Calls

Title NtMalDetect: A Machine Learning Approach to Malware Detection Using Native API System Calls
Authors Chan Woo Kim
Abstract As computing systems become increasingly advanced and as users increasingly engage themselves in technology, security has never been a greater concern. In malware detection, static analysis, the method of analyzing potentially malicious files, has been the prominent approach. This approach, however, quickly falls short as malicious programs become more advanced and adopt the capabilities of obfuscating its binaries to execute the same malicious functions, making static analysis extremely difficult for newer variants. The approach assessed in this paper is a novel dynamic malware analysis method, which may generalize better than static analysis to newer variants. Inspired by recent successes in Natural Language Processing (NLP), widely used document classification techniques were assessed in detecting malware by doing such analysis on system calls, which contain useful information about the operation of a program as requests that the program makes of the kernel. Features considered are extracted from system call traces of benign and malicious programs, and the task to classify these traces is treated as a binary document classification task of system call traces. The system call traces were processed to remove the parameters to only leave the system call function names. The features were grouped into various n-grams and weighted with Term Frequency-Inverse Document Frequency. This paper shows that Linear Support Vector Machines (SVM) optimized by Stochastic Gradient Descent and the traditional Coordinate Descent on the Wolfe Dual form of the SVM are effective in this approach, achieving a highest of 96% accuracy with 95% recall score. Additional contributions include the identification of significant system call sequences that could be avenues for further research.
Tasks Document Classification, Malware Detection
Published 2018-02-15
URL http://arxiv.org/abs/1802.05412v2
PDF http://arxiv.org/pdf/1802.05412v2.pdf
PWC https://paperswithcode.com/paper/ntmaldetect-a-machine-learning-approach-to
Repo
Framework
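
The feature pipeline described in the abstract (system call names grouped into n-grams and weighted with TF-IDF) can be sketched in a few lines. This is an illustration under assumptions, not the paper's code: it uses the plain `tf * log(N / df)` weighting, whereas library implementations often apply smoothing, and the classifier stage (a linear SVM) is left out. The function names are hypothetical.

```python
import math
from collections import Counter


def call_ngrams(trace, n):
    """Slide a window of n consecutive system call names over a trace."""
    return [" ".join(trace[i:i + n]) for i in range(len(trace) - n + 1)]


def tfidf_vectors(traces, n=2):
    """Return one {ngram: weight} dict per trace (unsmoothed TF-IDF)."""
    docs = [Counter(call_ngrams(trace, n)) for trace in traces]
    # Document frequency: in how many traces does each n-gram appear?
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(doc.keys())
    total = len(docs)
    vectors = []
    for doc in docs:
        length = sum(doc.values())
        vectors.append({
            gram: (count / length) * math.log(total / doc_freq[gram])
            for gram, count in doc.items()
        })
    return vectors
```

Given two traces such as `["NtOpenFile", "NtReadFile", "NtClose"]` and `["NtOpenFile", "NtWriteFile", "NtClose"]` (made-up examples with parameters already stripped), calls that appear in every trace receive weight zero for `n=1`, while the distinguishing read and write calls receive positive weight. The resulting sparse vectors are what a linear classifier would then be trained on.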