Paper Group ANR 1117
BPE and CharCNNs for Translation of Morphology: A Cross-Lingual Comparison and Analysis
Title | BPE and CharCNNs for Translation of Morphology: A Cross-Lingual Comparison and Analysis |
Authors | Pamela Shapiro, Kevin Duh |
Abstract | Neural Machine Translation (NMT) in low-resource settings and of morphologically rich languages is made difficult in part by data sparsity of vocabulary words. Several methods have been used to help reduce this sparsity, notably Byte-Pair Encoding (BPE) and a character-based CNN layer (charCNN). However, the charCNN has largely been neglected, possibly because it has only been compared to BPE rather than combined with it. We argue for a reconsideration of the charCNN, based on cross-lingual improvements on low-resource data. We translate from 8 languages into English, using a multi-way parallel collection of TED transcripts. We find that in most cases, using both BPE and a charCNN performs best, while in Hebrew, using a charCNN over words is best. |
Tasks | Machine Translation |
Published | 2018-09-05 |
URL | http://arxiv.org/abs/1809.01301v2 |
http://arxiv.org/pdf/1809.01301v2.pdf | |
PWC | https://paperswithcode.com/paper/bpe-and-charcnns-for-translation-of |
Repo | |
Framework | |
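The abstract above names Byte-Pair Encoding (BPE) as one way to reduce vocabulary sparsity. As a rough illustration only, the following is a minimal sketch of the commonly described BPE merge-learning loop on a toy word list. The function name `learn_bpe`, the `</w>` end-of-word marker, and the tie-breaking rule are assumptions made for this example; they are not taken from the paper, and the sketch does not cover the charCNN component.

```python
from collections import Counter


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` BPE merge rules from a toy list of words."""
    # Each word becomes a tuple of symbols ending in an end-of-word marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair; ties broken by the pair itself (an assumption,
        # chosen only to keep the example deterministic).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Rewrite every word, replacing occurrences of the best pair.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges


if __name__ == "__main__":
    print(learn_bpe(["low", "low", "lower", "lowest"], 2))
```

Frequent character sequences end up as single subword units, so rare inflected forms can be represented by pieces seen elsewhere in the training data.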
PROPEL: Probabilistic Parametric Regression Loss for Convolutional Neural Networks
Title | PROPEL: Probabilistic Parametric Regression Loss for Convolutional Neural Networks |
Authors | Muhammad Asad, Rilwan Basaru, S M Masudur Rahman Al Arif, Greg Slabaugh |
Abstract | Recently, Convolutional Neural Networks (CNNs) have dominated the field of computer vision. Their widespread success has been attributed to their representation learning capabilities. For classification tasks, CNNs have widely employed probabilistic output and have shown the significance of providing additional confidence for predictions. However, such probabilistic methodologies are not widely applicable for addressing regression problems using CNNs, as regression involves learning unconstrained continuous and, in many cases, multi-variate target variables. We propose a PRObabilistic Parametric rEgression Loss (PROPEL) that enables probabilistic regression using CNNs. PROPEL is fully differentiable and, hence, can be easily incorporated for end-to-end training of existing regressive CNN architectures. The proposed method is flexible as it learns complex unconstrained probabilities while being generalizable to higher dimensional multi-variate regression problems. We utilize a PROPEL-based CNN to address the problem of learning hand and head orientation from uncalibrated color images. Comprehensive experimental validation and comparisons with existing CNN regression loss functions are provided. Our experimental results indicate that PROPEL significantly improves the performance of a CNN, while reducing model parameters by 10x as compared to the existing state-of-the-art. |
Tasks | Representation Learning |
Published | 2018-07-28 |
URL | http://arxiv.org/abs/1807.10937v1 |
http://arxiv.org/pdf/1807.10937v1.pdf | |
PWC | https://paperswithcode.com/paper/propel-probabilistic-parametric-regression |
Repo | |
Framework | |
OptStream: Releasing Time Series Privately
Title | OptStream: Releasing Time Series Privately |
Authors | Ferdinando Fioretto, Pascal Van Hentenryck |
Abstract | Many applications of machine learning and optimization operate on data streams. While these datasets are fundamental to fueling decision-making algorithms, they often contain sensitive information about individuals, and their usage poses significant privacy risks. Motivated by an application in energy systems, this paper presents OPTSTREAM, a novel algorithm for releasing differentially private data streams under the w-event model of privacy. OPTSTREAM is a 4-step procedure consisting of sampling, perturbation, reconstruction, and post-processing modules. First, the sampling module selects a small set of points to access in each period of interest. Then, the perturbation module adds noise to the sampled data points to guarantee privacy. Next, the reconstruction module reassembles non-sampled data points from the perturbed sample points. Finally, the post-processing module uses convex optimization over the private output of the previous modules, as well as the private answers of additional queries on the data stream, to improve accuracy by redistributing the added noise. OPTSTREAM is evaluated on a test case involving the release of a real data stream from the largest European transmission operator. Experimental results show that OPTSTREAM may not only improve the accuracy of state-of-the-art methods by at least one order of magnitude but also support accurate load forecasting on the private data. |
Tasks | Decision Making, Load Forecasting, Time Series |
Published | 2018-08-06 |
URL | http://arxiv.org/abs/1808.01949v2 |
http://arxiv.org/pdf/1808.01949v2.pdf | |
PWC | https://paperswithcode.com/paper/180801949 |
Repo | |
Framework | |
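The first three modules described in the abstract above (sampling, perturbation, reconstruction) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' algorithm: the function name `release_period`, fixed-stride sampling, Laplace noise with the budget split evenly across sampled points, and linear interpolation are all choices made for this example. The w-event privacy accounting and the convex-optimization post-processing step are not modelled.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def release_period(values, step, epsilon, sensitivity=1.0, seed=0):
    """Sample, perturb, and reconstruct one period of a stream (toy sketch)."""
    if not values:
        return []
    rng = random.Random(seed)

    # 1. Sampling (assumed: every `step`-th point, plus the last point).
    idx = list(range(0, len(values), step))
    if idx[-1] != len(values) - 1:
        idx.append(len(values) - 1)

    # 2. Perturbation (assumed: Laplace mechanism, budget split evenly
    #    over the sampled points via sequential composition).
    scale = sensitivity * len(idx) / epsilon
    noisy = {i: values[i] + laplace_noise(scale, rng) for i in idx}

    # 3. Reconstruction (assumed: linear interpolation between noisy samples).
    out = []
    for left, right in zip(idx, idx[1:]):
        for i in range(left, right):
            t = (i - left) / (right - left)
            out.append((1 - t) * noisy[left] + t * noisy[right])
    out.append(noisy[idx[-1]])
    return out


if __name__ == "__main__":
    print(release_period([10.0, 12.0, 11.0, 13.0, 15.0, 14.0, 16.0], step=3, epsilon=1.0))
```

Only the sampled points are accessed, so fewer sampled points means less noise per point but a coarser reconstruction.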
Image Captioning based on Deep Reinforcement Learning
Title | Image Captioning based on Deep Reinforcement Learning |
Authors | Haichao Shi, Peng Li, Bo Wang, Zhenyu Wang |
Abstract | Recently, policy-gradient methods for reinforcement learning have been used to train deep end-to-end systems on natural language processing tasks. Moreover, given the complexity of understanding image content and the diverse ways of describing it in natural language, image captioning remains a challenging problem. To the best of our knowledge, most state-of-the-art methods follow a sequential-model pattern, such as recurrent neural networks (RNNs). In this paper, however, we propose a novel architecture for image captioning that uses deep reinforcement learning to optimize the captioning task. We use two networks, a “policy network” and a “value network”, to collaboratively generate image captions. Experiments are conducted on the Microsoft COCO dataset, and the results verify the effectiveness of the proposed method. |
Tasks | Image Captioning, Policy Gradient Methods |
Published | 2018-09-13 |
URL | http://arxiv.org/abs/1809.04835v1 |
http://arxiv.org/pdf/1809.04835v1.pdf | |
PWC | https://paperswithcode.com/paper/image-captioning-based-on-deep-reinforcement |
Repo | |
Framework | |
Towards End-to-End Code-Switching Speech Recognition
Title | Towards End-to-End Code-Switching Speech Recognition |
Authors | Ne Luo, Dongwei Jiang, Shuaijiang Zhao, Caixia Gong, Wei Zou, Xiangang Li |
Abstract | Code-switching speech recognition has attracted increasing interest recently, but the need for expert linguistic knowledge has always been a major obstacle. End-to-end automatic speech recognition (ASR) simplifies the building of ASR systems considerably by predicting graphemes or characters directly from acoustic input. It also eliminates the need for expert linguistic knowledge, which makes it an attractive choice for code-switching ASR. This paper presents a hybrid CTC-Attention based end-to-end Mandarin-English code-switching (CS) speech recognition system and studies the effect of hybrid CTC-Attention based models, different modeling units, the inclusion of language identification and different decoding strategies on the task of code-switching ASR. On the SEAME corpus, our system achieves a mixed error rate (MER) of 34.24%. |
Tasks | Language Identification, Speech Recognition |
Published | 2018-10-31 |
URL | http://arxiv.org/abs/1810.13091v2 |
http://arxiv.org/pdf/1810.13091v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-end-to-end-code-switching-speech |
Repo | |
Framework | |
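The abstract above reports a mixed error rate (MER). A common way to compute such a metric for Mandarin-English speech is to score Mandarin at the character level and English at the word level, then take the edit distance over the combined token sequence. The sketch below follows that convention as an assumption; the abstract does not spell out the paper's exact tokenization, and the function names and the CJK range check are illustrative.

```python
def mixed_tokens(text):
    """Split text into Mandarin characters and lower-cased English words."""
    tokens, word = [], []
    for ch in text:
        if "\u4e00" <= ch <= "\u9fff":  # CJK Unified Ideographs block
            if word:
                tokens.append("".join(word))
                word = []
            tokens.append(ch)
        elif ch.isspace():
            if word:
                tokens.append("".join(word))
                word = []
        else:
            word.append(ch.lower())
    if word:
        tokens.append("".join(word))
    return tokens


def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def mixed_error_rate(reference, hypothesis):
    ref, hyp = mixed_tokens(reference), mixed_tokens(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one token")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    print(mixed_error_rate("我想 go home", "我要 go home"))
```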
A Progressively-trained Scale-invariant and Boundary-aware Deep Neural Network for the Automatic 3D Segmentation of Lung Lesions
Title | A Progressively-trained Scale-invariant and Boundary-aware Deep Neural Network for the Automatic 3D Segmentation of Lung Lesions |
Authors | Bo Zhou, Randolph Crawford, Belma Dogdas, Gregory Goldmacher, Antong Chen |
Abstract | Volumetric segmentation of lesions on CT scans is important for many types of analysis, including lesion growth kinetic modeling in clinical trials and machine learning of radiomic features. Manual segmentation is laborious, and impractical for large-scale use. For routine clinical use, and in clinical trials that apply the Response Evaluation Criteria In Solid Tumors (RECIST), clinicians typically outline the boundaries of a lesion on a single slice to extract diameter measurements. In this work, we have collected a large-scale database, named LesionVis, with pixel-wise manual 2D lesion delineations on the RECIST-slices. To extend the 2D segmentations to 3D, we propose a volumetric progressive lesion segmentation (PLS) algorithm to automatically segment the 3D lesion volume from 2D delineations using a scale-invariant and boundary-aware deep convolutional network (SiBA-Net). The SiBA-Net copes with the size transition of a lesion when the PLS progresses from the RECIST-slice to the edge-slices, as well as when performing longitudinal assessment of lesions whose size changes over multiple time points. The proposed PLS-SiBA-Net (P-SiBA) approach is assessed on the lung lesion cases from LesionVis. Our experimental results demonstrate that the P-SiBA approach achieves a mean Dice similarity coefficient (DSC) of 0.81, which significantly improves 3D segmentation accuracy compared with previously proposed approaches (highest mean DSC of 0.78 on LesionVis). In summary, by leveraging the limited 2D delineations on the RECIST-slices, P-SiBA is an effective semi-supervised approach to produce accurate lesion segmentations in 3D. |
Tasks | Lesion Segmentation |
Published | 2018-11-11 |
URL | http://arxiv.org/abs/1811.04437v1 |
http://arxiv.org/pdf/1811.04437v1.pdf | |
PWC | https://paperswithcode.com/paper/a-progressively-trained-scale-invariant-and |
Repo | |
Framework | |
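The results in the abstract above are reported as Dice similarity coefficients (DSC). For two binary masks A and B, the standard definition is DSC = 2|A ∩ B| / (|A| + |B|). The sketch below computes it for flattened binary masks. The function name and the convention of returning 1.0 when both masks are empty are assumptions for this example, not details from the paper.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled as lesion in both masks.
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total lesion voxels in each mask.
    size = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if size == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * overlap / size


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 1, 0]
    print(dice_coefficient(predicted, reference))
```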
Mining on Manifolds: Metric Learning without Labels
Title | Mining on Manifolds: Metric Learning without Labels |
Authors | Ahmet Iscen, Giorgos Tolias, Yannis Avrithis, Ondrej Chum |
Abstract | In this work we present a novel unsupervised framework for hard training example mining. The only input to the method is a collection of images relevant to the target application and a meaningful initial representation, provided, e.g., by a pre-trained CNN. Positive examples are distant points on a single manifold, while negative examples are nearby points on different manifolds. Both types of examples are revealed by disagreements between Euclidean and manifold similarities. The discovered examples can be used in training with any discriminative loss. The method is applied to unsupervised fine-tuning of pre-trained networks for fine-grained classification and particular object retrieval. Our models are on par with or outperform prior models that are fully or partially supervised. |
Tasks | Metric Learning |
Published | 2018-03-29 |
URL | http://arxiv.org/abs/1803.11095v1 |
http://arxiv.org/pdf/1803.11095v1.pdf | |
PWC | https://paperswithcode.com/paper/mining-on-manifolds-metric-learning-without |
Repo | |
Framework | |
Towards Robot-Centric Conceptual Knowledge Acquisition
Title | Towards Robot-Centric Conceptual Knowledge Acquisition |
Authors | Georg Jäger, Christian A. Mueller, Madhura Thosar, Sebastian Zug, Andreas Birk |
Abstract | Robots require knowledge about objects in order to efficiently perform various household tasks involving objects. The existing knowledge bases for robots acquire symbolic knowledge about objects from manually-coded external common sense knowledge bases such as ConceptNet, WordNet, etc. The problem with such approaches is the discrepancy between human-centric symbolic knowledge and robot-centric object perception, which stems from the robot’s limited perception capabilities. Ultimately, a significant portion of the knowledge in the knowledge base remains ungrounded in the robot’s perception. To overcome this discrepancy, we propose an approach that enables robots to generate robot-centric symbolic knowledge about objects from their own sensory data, thus allowing them to assemble their own conceptual understanding of objects. With this goal in mind, the presented paper elaborates on the work-in-progress of the proposed approach, followed by the preliminary results. |
Tasks | Common Sense Reasoning |
Published | 2018-10-08 |
URL | http://arxiv.org/abs/1810.03583v1 |
http://arxiv.org/pdf/1810.03583v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-robot-centric-conceptual-knowledge |
Repo | |
Framework | |
Super-resolution Ultrasound Localization Microscopy through Deep Learning
Title | Super-resolution Ultrasound Localization Microscopy through Deep Learning |
Authors | Ruud J. G. van Sloun, Oren Solomon, Matthew Bruce, Zin Z. Khaing, Hessel Wijkstra, Yonina C. Eldar, Massimo Mischi |
Abstract | Ultrasound localization microscopy has enabled super-resolution vascular imaging through precise localization of individual ultrasound contrast agents (microbubbles) across numerous imaging frames. However, analysis of high-density regions with significant overlaps among the microbubble point spread responses yields high localization errors, constraining the technique to low-concentration conditions. As such, long acquisition times are required to sufficiently cover the vascular bed. In this work, we present a fast and precise method for obtaining super-resolution vascular images from high-density contrast-enhanced ultrasound imaging data. This method, which we term Deep Ultrasound Localization Microscopy (Deep-ULM), exploits modern deep learning strategies and employs a convolutional neural network to perform localization microscopy in dense scenarios. This end-to-end fully convolutional neural network architecture is trained effectively using on-line synthesized data, enabling robust inference in-vivo under a wide variety of imaging conditions. We show that deep learning attains super-resolution with challenging contrast-agent densities, both in-silico as well as in-vivo. Deep-ULM is suitable for real-time applications, resolving about 70 high-resolution patches (128x128 pixels) per second on a standard PC. Exploiting GPU computation, this number increases to 1250 patches per second. |
Tasks | Super-Resolution |
Published | 2018-04-20 |
URL | http://arxiv.org/abs/1804.07661v2 |
http://arxiv.org/pdf/1804.07661v2.pdf | |
PWC | https://paperswithcode.com/paper/super-resolution-ultrasound-localization |
Repo | |
Framework | |
Product Title Refinement via Multi-Modal Generative Adversarial Learning
Title | Product Title Refinement via Multi-Modal Generative Adversarial Learning |
Authors | Jianguo Zhang, Pengcheng Zou, Zhao Li, Yao Wan, Ye Liu, Xiuming Pan, Yu Gong, Philip S. Yu |
Abstract | Nowadays, an increasing number of customers favor using e-commerce apps to browse and purchase products. Since merchants are usually inclined to employ redundant and over-informative product titles to attract customers’ attention, it is of great importance to concisely display short product titles on the limited screens of cell phones. Previous work mainly considers the textual information of long product titles and lacks a human-like view during training and evaluation. In this paper, we propose a Multi-Modal Generative Adversarial Network (MM-GAN) for short product title generation, which innovatively incorporates image information, attribute tags from the product and the textual information from the original long titles. MM-GAN treats short title generation as a reinforcement learning process, where the generated titles are evaluated by the discriminator in a human-like view. |
Tasks | |
Published | 2018-11-11 |
URL | http://arxiv.org/abs/1811.04498v1 |
http://arxiv.org/pdf/1811.04498v1.pdf | |
PWC | https://paperswithcode.com/paper/product-title-refinement-via-multi-modal |
Repo | |
Framework | |
Defending Against Adversarial Attacks by Leveraging an Entire GAN
Title | Defending Against Adversarial Attacks by Leveraging an Entire GAN |
Authors | Gokula Krishnan Santhanam, Paulina Grnarova |
Abstract | Recent work has shown that state-of-the-art models are highly vulnerable to adversarial perturbations of the input. We propose cowboy, an approach to detecting and defending against adversarial attacks by using both the discriminator and generator of a GAN trained on the same dataset. We show that the discriminator consistently scores the adversarial samples lower than the real samples across multiple attacks and datasets. We provide empirical evidence that adversarial samples lie outside of the data manifold learned by the GAN. Based on this, we propose a cleaning method which uses both the discriminator and generator of the GAN to project the samples back onto the data manifold. This cleaning procedure is independent of the classifier and type of attack and thus can be deployed in existing systems. |
Tasks | |
Published | 2018-05-27 |
URL | http://arxiv.org/abs/1805.10652v1 |
http://arxiv.org/pdf/1805.10652v1.pdf | |
PWC | https://paperswithcode.com/paper/defending-against-adversarial-attacks-by |
Repo | |
Framework | |
Conditional Image-to-Image Translation
Title | Conditional Image-to-Image Translation |
Authors | Jianxin Lin, Yingce Xia, Tao Qin, Zhibo Chen, Tie-Yan Liu |
Abstract | Image-to-image translation tasks have been widely investigated with Generative Adversarial Networks (GANs) and dual learning. However, existing models lack the ability to control the translated results in the target domain, and their results usually lack diversity, in the sense that a fixed image usually leads to an (almost) deterministic translation result. In this paper, we study a new problem, conditional image-to-image translation, which is to translate an image from the source domain to the target domain conditioned on a given image in the target domain. It requires that the generated image inherit some domain-specific features of the conditional image from the target domain. Changing the conditional image in the target domain therefore leads to diverse translation results for a fixed input image from the source domain, so the conditional input image helps to control the translation results. We tackle this problem with unpaired data based on GANs and dual learning. We intertwine two conditional translation models (one translating from domain A to domain B, the other from domain B to domain A) for input combination and reconstruction while preserving domain-independent features. We carry out experiments on translation between men’s and women’s faces and on translation from edges to shoes and bags. The results demonstrate the effectiveness of our proposed method. |
Tasks | Image-to-Image Translation |
Published | 2018-05-01 |
URL | http://arxiv.org/abs/1805.00251v1 |
http://arxiv.org/pdf/1805.00251v1.pdf | |
PWC | https://paperswithcode.com/paper/conditional-image-to-image-translation |
Repo | |
Framework | |
Semantic Adversarial Deep Learning
Title | Semantic Adversarial Deep Learning |
Authors | Tommaso Dreossi, Somesh Jha, Sanjit A. Seshia |
Abstract | Fueled by massive amounts of data, models produced by machine-learning (ML) algorithms, especially deep neural networks, are being used in diverse domains where trustworthiness is a concern, including automotive systems, finance, health care, natural language processing, and malware detection. Of particular concern is the use of ML algorithms in cyber-physical systems (CPS), such as self-driving cars and aviation, where an adversary can cause serious consequences. However, existing approaches to generating adversarial examples and devising robust ML algorithms mostly ignore the semantics and context of the overall system containing the ML component. For example, in an autonomous vehicle using deep learning for perception, not every adversarial example for the neural network might lead to a harmful consequence. Moreover, one may want to prioritize the search for adversarial examples towards those that significantly modify the desired semantics of the overall system. Along the same lines, existing algorithms for constructing robust ML algorithms ignore the specification of the overall system. In this paper, we argue that the semantics and specification of the overall system have a crucial role to play in this line of research. We present preliminary research results that support this claim. |
Tasks | Malware Detection, Self-Driving Cars |
Published | 2018-04-19 |
URL | http://arxiv.org/abs/1804.07045v2 |
http://arxiv.org/pdf/1804.07045v2.pdf | |
PWC | https://paperswithcode.com/paper/semantic-adversarial-deep-learning |
Repo | |
Framework | |
WebEye - Automated Collection of Malicious HTTP Traffic
Title | WebEye - Automated Collection of Malicious HTTP Traffic |
Authors | Johann Vierthaler, Roman Kruszelnicki, Julian Schütte |
Abstract | With malware detection techniques increasingly adopting machine learning approaches, the creation of precise training sets becomes more and more important. Large data sets of realistic web traffic, correctly classified as benign or malicious, are needed not only to train classic and deep learning algorithms, but also to serve as evaluation benchmarks for existing malware detection products. Interestingly, despite the vast number and versatility of threats a user may encounter when browsing the web, actual malicious content is often hard to come by, since prerequisites such as browser and operating system type and version must be met in order to receive the payload from a malware-distributing server. In combination with privacy constraints on data sets of actual user traffic, this makes it difficult for researchers and product developers to evaluate anti-malware solutions against large-scale data sets of realistic web traffic. In this paper we present WebEye, a framework that autonomously creates realistic HTTP traffic, enriches recorded traffic with additional information, and classifies records as malicious or benign using different classifiers. We use WebEye to collect malicious HTML and JavaScript and show how datasets created with WebEye can be used to train machine learning based malware detection algorithms. We regard WebEye and the data sets it creates as a tool for researchers and product developers to evaluate and improve their AI-based anti-malware solutions against large-scale benchmarks. |
Tasks | Malware Detection |
Published | 2018-02-16 |
URL | http://arxiv.org/abs/1802.06012v1 |
http://arxiv.org/pdf/1802.06012v1.pdf | |
PWC | https://paperswithcode.com/paper/webeye-automated-collection-of-malicious-http |
Repo | |
Framework | |
NtMalDetect: A Machine Learning Approach to Malware Detection Using Native API System Calls
Title | NtMalDetect: A Machine Learning Approach to Malware Detection Using Native API System Calls |
Authors | Chan Woo Kim |
Abstract | As computing systems become increasingly advanced and as users increasingly engage with technology, security has never been a greater concern. In malware detection, static analysis, the method of analyzing potentially malicious files, has been the prominent approach. This approach, however, quickly falls short as malicious programs become more advanced and gain the ability to obfuscate their binaries while executing the same malicious functions, making static analysis extremely difficult for newer variants. The approach assessed in this paper is a novel dynamic malware analysis method, which may generalize better than static analysis to newer variants. Inspired by recent successes in Natural Language Processing (NLP), widely used document classification techniques were assessed for detecting malware by analyzing system calls, which contain useful information about the operation of a program because they are the requests that the program makes of the kernel. Features are extracted from system call traces of benign and malicious programs, and classifying these traces is treated as a binary document classification task. The system call traces were processed to remove the parameters, leaving only the system call function names. The features were grouped into various n-grams and weighted with Term Frequency-Inverse Document Frequency. This paper shows that Linear Support Vector Machines (SVM) optimized by Stochastic Gradient Descent and the traditional Coordinate Descent on the Wolfe Dual form of the SVM are effective in this approach, achieving up to 96% accuracy with a 95% recall score. Additional contributions include the identification of significant system call sequences that could be avenues for further research. |
Tasks | Document Classification, Malware Detection |
Published | 2018-02-15 |
URL | http://arxiv.org/abs/1802.05412v2 |
http://arxiv.org/pdf/1802.05412v2.pdf | |
PWC | https://paperswithcode.com/paper/ntmaldetect-a-machine-learning-approach-to |
Repo | |
Framework | |
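The feature pipeline described in the abstract above (system call names grouped into n-grams and weighted with TF-IDF) can be sketched as follows. This is a standard-library illustration, not the paper's implementation: the function names, the specific TF-IDF variant (relative term frequency times log(N/df)), and the two short example traces are assumptions, and the SVM training step is omitted.

```python
import math
from collections import Counter


def ngrams(calls, n):
    """Join each run of n consecutive system call names into one feature."""
    return [" ".join(calls[i:i + n]) for i in range(len(calls) - n + 1)]


def tfidf(traces, n=2):
    """Return one {n-gram: weight} dict per trace (assumed TF-IDF variant)."""
    docs = [Counter(ngrams(trace, n)) for trace in traces]
    # Document frequency: number of traces containing each n-gram.
    df = Counter()
    for doc in docs:
        df.update(doc.keys())
    total_docs = len(docs)
    weighted = []
    for doc in docs:
        total = sum(doc.values())
        weighted.append({
            gram: (count / total) * math.log(total_docs / df[gram])
            for gram, count in doc.items()
        })
    return weighted


if __name__ == "__main__":
    # Illustrative traces with parameters already stripped.
    traces = [
        ["NtOpenFile", "NtReadFile", "NtClose"],
        ["NtOpenFile", "NtWriteFile", "NtClose"],
    ]
    for weights in tfidf(traces, n=2):
        print(weights)
```

N-grams that occur in every trace receive zero weight under this variant, so only sequences that distinguish traces contribute to the feature vector passed to a classifier.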