Paper Group ANR 500
Negative Log Likelihood Ratio Loss for Deep Neural Network Classification
Title | Negative Log Likelihood Ratio Loss for Deep Neural Network Classification |
Authors | Donglai Zhu, Hengshuai Yao, Bei Jiang, Peng Yu |
Abstract | In deep neural networks, the cross-entropy loss function is commonly used for classification. Minimizing cross-entropy is equivalent to maximizing likelihood under assumptions of uniform feature and class distributions. It belongs to the generative training criteria, which do not directly discriminate the correct class from competing classes. We propose a discriminative loss function with negative log likelihood ratio between correct and competing classes. It significantly outperforms the cross-entropy loss on the CIFAR-10 image classification task. |
Tasks | Image Classification |
Published | 2018-04-27 |
URL | http://arxiv.org/abs/1804.10690v1 |
http://arxiv.org/pdf/1804.10690v1.pdf | |
PWC | https://paperswithcode.com/paper/negative-log-likelihood-ratio-loss-for-deep |
Repo | |
Framework | |
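The contrast the abstract draws between cross-entropy and a likelihood-ratio loss can be sketched in a few lines. This is only one plausible reading of the abstract, assuming the ratio is taken between the softmax probability of the correct class and the summed probability of all competing classes; the paper's exact formulation may differ. The function names `cross_entropy` and `nllr_loss` are hypothetical.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def cross_entropy(logits, target):
    """Standard cross-entropy: -log p_target under a softmax over logits."""
    return log_sum_exp(logits) - logits[target]


def nllr_loss(logits, target):
    """Assumed NLLR form: -log(p_target / sum of competing-class probabilities).

    The softmax normaliser cancels in the ratio, so the loss reduces to
    log-sum-exp over the competing logits minus the target logit.
    """
    competing = [z for k, z in enumerate(logits) if k != target]
    return log_sum_exp(competing) - logits[target]
```

Under this reading, the loss is zero when the correct class holds exactly half of the probability mass and becomes negative as the correct class dominates, whereas cross-entropy is bounded below by zero.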
Deep Learning for Genomics: A Concise Overview
Title | Deep Learning for Genomics: A Concise Overview |
Authors | Tianwei Yue, Haohan Wang |
Abstract | Advancements in genomic research such as high-throughput sequencing techniques have driven modern genomic studies into “big data” disciplines. This data explosion is constantly challenging conventional methods used in genomics. In parallel with the urgent demand for robust algorithms, deep learning has succeeded in a variety of fields such as vision, speech, and text processing. Yet genomics entails unique challenges to deep learning since we are expecting from deep learning a superhuman intelligence that explores beyond our knowledge to interpret the genome. A powerful deep learning model should rely on insightful utilization of task-specific knowledge. In this paper, we briefly discuss the strengths of different deep learning models from a genomic perspective so as to fit each particular task with a proper deep architecture, and remark on practical considerations of developing modern deep learning architectures for genomics. We also provide a concise review of deep learning applications in various aspects of genomic research, as well as pointing out potential opportunities and obstacles for future genomics applications. |
Tasks | |
Published | 2018-02-02 |
URL | http://arxiv.org/abs/1802.00810v2 |
http://arxiv.org/pdf/1802.00810v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-for-genomics-a-concise-overview |
Repo | |
Framework | |
Relevant Attributes in Formal Contexts
Title | Relevant Attributes in Formal Contexts |
Authors | Tom Hanika, Maren Koyda, Gerd Stumme |
Abstract | Computing conceptual structures, like formal concept lattices, is a challenging task in the age of massive data sets. There are various approaches to deal with this, e.g., random sampling, parallelization, or attribute extraction. A method not yet investigated in the realm of formal concept analysis is attribute selection, as done in machine learning. Building on this, we introduce a method for attribute selection in formal contexts. To this end, we propose the notion of relevant attributes, which enables us to define a relative relevance function reflecting both the order structure of the concept lattice and the distribution of objects on it. Finally, we overcome computational challenges for computing the relative relevance through an approximation approach based on information entropy. |
Tasks | |
Published | 2018-12-20 |
URL | http://arxiv.org/abs/1812.08868v1 |
http://arxiv.org/pdf/1812.08868v1.pdf | |
PWC | https://paperswithcode.com/paper/relevant-attributes-in-formal-contexts |
Repo | |
Framework | |
Is Data Clustering in Adversarial Settings Secure?
Title | Is Data Clustering in Adversarial Settings Secure? |
Authors | Battista Biggio, Ignazio Pillai, Samuel Rota Bulò, Davide Ariu, Marcello Pelillo, Fabio Roli |
Abstract | Clustering algorithms have been increasingly adopted in security applications to spot dangerous or illicit activities. However, they were not originally devised to deal with deliberate attack attempts that may aim to subvert the clustering process itself. Whether clustering can be safely adopted in such settings thus remains questionable. In this work we propose a general framework that allows one to identify potential attacks against clustering algorithms, and to evaluate their impact, by making specific assumptions on the adversary’s goal, knowledge of the attacked system, and capabilities of manipulating the input data. We show that an attacker may significantly poison the whole clustering process by adding a relatively small percentage of attack samples to the input data, and that some attack samples may be obfuscated to be hidden within some existing clusters. We present a case study on single-linkage hierarchical clustering, and report experiments on clustering of malware samples and handwritten digits. |
Tasks | |
Published | 2018-11-25 |
URL | http://arxiv.org/abs/1811.09982v1 |
http://arxiv.org/pdf/1811.09982v1.pdf | |
PWC | https://paperswithcode.com/paper/is-data-clustering-in-adversarial-settings |
Repo | |
Framework | |
Unsupervised Evaluation Metrics and Learning Criteria for Non-Parallel Textual Transfer
Title | Unsupervised Evaluation Metrics and Learning Criteria for Non-Parallel Textual Transfer |
Authors | Richard Yuanzhe Pang, Kevin Gimpel |
Abstract | We consider the problem of automatically generating textual paraphrases with modified attributes or properties, focusing on the setting without parallel data (Hu et al., 2017; Shen et al., 2017). This setting poses challenges for evaluation. We show that the metric of post-transfer classification accuracy is insufficient on its own, and propose additional metrics based on semantic preservation and fluency as well as a way to combine them into a single overall score. We contribute new loss functions and training strategies to address the different metrics. Semantic preservation is addressed by adding a cyclic consistency loss and a loss based on paraphrase pairs, while fluency is improved by integrating losses based on style-specific language models. We experiment with a Yelp sentiment dataset and a new literature dataset that we propose, using multiple models that extend prior work (Shen et al., 2017). We demonstrate that our metrics correlate well with human judgments, at both the sentence-level and system-level. Automatic and manual evaluation also show large improvements over the baseline method of Shen et al. (2017). We hope that our proposed metrics can speed up system development for new textual transfer tasks while also encouraging the community to address our three complementary aspects of transfer quality. |
Tasks | |
Published | 2018-10-28 |
URL | https://arxiv.org/abs/1810.11878v2 |
https://arxiv.org/pdf/1810.11878v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-criteria-and-evaluation-metrics-for |
Repo | |
Framework | |
A Foreground Inference Network for Video Surveillance Using Multi-View Receptive Field
Title | A Foreground Inference Network for Video Surveillance Using Multi-View Receptive Field |
Authors | Thangarajah Akilan |
Abstract | Foreground (FG) pixel labelling plays a vital role in video surveillance. Recent engineering solutions have attempted to exploit the efficacy of deep learning (DL) models initially targeted for image classification to deal with FG pixel labelling. One major drawback of such a strategy is the poor delineation of visual objects when training samples are limited. To grapple with this issue, we introduce a multi-view receptive field fully convolutional neural network (MV-FCN) that harnesses recent seminal ideas, such as fully convolutional structure, inception modules, and residual networking. From these, we implement a system in an encoder-decoder fashion that subsumes a core and two complementary feature flow paths. The model exploits inception modules at early and late stages with three different sizes of receptive fields to capture invariance at various scales. The features learned in the encoding phase are fused with appropriate feature maps in the decoding phase through residual connections to achieve enhanced spatial representation. These multi-view receptive fields and residual feature connections are expected to yield highly generalized features for accurate pixel-wise FG region identification. The model is then trained with database-specific exemplary segmentations to predict desired FG objects. The comparative experimental results on eleven benchmark datasets validate that the proposed model achieves very competitive performance with the prior and state-of-the-art algorithms. We also report how well a transfer learning approach can enhance the performance of our proposed MV-FCN. |
Tasks | Image Classification, Transfer Learning |
Published | 2018-01-19 |
URL | http://arxiv.org/abs/1801.06593v1 |
http://arxiv.org/pdf/1801.06593v1.pdf | |
PWC | https://paperswithcode.com/paper/a-foreground-inference-network-for-video |
Repo | |
Framework | |
Endmember Extraction on the Grassmannian
Title | Endmember Extraction on the Grassmannian |
Authors | Elin Farnell, Henry Kvinge, Michael Kirby, Chris Peterson |
Abstract | Endmember extraction plays a prominent role in a variety of data analysis problems as endmembers often correspond to data representing the purest or best representative of some feature. Identifying endmembers then can be useful for further identification and classification tasks. In settings with high-dimensional data, such as hyperspectral imagery, it can be useful to consider endmembers that are subspaces as they are capable of capturing a wider range of variations of a signature. The endmember extraction problem in this setting thus translates to finding the vertices of the convex hull of a set of points on a Grassmannian. In the presence of noise, it can be less clear whether a point should be considered a vertex. In this paper, we propose an algorithm to extract endmembers on a Grassmannian, identify subspaces of interest that lie near the boundary of a convex hull, and demonstrate the use of the algorithm on a synthetic example and on the 220 spectral band AVIRIS Indian Pines hyperspectral image. |
Tasks | |
Published | 2018-07-03 |
URL | http://arxiv.org/abs/1807.01401v1 |
http://arxiv.org/pdf/1807.01401v1.pdf | |
PWC | https://paperswithcode.com/paper/endmember-extraction-on-the-grassmannian |
Repo | |
Framework | |
Using Motion and Internal Supervision in Object Recognition
Title | Using Motion and Internal Supervision in Object Recognition |
Authors | Daniel Harari |
Abstract | In this thesis we address two related aspects of visual object recognition: the use of motion information, and the use of internal supervision, to help unsupervised learning. These two aspects are inter-related in the current study, since image motion is used for internal supervision, via the detection of spatiotemporal events of active-motion and the use of tracking. Most current work in object recognition deals with static images during both learning and recognition. In contrast, we are interested in a dynamic scene where visual processes, such as detecting motion events and tracking, contribute spatiotemporal information, which is useful for object attention, motion segmentation, 3-D understanding and object interactions. We explore the use of these sources of information in both learning and recognition processes. In the first part of the work, we demonstrate how motion can be used for adaptive detection of object-parts in dynamic environments, while automatically learning new object appearances and poses. In the second and main part of the study we develop methods for using specific types of visual motion to solve two difficult problems in unsupervised visual learning: learning to recognize hands by their appearance and by context, and learning to extract direction of gaze. We use our conclusions in this part to propose a model for several aspects of learning by human infants from their visual environment. |
Tasks | Motion Segmentation, Object Recognition |
Published | 2018-12-13 |
URL | http://arxiv.org/abs/1812.05455v1 |
http://arxiv.org/pdf/1812.05455v1.pdf | |
PWC | https://paperswithcode.com/paper/using-motion-and-internal-supervision-in |
Repo | |
Framework | |
Reinforcement-learning-based architecture for automated quantum adiabatic algorithm design
Title | Reinforcement-learning-based architecture for automated quantum adiabatic algorithm design |
Authors | Jian Lin, Zhong Yuan Lai, Xiaopeng Li |
Abstract | Quantum algorithm design lies in the hallmark of applications of quantum computation and quantum simulation. Here we put forward a deep reinforcement learning (RL) architecture for automated algorithm design in the framework of the quantum adiabatic algorithm, where the optimal Hamiltonian path to reach a quantum ground state that encodes a computation problem is obtained by RL techniques. We benchmark our approach in Grover search and 3-SAT problems, and find that the adiabatic algorithm obtained by our RL approach leads to significant improvement in the success probability and computing speedups for both moderate and large numbers of qubits compared to conventional algorithms. The RL-designed algorithm is found to be qualitatively distinct from the linear algorithm in the resultant distribution of success probability. Considering the established complexity-equivalence of circuit and adiabatic quantum algorithms, we expect the RL-designed adiabatic algorithm to inspire novel circuit algorithms as well. Our approach offers a recipe to design quantum algorithms for generic problems through a machinery RL process, which paves a novel way to automated quantum algorithm design using artificial intelligence, potentially applicable to different quantum simulation and computation platforms from trapped ions and optical lattices to superconducting-qubit devices. |
Tasks | |
Published | 2018-12-27 |
URL | https://arxiv.org/abs/1812.10797v2 |
https://arxiv.org/pdf/1812.10797v2.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-architecture-for |
Repo | |
Framework | |
GeneSys: Enabling Continuous Learning through Neural Network Evolution in Hardware
Title | GeneSys: Enabling Continuous Learning through Neural Network Evolution in Hardware |
Authors | Ananda Samajdar, Parth Mannan, Kartikay Garg, Tushar Krishna |
Abstract | Modern deep learning systems rely on (a) a hand-tuned neural network topology, (b) massive amounts of labeled training data, and (c) extensive training over large-scale compute resources to build a system that can perform efficient image classification or speech recognition. Unfortunately, we are still far away from implementing adaptive general purpose intelligent systems which would need to learn autonomously in unknown environments and may not have access to some or any of these three components. Reinforcement learning and evolutionary algorithm (EA) based methods circumvent this problem by continuously interacting with the environment and updating the models based on obtained rewards. However, deploying these algorithms on ubiquitous autonomous agents at the edge (robots/drones) demands extremely high energy-efficiency due to (i) tight power and energy budgets, (ii) continuous/lifelong interaction with the environment, and (iii) intermittent or no connectivity to the cloud to run heavy-weight processing. To address this need, we present GENESYS, an HW-SW prototype of an EA-based learning system that comprises a closed loop learning engine called EvE and an inference engine called ADAM. EvE can evolve the topology and weights of neural networks completely in hardware for the task at hand, without requiring hand-optimization or backpropagation training. ADAM continuously interacts with the environment and is optimized for efficiently running the irregular neural networks generated by EvE. GENESYS identifies and leverages multiple avenues of parallelism unique to EAs that we term ‘gene’-level parallelism and ‘population’-level parallelism. We ran GENESYS with a suite of environments from OpenAI gym and observed 2-5 orders of magnitude higher energy-efficiency over state-of-the-art embedded and desktop CPU and GPU systems. |
Tasks | Image Classification, Speech Recognition |
Published | 2018-08-03 |
URL | http://arxiv.org/abs/1808.01363v2 |
http://arxiv.org/pdf/1808.01363v2.pdf | |
PWC | https://paperswithcode.com/paper/genesys-enabling-continuous-learning-through |
Repo | |
Framework | |
Robustness of sentence length measures in written texts
Title | Robustness of sentence length measures in written texts |
Authors | Denner S. Vieira, Sergio Picoli, Renio S. Mendes |
Abstract | Hidden structural patterns in written texts have been the subject of considerable research in recent decades. In particular, mapping a text into a time series of sentence lengths is a natural way to investigate text structure. Typically, sentence length has been quantified by using measures based on the number of words and the number of characters, but other variations are possible. To quantify the robustness of different sentence length measures, we analyzed a database containing about five hundred books in English. For each book, we extracted six distinct measures of sentence length, including number of words and number of characters (taking into account lemmatization and stop-word removal). We compared these six measures for each book by using i) Pearson’s coefficient to investigate linear correlations; ii) the Kolmogorov–Smirnov test to compare distributions; and iii) detrended fluctuation analysis (DFA) to quantify auto-correlations. We have found that all six measures exhibit very similar behavior, suggesting that sentence length is a robust measure related to text structure. |
Tasks | Lemmatization, Time Series |
Published | 2018-05-02 |
URL | http://arxiv.org/abs/1805.01460v1 |
http://arxiv.org/pdf/1805.01460v1.pdf | |
PWC | https://paperswithcode.com/paper/robustness-of-sentence-length-measures-in |
Repo | |
Framework | |
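The comparison described in this abstract, mapping a text to sentence-length series and correlating two length measures, can be illustrated with a small sketch. It assumes a naive punctuation-based sentence splitter and covers only two of the six measures (word count and non-space character count); the lemmatization and stop-word variants, the Kolmogorov–Smirnov test, and DFA are not shown. The helper names `sentence_lengths` and `pearson` are hypothetical.

```python
import math
import re


def sentence_lengths(text):
    """Return (word_counts, char_counts), one entry per sentence.

    Sentences are split on runs of '.', '!' or '?', which is a crude
    assumption; characters are counted without spaces.
    """
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    words = [len(s.split()) for s in sentences]
    chars = [len(s.replace(" ", "")) for s in sentences]
    return words, chars


def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


text = "I ran. The dog barked loudly. We went home early today."
words, chars = sentence_lengths(text)
r = pearson(words, chars)
```

A coefficient close to 1 for a given book would indicate that the two measures carry nearly the same information, which is the kind of agreement the abstract reports across all six measures.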
A Dense CNN approach for skin lesion classification
Title | A Dense CNN approach for skin lesion classification |
Authors | Pierluigi Carcagnì, Andrea Cuna, Cosimo Distante |
Abstract | This article presents a deep CNN, based on the DenseNet architecture together with a highly discriminating learning methodology, to classify seven kinds of skin lesions: Melanoma, Melanocytic nevus, Basal cell carcinoma, Actinic keratosis / Bowen’s disease, Benign keratosis, Dermatofibroma, Vascular lesion. In particular, a 61-layer DenseNet, pre-trained on the ImageNet dataset, has been fine-tuned on the ISIC 2018 Task 3 Challenge dataset, exploiting a center loss function. |
Tasks | Skin Lesion Classification |
Published | 2018-07-17 |
URL | http://arxiv.org/abs/1807.06416v2 |
http://arxiv.org/pdf/1807.06416v2.pdf | |
PWC | https://paperswithcode.com/paper/a-dense-cnn-approach-for-skin-lesion |
Repo | |
Framework | |
Semantic Segmentation with Scarce Data
Title | Semantic Segmentation with Scarce Data |
Authors | Isay Katsman, Rohun Tripathi, Andreas Veit, Serge Belongie |
Abstract | Semantic segmentation is a challenging vision problem that usually necessitates the collection of large amounts of finely annotated data, which is often quite expensive to obtain. Coarsely annotated data provides an interesting alternative as it is usually substantially cheaper. In this work, we present a method to leverage coarsely annotated data along with fine supervision to produce better segmentation results than would be obtained when training using only the fine data. We validate our approach by simulating a scarce data setting with fewer than 200 low-resolution images from the Cityscapes dataset and show that our method substantially outperforms solely training on the fine annotation data by an average of 15.52% mIoU and outperforms the coarse mask by an average of 5.28% mIoU. |
Tasks | Semantic Segmentation |
Published | 2018-07-02 |
URL | http://arxiv.org/abs/1807.00911v2 |
http://arxiv.org/pdf/1807.00911v2.pdf | |
PWC | https://paperswithcode.com/paper/semantic-segmentation-with-scarce-data |
Repo | |
Framework | |
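The results in this abstract are reported in mIoU (mean intersection over union). As a reminder of how that metric is commonly computed, here is a minimal sketch over flat lists of per-pixel class labels. It assumes classes absent from both prediction and ground truth are skipped when averaging; the paper's evaluation protocol on Cityscapes may differ in such details. The function name `mean_iou` is hypothetical.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes.

    pred and target are equal-length sequences of integer class labels,
    one per pixel. Classes that appear in neither sequence are ignored.
    """
    ious = []
    for c in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Four pixels, two classes: class 0 has IoU 1/2, class 1 has IoU 2/3.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```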
Explainable Genetic Inheritance Pattern Prediction
Title | Explainable Genetic Inheritance Pattern Prediction |
Authors | Edmond Cunningham, Dana Schlegel, Andrew DeOrio |
Abstract | Diagnosing an inherited disease often requires identifying the pattern of inheritance in a patient’s family. We represent family trees with genetic patterns of inheritance using hypergraphs and latent state space models to provide explainable inheritance pattern predictions. Our approach allows for exact causal inference over a patient’s possible genotypes given their relatives’ phenotypes. By design, inference can be examined at a low level to provide explainable predictions. Furthermore, we make use of human intuition by providing a method to assign hypothetical evidence to any inherited gene alleles. Our analysis supports the application of latent state space models to improve patient care in cases of rare inherited diseases where access to genetic specialists is limited. |
Tasks | Causal Inference |
Published | 2018-12-01 |
URL | http://arxiv.org/abs/1812.00259v3 |
http://arxiv.org/pdf/1812.00259v3.pdf | |
PWC | https://paperswithcode.com/paper/explainable-genetic-inheritance-pattern |
Repo | |
Framework | |
Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate
Title | Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate |
Authors | Xiao Ma, Liqin Zhao, Guan Huang, Zhi Wang, Zelin Hu, Xiaoqiang Zhu, Kun Gai |
Abstract | Estimating post-click conversion rate (CVR) accurately is crucial for ranking systems in industrial applications such as recommendation and advertising. Conventional CVR modeling applies popular deep learning methods and achieves state-of-the-art performance. However, it encounters several task-specific problems in practice, making CVR modeling challenging. For example, conventional CVR models are trained with samples of clicked impressions while utilized to make inference on the entire space with samples of all impressions. This causes a sample selection bias problem. Besides, there exists an extreme data sparsity problem, making the model fitting rather difficult. In this paper, we model CVR from a brand-new perspective by making good use of the sequential pattern of user actions, i.e., impression -> click -> conversion. The proposed Entire Space Multi-task Model (ESMM) can eliminate the two problems simultaneously by i) modeling CVR directly over the entire space, ii) employing a feature representation transfer learning strategy. Experiments on a dataset gathered from Taobao’s recommender system demonstrate that ESMM significantly outperforms competitive methods. We also release a sampled version of this dataset to enable future research. To the best of our knowledge, this is the first public dataset which contains samples with sequential dependence of click and conversion labels for CVR modeling. |
Tasks | Recommendation Systems, Transfer Learning |
Published | 2018-04-21 |
URL | http://arxiv.org/abs/1804.07931v2 |
http://arxiv.org/pdf/1804.07931v2.pdf | |
PWC | https://paperswithcode.com/paper/entire-space-multi-task-model-an-effective |
Repo | |
Framework | |
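The idea of modeling CVR over the entire impression space via the sequence impression -> click -> conversion can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: it assumes the click-and-convert probability is factored as pCTCVR = pCTR × pCVR, and that both supervised signals (click, click-and-convert) are scored with binary cross-entropy over all impressions. The names `bce` and `entire_space_loss` are hypothetical, and the shared-embedding network that would produce the probabilities is omitted.

```python
import math


def bce(label, prob, eps=1e-12):
    """Binary cross-entropy for one sample, with clipping for stability."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def entire_space_loss(samples):
    """Sum of CTR and CTCVR losses over ALL impressions.

    Each sample is (p_ctr, p_cvr, clicked, converted), where p_ctr and p_cvr
    are model outputs and clicked/converted are 0/1 labels. p_cvr is never
    supervised directly on clicked samples only; it is learned through the
    product p_ctr * p_cvr, which is defined for every impression.
    """
    total = 0.0
    for p_ctr, p_cvr, clicked, converted in samples:
        p_ctcvr = p_ctr * p_cvr
        total += bce(clicked, p_ctr)                # CTR task
        total += bce(clicked * converted, p_ctcvr)  # CTCVR task
    return total


# One impression that was clicked but did not convert.
loss = entire_space_loss([(0.5, 0.5, 1, 0)])
```

Because both loss terms are defined for every impression, the sketch avoids training on the clicked subset alone, which is the sample selection bias the abstract describes.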