Paper Group ANR 351
HWNet v2: An Efficient Word Image Representation for Handwritten Documents
Title | HWNet v2: An Efficient Word Image Representation for Handwritten Documents |
Authors | Praveen Krishnan, C. V. Jawahar |
Abstract | We present a framework for learning an efficient holistic representation for handwritten word images. The proposed method uses a deep convolutional neural network with traditional classification loss. The major strengths of our work lie in: (i) the efficient usage of synthetic data to pre-train a deep network, (ii) an adapted version of the ResNet-34 architecture with region of interest pooling (referred to as HWNet v2), which learns discriminative features for variable-sized word images, and (iii) a realistic augmentation of training data with multiple scales and distortions, which mimics the natural process of handwriting. We further investigate the process of transfer learning to reduce the domain gap between the synthetic and real domains, and also analyze the invariances learned at different layers of the network using visualization techniques proposed in the literature. Our representation leads to state-of-the-art word spotting performance on standard handwritten datasets and historical manuscripts in different languages with minimal representation size. On the challenging IAM dataset, our method is the first to report an mAP of around 0.90 for word spotting with a representation size of just 32 dimensions. Furthermore, we also present results on printed document datasets in English and Indic scripts, which validate the generic nature of the proposed framework for learning word image representations. |
Tasks | Transfer Learning |
Published | 2018-02-17 |
URL | http://arxiv.org/abs/1802.06194v2 |
http://arxiv.org/pdf/1802.06194v2.pdf | |
PWC | https://paperswithcode.com/paper/hwnet-v2-an-efficient-word-image |
Repo | |
Framework | |
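The HWNet v2 abstract above reports word spotting quality as mean average precision (mAP). As a reading aid, here is a minimal sketch of how mAP is commonly computed for retrieval. It is not taken from the paper: the function names and the two toy queries are hypothetical, and it assumes each ranked list covers the whole database, so every relevant item appears somewhere in the list.

```python
def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance[i] is True when the i-th retrieved item (best match
    first) is relevant to the query.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            # Precision measured at the rank of each relevant item.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(queries):
    """Mean of the per-query average precision values."""
    return sum(average_precision(q) for q in queries) / len(queries)


# Two hypothetical query results (True = retrieved word image matches the query).
queries = [
    [True, False, True, False],  # AP = (1/1 + 2/3) / 2
    [False, True, True],         # AP = (1/2 + 2/3) / 2
]
map_score = mean_average_precision(queries)
print(f"mAP = {map_score:.4f}")
```

Under this definition, a score near 0.90 means that relevant word images are, on average, ranked very close to the top of each result list.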
Computer-aided diagnosis of lung carcinoma using deep learning - a pilot study
Title | Computer-aided diagnosis of lung carcinoma using deep learning - a pilot study |
Authors | Zhang Li, Zheyu Hu, Jiaolong Xu, Tao Tan, Hui Chen, Zhi Duan, Ping Liu, Jun Tang, Guoping Cai, Quchang Ouyang, Yuling Tang, Geert Litjens, Qiang Li |
Abstract | Aim: Early detection and correct diagnosis of lung cancer are the most important steps in improving patient outcome. This study aims to assess which deep learning models perform best in lung cancer diagnosis. Methods: Non-small cell lung carcinoma and small cell lung carcinoma biopsy specimens were consecutively obtained and stained. The specimen slides were diagnosed by two experienced pathologists (over 20 years of experience). Several deep learning models were trained to discriminate between cancer and non-cancer biopsies. Results: The deep learning models give reasonable AUCs, ranging from 0.8810 to 0.9119. Conclusion: Deep learning analysis could help speed up the detection process for whole-slide images (WSIs) while keeping a detection rate comparable to that of a human observer. |
Tasks | Lung Cancer Diagnosis |
Published | 2018-03-14 |
URL | http://arxiv.org/abs/1803.05471v1 |
http://arxiv.org/pdf/1803.05471v1.pdf | |
PWC | https://paperswithcode.com/paper/computer-aided-diagnosis-of-lung-carcinoma |
Repo | |
Framework | |
Invariant Representations from Adversarially Censored Autoencoders
Title | Invariant Representations from Adversarially Censored Autoencoders |
Authors | Ye Wang, Toshiaki Koike-Akino, Deniz Erdogmus |
Abstract | We combine conditional variational autoencoders (VAE) with adversarial censoring in order to learn invariant representations that are disentangled from nuisance/sensitive variations. In this method, an adversarial network attempts to recover the nuisance variable from the representation, which the VAE is trained to prevent. Conditioning the decoder on the nuisance variable enables clean separation of the representation, since they are recombined for model learning and data reconstruction. We show this natural approach is theoretically well-founded with information-theoretic arguments. Experiments demonstrate that this method achieves invariance while preserving model learning performance, and results in visually improved performance for style transfer and generative sampling tasks. |
Tasks | Style Transfer |
Published | 2018-05-21 |
URL | http://arxiv.org/abs/1805.08097v1 |
http://arxiv.org/pdf/1805.08097v1.pdf | |
PWC | https://paperswithcode.com/paper/invariant-representations-from-adversarially |
Repo | |
Framework | |
Medusa: A Scalable Interconnect for Many-Port DNN Accelerators and Wide DRAM Controller Interfaces
Title | Medusa: A Scalable Interconnect for Many-Port DNN Accelerators and Wide DRAM Controller Interfaces |
Authors | Yongming Shen, Tianchu Ji, Michael Ferdman, Peter Milder |
Abstract | To cope with the increasing demand and computational intensity of deep neural networks (DNNs), industry and academia have turned to accelerator technologies. In particular, FPGAs have been shown to provide a good balance between performance and energy efficiency for accelerating DNNs. While significant research has focused on how to build efficient layer processors, the computational building blocks of DNN accelerators, relatively little attention has been paid to the on-chip interconnects that sit between the layer processors and the FPGA’s DRAM controller. We observe a disparity between DNN accelerator interfaces, which tend to comprise many narrow ports, and FPGA DRAM controller interfaces, which tend to be wide buses. This mismatch causes traditional interconnects to consume significant FPGA resources. To address this problem, we designed Medusa: an optimized FPGA memory interconnect which transposes data in the interconnect fabric, tailoring the interconnect to the needs of DNN layer processors. Compared to a traditional FPGA interconnect, our design can reduce LUT and FF use by 4.7x and 6.0x, and improves frequency by 1.8x. |
Tasks | |
Published | 2018-07-11 |
URL | http://arxiv.org/abs/1807.04013v1 |
http://arxiv.org/pdf/1807.04013v1.pdf | |
PWC | https://paperswithcode.com/paper/medusa-a-scalable-interconnect-for-many-port |
Repo | |
Framework | |
A Machine Learning Approach to Persian Text Readability Assessment Using a Crowdsourced Dataset
Title | A Machine Learning Approach to Persian Text Readability Assessment Using a Crowdsourced Dataset |
Authors | Hamid Mohammadi, Seyed Hossein Khasteh |
Abstract | An automated approach to text readability assessment is essential to a language and can be a powerful tool for improving the understandability of texts written and published in that language. However, the Persian language, which is spoken by over 110 million speakers, lacks such a system. Unlike other languages such as English, French, and Chinese, very limited research studies have been carried out to build an accurate and reliable text readability assessment system for the Persian language. In the present research, the first Persian dataset for text readability assessment was gathered and the first model for Persian text readability assessment using machine learning was introduced. The experiments showed that this model was accurate and could assess the readability of Persian texts with a high degree of confidence. The results of this study can be used in a number of applications such as medical and educational text readability evaluation and have the potential to be the cornerstone of future studies in Persian text readability assessment. |
Tasks | |
Published | 2018-10-07 |
URL | https://arxiv.org/abs/1810.06639v3 |
https://arxiv.org/pdf/1810.06639v3.pdf | |
PWC | https://paperswithcode.com/paper/a-machine-learning-approach-to-persian-text |
Repo | |
Framework | |
Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures
Title | Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures |
Authors | Jonathan Uesato, Ananya Kumar, Csaba Szepesvari, Tom Erez, Avraham Ruderman, Keith Anderson, Krishnamurthy Dvijotham, Nicolas Heess, Pushmeet Kohli |
Abstract | This paper addresses the problem of evaluating learning systems in safety-critical domains such as autonomous driving, where failures can have catastrophic consequences. We focus on two problems: searching for scenarios in which learned agents fail and assessing their probability of failure. The standard method for agent evaluation in reinforcement learning, Vanilla Monte Carlo, can miss failures entirely, leading to the deployment of unsafe agents. We demonstrate this is an issue for current agents, where even matching the compute used for training is sometimes insufficient for evaluation. To address this shortcoming, we draw upon the rare event probability estimation literature and propose an adversarial evaluation approach. Our approach focuses evaluation on adversarially chosen situations, while still providing unbiased estimates of failure probabilities. The key difficulty is in identifying these adversarial situations – since failures are rare, there is little signal to drive optimization. To solve this, we propose a continuation approach that learns failure modes in related but less robust agents. Our approach also allows reuse of data already collected for training the agent. We demonstrate the efficacy of adversarial evaluation on two standard domains: humanoid control and simulated driving. Experimental results show that our methods can find catastrophic failures and estimate failure rates of agents multiple orders of magnitude faster than standard evaluation schemes, in minutes to hours rather than days. |
Tasks | Autonomous Driving |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01647v1 |
http://arxiv.org/pdf/1812.01647v1.pdf | |
PWC | https://paperswithcode.com/paper/rigorous-agent-evaluation-an-adversarial |
Repo | |
Framework | |
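The abstract above contrasts Vanilla Monte Carlo with rare-event estimation that concentrates sampling on likely failures while staying unbiased. The sketch below illustrates that general idea with textbook importance sampling on a toy problem. It is not the authors' method: the "failure" event (a standard normal variable exceeding a threshold), the shifted proposal distribution, and all function names are assumptions chosen for illustration.

```python
import math
import random


def vanilla_mc(threshold, n, rng):
    """Plain Monte Carlo: fraction of N(0, 1) samples that exceed the threshold."""
    failures = sum(1 for _ in range(n) if rng.gauss(0.0, 1.0) > threshold)
    return failures / n


def importance_sampling(threshold, n, rng):
    """Sample from N(threshold, 1) so failures are common, then reweight.

    The weight is the density ratio N(0, 1) / N(threshold, 1) at x, which
    simplifies to exp(-threshold * x + threshold**2 / 2). Reweighting keeps
    the estimate unbiased for the original failure probability.
    """
    total = 0.0
    for _ in range(n):
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n


rng = random.Random(0)
threshold = 4.0  # toy failure event: a standard normal draw above 4
exact = 0.5 * math.erfc(threshold / math.sqrt(2))  # true tail probability
mc_estimate = vanilla_mc(threshold, 10_000, rng)
is_estimate = importance_sampling(threshold, 10_000, rng)
print(f"exact={exact:.3e}  vanilla={mc_estimate:.3e}  importance={is_estimate:.3e}")
```

With a failure probability of roughly 3e-5, ten thousand plain samples will usually contain no failures at all, whereas the reweighted estimator sees a failure in about half of its samples. The hard part in practice, as the abstract notes, is finding a good proposal when failures give little signal.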
Object-based reasoning in VQA
Title | Object-based reasoning in VQA |
Authors | Mikyas T. Desta, Larry Chen, Tomasz Kornuta |
Abstract | Visual Question Answering (VQA) is a novel problem domain where multi-modal inputs must be processed in order to solve a task given in natural language. As solutions inherently require combining visual and natural language processing with abstract reasoning, the problem is considered AI-complete. Recent advances indicate that using high-level, abstract facts extracted from the inputs might facilitate reasoning. Following that direction, we developed a solution combining state-of-the-art object detection and reasoning modules. The results, achieved on the well-balanced CLEVR dataset, confirm the promise of this approach and show significant improvements of a few percent in accuracy on the complex “counting” task. |
Tasks | Object Detection, Question Answering, Visual Question Answering |
Published | 2018-01-29 |
URL | http://arxiv.org/abs/1801.09718v1 |
http://arxiv.org/pdf/1801.09718v1.pdf | |
PWC | https://paperswithcode.com/paper/object-based-reasoning-in-vqa |
Repo | |
Framework | |
Low-rank Approximation of Linear Maps
Title | Low-rank Approximation of Linear Maps |
Authors | Patrick Heas, Cedric Herzet |
Abstract | This work provides closed-form solutions and minimal achievable errors for a large class of low-rank approximation problems in Hilbert spaces. The proposed theorem generalizes previous results, obtained in the finite-dimensional case for the Frobenius norm, to the case of bounded linear operators and p-th Schatten norms. The theorem is illustrated in various settings, including low-rank approximation problems with respect to the trace norm, the 2-induced norm or the Hilbert-Schmidt norm. The theorem also provides the basis for the design of tractable algorithms for kernel-based or continuous DMD. |
Tasks | |
Published | 2018-12-21 |
URL | http://arxiv.org/abs/1812.09042v1 |
http://arxiv.org/pdf/1812.09042v1.pdf | |
PWC | https://paperswithcode.com/paper/low-rank-approximation-of-linear-maps |
Repo | |
Framework | |
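For orientation, the classical finite-dimensional result of this kind is the Eckart-Young-Mirsky theorem, stated below for the unconstrained problem. This is a textbook statement added as background, not the paper's theorem; the class of problems the paper treats may be broader, and the symbols $A$, $B$, $A_k$, $\sigma_i$, $u_i$, $v_i$ are notation introduced here.

```latex
% Singular value decomposition of A, with \sigma_1 \ge \sigma_2 \ge \dots \ge 0
A = \sum_{i} \sigma_i \, u_i v_i^{\top},
\qquad
A_k = \sum_{i \le k} \sigma_i \, u_i v_i^{\top}.

% Frobenius norm: the truncated SVD A_k is a best approximation of rank at most k
\min_{\operatorname{rank}(B) \le k} \| A - B \|_F
  = \| A - A_k \|_F
  = \Big( \sum_{i > k} \sigma_i^{2} \Big)^{1/2}.

% Schatten p-norm, 1 \le p < \infty (p = 1: trace norm, p = 2: Frobenius norm)
\min_{\operatorname{rank}(B) \le k} \| A - B \|_{S_p}
  = \Big( \sum_{i > k} \sigma_i^{p} \Big)^{1/p}.
```

In the limit $p \to \infty$ the Schatten norm becomes the 2-induced (spectral) norm, and the minimal error is $\sigma_{k+1}$.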
Automatic Transferring between Ancient Chinese and Contemporary Chinese
Title | Automatic Transferring between Ancient Chinese and Contemporary Chinese |
Authors | Zhiyuan Zhang, Wei Li, Xu Sun |
Abstract | Over its long development, the Chinese language has evolved a great deal. Native speakers now have difficulty reading sentences written in ancient Chinese. In this paper, we propose an unsupervised algorithm that constructs sentence-aligned ancient-contemporary pairs from the abundant passage-aligned corpus. With this method, we build a large parallel corpus. We propose to apply a sequence-to-sequence model to automatically transfer between ancient and contemporary Chinese sentences. Experiments show that both our alignment and transfer methods produce very good results, except in some circumstances where even human translators can make mistakes without background knowledge. |
Tasks | |
Published | 2018-03-05 |
URL | http://arxiv.org/abs/1803.01557v2 |
http://arxiv.org/pdf/1803.01557v2.pdf | |
PWC | https://paperswithcode.com/paper/automatic-transferring-between-ancient |
Repo | |
Framework | |
TNE: A Latent Model for Representation Learning on Networks
Title | TNE: A Latent Model for Representation Learning on Networks |
Authors | Abdulkadir Çelikkanat, Fragkiskos D. Malliaros |
Abstract | Network representation learning (NRL) methods aim to map each vertex into a low-dimensional space while preserving the local and global structure of a given network, and in recent years they have received significant attention thanks to their success in several challenging problems. Although various approaches have been proposed to compute node embeddings, many successful methods rely on random walks to transform a given network into a collection of node sequences and then learn the representation of nodes by predicting the context of each vertex within the sequence. In this paper, we introduce a general framework to enhance the node embeddings acquired by means of random walk-based approaches. Similar to the notion of topical word embeddings in NLP, the proposed method assigns each vertex to a topic with the help of various statistical models and community detection methods, and then generates the enhanced community representations. We evaluate our method on two downstream tasks: node classification and link prediction. The experimental results demonstrate that the incorporation of vertex and topic embeddings outperforms widely-known baseline NRL methods. |
Tasks | Community Detection, Link Prediction, Node Classification, Representation Learning, Word Embeddings |
Published | 2018-10-16 |
URL | http://arxiv.org/abs/1810.06917v1 |
http://arxiv.org/pdf/1810.06917v1.pdf | |
PWC | https://paperswithcode.com/paper/tne-a-latent-model-for-representation |
Repo | |
Framework | |
Adversarial Bandits with Knapsacks
Title | Adversarial Bandits with Knapsacks |
Authors | Nicole Immorlica, Karthik Abinav Sankararaman, Robert Schapire, Aleksandrs Slivkins |
Abstract | We consider Bandits with Knapsacks (henceforth, BwK), a general model for multi-armed bandits under supply/budget constraints. In particular, a bandit algorithm needs to solve a well-known knapsack problem: find an optimal packing of items into a limited-size knapsack. The BwK problem is a common generalization of numerous motivating examples, which range from dynamic pricing to repeated auctions to dynamic ad allocation to network routing and scheduling. While the prior work on BwK focused on the stochastic version, we pioneer the other extreme in which the outcomes can be chosen adversarially. This is a considerably harder problem, compared to both the stochastic version and the “classic” adversarial bandits, in that regret minimization is no longer feasible. Instead, the objective is to minimize the competitive ratio: the ratio of the benchmark reward to the algorithm’s reward. We design an algorithm with competitive ratio O(log T) relative to the best fixed distribution over actions, where T is the time horizon; we also prove a matching lower bound. The key conceptual contribution is a new perspective on the stochastic version of the problem. We suggest a new algorithm for the stochastic version, which builds on the framework of regret minimization in repeated games and admits a substantially simpler analysis compared to prior work. We then analyze this algorithm for the adversarial version and use it as a subroutine to solve the latter. |
Tasks | Multi-Armed Bandits |
Published | 2018-11-28 |
URL | https://arxiv.org/abs/1811.11881v5 |
https://arxiv.org/pdf/1811.11881v5.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-bandits-with-knapsacks |
Repo | |
Framework | |
The Emergence of Spectral Universality in Deep Networks
Title | The Emergence of Spectral Universality in Deep Networks |
Authors | Jeffrey Pennington, Samuel S. Schoenholz, Surya Ganguli |
Abstract | Recent work has shown that tight concentration of the entire spectrum of singular values of a deep network’s input-output Jacobian around one at initialization can speed up learning by orders of magnitude. Therefore, to guide important design choices, it is important to build a full theoretical understanding of the spectra of Jacobians at initialization. To this end, we leverage powerful tools from free probability theory to provide a detailed analytic understanding of how a deep network’s Jacobian spectrum depends on various hyperparameters including the nonlinearity, the weight and bias distributions, and the depth. For a variety of nonlinearities, our work reveals the emergence of new universal limiting spectral distributions that remain concentrated around one even as the depth goes to infinity. |
Tasks | |
Published | 2018-02-27 |
URL | http://arxiv.org/abs/1802.09979v1 |
http://arxiv.org/pdf/1802.09979v1.pdf | |
PWC | https://paperswithcode.com/paper/the-emergence-of-spectral-universality-in |
Repo | |
Framework | |
A Method to Facilitate Cancer Detection and Type Classification from Gene Expression Data using a Deep Autoencoder and Neural Network
Title | A Method to Facilitate Cancer Detection and Type Classification from Gene Expression Data using a Deep Autoencoder and Neural Network |
Authors | Xi Chen, Jin Xie, Qingcong Yuan |
Abstract | With the increased affordability and availability of whole-genome sequencing, large-scale and high-throughput gene expression is widely used to characterize diseases, including cancers. However, establishing specificity in cancer diagnosis using gene expression data continues to pose challenges due to the high dimensionality and complexity of the data. Here we present models of deep learning (DL) and apply them to gene expression data for the diagnosis and categorization of cancer. In this study, we have developed two DL models using messenger ribonucleic acid (mRNA) datasets available from the Genomic Data Commons repository. Our models achieved 98% accuracy in cancer detection, with false negative and false positive rates below 1.7%. In our results, we demonstrated that 18 out of 32 cancer-typing classifications achieved more than 90% accuracy. Due to the limitation of a small sample size (less than 50 observations), certain cancers could not achieve a higher accuracy in typing classification, but still achieved high accuracy for the cancer detection task. To validate our models, we compared them with traditional statistical models. The main advantage of our models over traditional cancer detection is the ability to use data from various cancer types to automatically form features to enhance the detection and diagnosis of a specific cancer type. |
Tasks | |
Published | 2018-12-20 |
URL | http://arxiv.org/abs/1812.08674v1 |
http://arxiv.org/pdf/1812.08674v1.pdf | |
PWC | https://paperswithcode.com/paper/a-method-to-facilitate-cancer-detection-and |
Repo | |
Framework | |
Tracking Noisy Targets: A Review of Recent Object Tracking Approaches
Title | Tracking Noisy Targets: A Review of Recent Object Tracking Approaches |
Authors | Mustansar Fiaz, Arif Mahmood, Soon Ki Jung |
Abstract | Visual object tracking is an important computer vision problem with numerous real-world applications including human-computer interaction, autonomous vehicles, robotics, motion-based recognition, video indexing, surveillance and security. In this paper, we aim to extensively review the latest trends and advances in the tracking algorithms and evaluate the robustness of trackers in the presence of noise. The first part of this work comprises a comprehensive survey of recently proposed tracking algorithms. We broadly categorize trackers into correlation filter based trackers and the others as non-correlation filter trackers. Each category is further classified into various types of trackers based on the architecture of the tracking mechanism. In the second part of this work, we experimentally evaluate tracking algorithms for robustness in the presence of additive white Gaussian noise. Multiple levels of additive noise are added to the Object Tracking Benchmark (OTB) 2015, and the precision and success rates of the tracking algorithms are evaluated. Some algorithms suffered more performance degradation than others, which brings to light a previously unexplored aspect of the tracking algorithms. The relative rank of the algorithms based on their performance on benchmark datasets may change in the presence of noise. Our study concludes that no single tracker is able to achieve the same efficiency in the presence of noise as under noise-free conditions; thus, there is a need to include a parameter for robustness to noise when evaluating newly proposed tracking algorithms. |
Tasks | Autonomous Vehicles, Object Tracking, Visual Object Tracking |
Published | 2018-02-09 |
URL | http://arxiv.org/abs/1802.03098v2 |
http://arxiv.org/pdf/1802.03098v2.pdf | |
PWC | https://paperswithcode.com/paper/tracking-noisy-targets-a-review-of-recent |
Repo | |
Framework | |
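The robustness experiment described in the abstract above adds white Gaussian noise at several levels to benchmark frames before running each tracker. A minimal sketch of that corruption step follows, assuming 8-bit grayscale frames stored as nested lists. The function name `add_awgn`, the frame contents, and the noise levels are hypothetical and are not the settings used in the paper.

```python
import random


def add_awgn(pixels, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma to each pixel.

    pixels is a list of rows of 8-bit grayscale values; the result is
    rounded and clipped back to the valid range [0, 255].
    """
    return [
        [min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
        for row in pixels
    ]


rng = random.Random(0)
frame = [[128] * 4 for _ in range(3)]  # hypothetical 3x4 mid-gray frame
noise_levels = [0.0, 5.0, 25.0]        # hypothetical standard deviations

# One corrupted copy of the frame per noise level; a tracker would then be
# run on each copy and its precision and success rates compared.
noisy_frames = {sigma: add_awgn(frame, sigma, rng) for sigma in noise_levels}
for sigma, noisy in noisy_frames.items():
    print(sigma, noisy[0])
```

Independent noise is drawn per pixel and per frame, which is what makes it "white"; clipping slightly distorts the noise distribution near pure black or pure white regions.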
Interactive Decomposition Multi-Objective Optimization via Progressively Learned Value Functions
Title | Interactive Decomposition Multi-Objective Optimization via Progressively Learned Value Functions |
Authors | Ke Li, Renzhi Chen, Dragan Savic, Xin Yao |
Abstract | Decomposition has become an increasingly popular technique for evolutionary multi-objective optimization (EMO). A decomposition-based EMO algorithm is usually designed to approximate a whole Pareto-optimal front (PF). However, in practice, the decision maker (DM) might only be interested in her/his region of interest (ROI), i.e., a part of the PF. Solutions outside that region might be useless or even noisy to the decision-making procedure. Furthermore, there is no guarantee of finding the preferred solutions when tackling many-objective problems. This paper develops an interactive framework for the decomposition-based EMO algorithm to lead a DM to the preferred solutions of her/his choice. It consists of three modules, i.e., consultation, preference elicitation and optimization. Specifically, after every several generations, the DM is asked to score a few candidate solutions in a consultation session. Thereafter, an approximated value function, which models the DM’s preference information, is progressively learned from the DM’s behavior. In the preference elicitation session, the preference information learned in the consultation module is translated into a form that can be used in a decomposition-based EMO algorithm, i.e., a set of reference points that are biased toward the ROI. The optimization module, which can be any decomposition-based EMO algorithm in principle, utilizes the biased reference points to direct its search process. Extensive experiments on benchmark problems with three to ten objectives fully demonstrate the effectiveness of our proposed method for finding the DM’s preferred solutions. |
Tasks | Decision Making |
Published | 2018-01-02 |
URL | http://arxiv.org/abs/1801.00609v2 |
http://arxiv.org/pdf/1801.00609v2.pdf | |
PWC | https://paperswithcode.com/paper/interactive-decomposition-multi-objective |
Repo | |
Framework | |