October 19, 2019

Paper Group ANR 353

A Hardware-Software Blueprint for Flexible Deep Learning Specialization. Large-scale Generative Modeling to Improve Automated Veterinary Disease Coding. Unsupervised Dimension Selection using a Blue Noise Spectrum. Shape analysis of framed space curves. Syn2Real: A New Benchmark for Synthetic-to-Real Visual Domain Adaptation. Fine-Grained Land Use C …

A Hardware-Software Blueprint for Flexible Deep Learning Specialization


Title	A Hardware-Software Blueprint for Flexible Deep Learning Specialization
Authors	Thierry Moreau, Tianqi Chen, Luis Vega, Jared Roesch, Eddie Yan, Lianmin Zheng, Josh Fromm, Ziheng Jiang, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy
Abstract	Specialized Deep Learning (DL) acceleration stacks, designed for a specific set of frameworks, model architectures, operators, and data types, offer the allure of high performance while sacrificing flexibility. Changes in algorithms, models, operators, or numerical systems threaten the viability of specialized hardware accelerators. We propose VTA, a programmable deep learning architecture template designed to be extensible in the face of evolving workloads. VTA achieves this flexibility via a parametrizable architecture, two-level ISA, and a JIT compiler. The two-level ISA is based on (1) a task-ISA that explicitly orchestrates concurrent compute and memory tasks and (2) a microcode-ISA which implements a wide variety of operators with single-cycle tensor-tensor operations. Next, we propose a runtime system equipped with a JIT compiler for flexible code-generation and heterogeneous execution that enables effective use of the VTA architecture. VTA is integrated and open-sourced into Apache TVM, a state-of-the-art deep learning compilation stack that provides flexibility for diverse models and divergent hardware backends. We propose a flow that performs design space exploration to generate a customized hardware architecture and software operator library that can be leveraged by mainstream learning frameworks. We demonstrate our approach by deploying optimized deep learning models used for object classification and style transfer on edge-class FPGAs.
Tasks	Code Generation, Object Classification, Style Transfer
Published	2018-07-11
URL	http://arxiv.org/abs/1807.04188v3
PDF	http://arxiv.org/pdf/1807.04188v3.pdf
PWC	https://paperswithcode.com/paper/vta-an-open-hardware-software-stack-for-deep
Repo
Framework

Large-scale Generative Modeling to Improve Automated Veterinary Disease Coding


Title	Large-scale Generative Modeling to Improve Automated Veterinary Disease Coding
Authors	Yuhui Zhang, Allen Nie, James Zou
Abstract	Supervised learning is limited by both the quantity and the quality of labeled data. In the field of medical record tagging, writing styles vary drastically between hospitals, so knowledge learned from one hospital might not transfer well to another. This problem is amplified in the veterinary medicine domain because veterinary clinics rarely apply medical codes to their records. We propose and train the first large-scale generative modeling algorithm for automated disease coding. We demonstrate that generative modeling can learn discriminative features when additionally trained with supervised fine-tuning. We systematically ablate and evaluate the effect of generative modeling on the final system’s performance. We compare the performance of our model with several baselines in a challenging cross-hospital setting with substantial domain shift, and we outperform competitive baselines by a large margin. In addition, we provide an interpretation of what is learned by our model.
Tasks
Published	2018-11-29
URL	http://arxiv.org/abs/1811.11958v1
PDF	http://arxiv.org/pdf/1811.11958v1.pdf
PWC	https://paperswithcode.com/paper/large-scale-generative-modeling-to-improve
Repo
Framework

Unsupervised Dimension Selection using a Blue Noise Spectrum


Title	Unsupervised Dimension Selection using a Blue Noise Spectrum
Authors	Jayaraman J. Thiagarajan, Rushil Anirudh, Rahul Sridhar, Peer-Timo Bremer
Abstract	Unsupervised dimension selection is an important problem that seeks to reduce the dimensionality of data while preserving its most useful characteristics. While dimensionality reduction is commonly used to construct low-dimensional embeddings, these embeddings produce feature spaces that are hard to interpret. Further, in applications such as sensor design, one needs to perform reduction directly in the input domain instead of constructing transformed spaces. Consequently, dimension selection (DS) aims to solve the combinatorial problem of identifying the top-$k$ dimensions, which is required for effective experiment design, reducing data while keeping it interpretable, and designing better sensing mechanisms. In this paper, we develop a novel approach for DS based on graph signal analysis to measure feature influence. By analyzing synthetic graph signals with a blue noise spectrum, we show that we can measure the importance of each dimension. Using experiments in supervised learning and image masking, we demonstrate the superiority of the proposed approach over existing techniques in capturing crucial characteristics of high-dimensional spaces, using only a small subset of the original features.
Tasks	Dimensionality Reduction
Published	2018-10-31
URL	http://arxiv.org/abs/1810.13427v1
PDF	http://arxiv.org/pdf/1810.13427v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-dimension-selection-using-a-blue
Repo
Framework

Shape analysis of framed space curves


Title	Shape analysis of framed space curves
Authors	Tom Needham
Abstract	In the elastic shape analysis approach to shape matching and object classification, plane curves are represented as points in an infinite-dimensional Riemannian manifold, wherein shape dissimilarity is measured by geodesic distance. A remarkable result of Younes, Michor, Shah and Mumford says that the space of closed planar shapes, endowed with a natural metric, is isometric to an infinite-dimensional Grassmann manifold via the so-called square root transform. This result facilitates efficient shape comparison by virtue of explicit descriptions of Grassmannian geodesics. In this paper, we extend this shape analysis framework to treat shapes of framed space curves. By considering framed curves, we are able to generalize the square root transform by using quaternionic arithmetic and properties of the Hopf fibration. Under our coordinate transformation, the space of closed framed curves corresponds to an infinite-dimensional complex Grassmannian. This allows us to describe geodesics in framed curve space explicitly. We are also able to produce explicit geodesics between closed, unframed space curves by studying the action of the loop group of the circle on the Grassmann manifold. Averages of collections of plane and space curves are computed via a novel algorithm utilizing flag means.
Tasks	Object Classification
Published	2018-07-10
URL	http://arxiv.org/abs/1807.03477v1
PDF	http://arxiv.org/pdf/1807.03477v1.pdf
PWC	https://paperswithcode.com/paper/shape-analysis-of-framed-space-curves
Repo
Framework

Syn2Real: A New Benchmark for Synthetic-to-Real Visual Domain Adaptation


Title	Syn2Real: A New Benchmark for Synthetic-to-Real Visual Domain Adaptation
Authors	Xingchao Peng, Ben Usman, Kuniaki Saito, Neela Kaushik, Judy Hoffman, Kate Saenko
Abstract	Unsupervised transfer of object recognition models from synthetic to real data is an important problem with many potential applications. The challenge is how to “adapt” a model trained on simulated images so that it performs well on real-world data without any additional supervision. Unfortunately, current benchmarks for this problem are limited in size and task diversity. In this paper, we present a new large-scale benchmark called Syn2Real, which consists of a synthetic domain rendered from 3D object models and two real-image domains containing the same object categories. We define three related tasks on this benchmark: closed-set object classification, open-set object classification, and object detection. Our evaluation of multiple state-of-the-art methods reveals a large gap in adaptation performance between the easier closed-set classification task and the more difficult open-set and detection tasks. We conclude that developing adaptation methods that work well across all three tasks presents a significant future challenge for syn2real domain transfer.
Tasks	Domain Adaptation, Object Classification, Object Detection, Object Recognition
Published	2018-06-26
URL	http://arxiv.org/abs/1806.09755v1
PDF	http://arxiv.org/pdf/1806.09755v1.pdf
PWC	https://paperswithcode.com/paper/syn2real-a-new-benchmark-forsynthetic-to-real
Repo
Framework

Fine-Grained Land Use Classification at the City Scale Using Ground-Level Images


Title	Fine-Grained Land Use Classification at the City Scale Using Ground-Level Images
Authors	Yi Zhu, Xueqing Deng, Shawn Newsam
Abstract	We perform fine-grained land use mapping at the city scale using ground-level images. Mapping land use is considerably more difficult than mapping land cover and is generally not possible using overhead imagery as it requires close-up views and seeing inside buildings. We postulate that the growing collections of georeferenced, ground-level images suggest an alternate approach to this geographic knowledge discovery problem. We develop a general framework that uses Flickr images to map 45 different land-use classes for the City of San Francisco. Individual images are classified using a novel convolutional neural network containing two streams, one for recognizing objects and another for recognizing scenes. This network is trained in an end-to-end manner directly on the labeled training images. We propose several strategies to overcome the noisiness of our user-generated data including search-based training set augmentation and online adaptive training. We derive a ground truth map of San Francisco in order to evaluate our method. We demonstrate the effectiveness of our approach through geo-visualization and quantitative analysis. Our framework achieves over 29% recall at the individual land parcel level which represents a strong baseline for the challenging 45-way land use classification problem especially given the noisiness of the image data.
Tasks
Published	2018-02-07
URL	http://arxiv.org/abs/1802.02668v1
PDF	http://arxiv.org/pdf/1802.02668v1.pdf
PWC	https://paperswithcode.com/paper/fine-grained-land-use-classification-at-the
Repo
Framework

Comparative survey of visual object classifiers


Title	Comparative survey of visual object classifiers
Authors	Hiliwi Leake Kidane
Abstract	Classification of Visual Object Classes represents one of the most elaborated areas of interest in Computer Vision. It is always challenging to find one specific detector, descriptor, or classifier that provides the expected object classification result. Consequently, it is critical to compare the different detection, descriptor, and classifier methods available and choose a single one, or a combination of two or three, to get an optimal result. In this paper, we present a comparative survey of different feature descriptors and classifiers. Among feature descriptors, SIFT (sparse and dense) and HueSIFT combination colour descriptors are covered; among classification techniques, Support Vector Classifier, K-Nearest Neighbor, AdaBoost, and Fisher are covered in a comparative practical implementation survey.
Tasks	Object Classification
Published	2018-06-17
URL	http://arxiv.org/abs/1806.06321v1
PDF	http://arxiv.org/pdf/1806.06321v1.pdf
PWC	https://paperswithcode.com/paper/comparative-survey-of-visual-object
Repo
Framework

Relational inductive bias for physical construction in humans and machines


Title	Relational inductive bias for physical construction in humans and machines
Authors	Jessica B. Hamrick, Kelsey R. Allen, Victor Bapst, Tina Zhu, Kevin R. McKee, Joshua B. Tenenbaum, Peter W. Battaglia
Abstract	While current deep learning systems excel at tasks such as object classification, language processing, and gameplay, few can construct or modify a complex system such as a tower of blocks. We hypothesize that what these systems lack is a “relational inductive bias”: a capacity for reasoning about inter-object relations and making choices over a structured description of a scene. To test this hypothesis, we focus on a task that involves gluing pairs of blocks together to stabilize a tower, and quantify how well humans perform. We then introduce a deep reinforcement learning agent which uses object- and relation-centric scene and policy representations and apply it to the task. Our results show that these structured representations allow the agent to outperform both humans and more naive approaches, suggesting that relational inductive bias is an important component in solving structured reasoning problems and for building more intelligent, flexible machines.
Tasks	Object Classification
Published	2018-06-04
URL	http://arxiv.org/abs/1806.01203v1
PDF	http://arxiv.org/pdf/1806.01203v1.pdf
PWC	https://paperswithcode.com/paper/relational-inductive-bias-for-physical
Repo
Framework

Decoding Generic Visual Representations From Human Brain Activity using Machine Learning


Title	Decoding Generic Visual Representations From Human Brain Activity using Machine Learning
Authors	Angeliki Papadimitriou, Nikolaos Passalis, Anastasios Tefas
Abstract	Among the most impressive recent applications of neural decoding is the visual representation decoding, where the category of an object that a subject either sees or imagines is inferred by observing his/her brain activity. Even though there is an increasing interest in the aforementioned visual representation decoding task, there is no extensive study of the effect of using different machine learning models on the decoding accuracy. In this paper we provide an extensive evaluation of several machine learning models, along with different similarity metrics, for the aforementioned task, drawing many interesting conclusions. That way, this paper a) paves the way for developing more advanced and accurate methods and b) provides an extensive and easily reproducible baseline for the aforementioned decoding task.
Tasks
Published	2018-11-05
URL	http://arxiv.org/abs/1811.01757v1
PDF	http://arxiv.org/pdf/1811.01757v1.pdf
PWC	https://paperswithcode.com/paper/decoding-generic-visual-representations-from
Repo
Framework

Challenges for Toxic Comment Classification: An In-Depth Error Analysis


Title	Challenges for Toxic Comment Classification: An In-Depth Error Analysis
Authors	Betty van Aken, Julian Risch, Ralf Krestel, Alexander Löser
Abstract	Toxic comment classification has become an active research field with many recently proposed approaches. However, while these approaches address some of the task’s challenges, others remain unsolved and directions for further research are needed. To this end, we compare different deep learning and shallow approaches on a new, large comment dataset and propose an ensemble that outperforms all individual models. Further, we validate our findings on a second dataset. The results of the ensemble enable us to perform an extensive error analysis, which reveals open challenges for state-of-the-art methods and directions for future research. These challenges include missing paradigmatic context and inconsistent dataset labels.
Tasks
Published	2018-09-20
URL	http://arxiv.org/abs/1809.07572v1
PDF	http://arxiv.org/pdf/1809.07572v1.pdf
PWC	https://paperswithcode.com/paper/challenges-for-toxic-comment-classification
Repo
Framework

Recommendations with Negative Feedback via Pairwise Deep Reinforcement Learning


Title	Recommendations with Negative Feedback via Pairwise Deep Reinforcement Learning
Authors	Xiangyu Zhao, Liang Zhang, Zhuoye Ding, Long Xia, Jiliang Tang, Dawei Yin
Abstract	Recommender systems play a crucial role in mitigating the problem of information overload by suggesting personalized items or services to users. The vast majority of traditional recommender systems consider the recommendation procedure as a static process and make recommendations following a fixed strategy. In this paper, we propose a novel recommender system with the capability of continuously improving its strategies during its interactions with users. We model the sequential interactions between users and a recommender system as a Markov Decision Process (MDP) and leverage Reinforcement Learning (RL) to automatically learn the optimal strategies by recommending items on a trial-and-error basis and receiving reinforcement for these items from users’ feedback. Users’ feedback can be positive or negative, and both types of feedback have great potential to boost recommendations. However, negative feedback is far more abundant than positive feedback; thus, incorporating both simultaneously is challenging, since positive feedback could be buried by negative feedback. In this paper, we develop a novel approach to incorporate both into the proposed deep recommender system (DEERS) framework. The experimental results based on real-world e-commerce data demonstrate the effectiveness of the proposed framework. Further experiments have been conducted to understand the importance of both positive and negative feedback in recommendations.
Tasks	Recommendation Systems
Published	2018-02-19
URL	http://arxiv.org/abs/1802.06501v3
PDF	http://arxiv.org/pdf/1802.06501v3.pdf
PWC	https://paperswithcode.com/paper/recommendations-with-negative-feedback-via
Repo
Framework

Towards Universal Representation for Unseen Action Recognition


Title	Towards Universal Representation for Unseen Action Recognition
Authors	Yi Zhu, Yang Long, Yu Guan, Shawn Newsam, Ling Shao
Abstract	Unseen Action Recognition (UAR) aims to recognise novel action categories without training examples. While previous methods focus on inner-dataset seen/unseen splits, this paper proposes a pipeline using a large-scale training source to achieve a Universal Representation (UR) that can generalise to a more realistic Cross-Dataset UAR (CD-UAR) scenario. We first address UAR as a Generalised Multiple-Instance Learning (GMIL) problem and discover ‘building-blocks’ from the large-scale ActivityNet dataset using distribution kernels. Essential visual and semantic components are preserved in a shared space to achieve the UR that can efficiently generalise to new datasets. Predicted UR exemplars can be improved by a simple semantic adaptation, and then an unseen action can be directly recognised using UR during the test. Without further training, extensive experiments manifest significant improvements over the UCF101 and HMDB51 benchmarks.
Tasks	Action Recognition In Videos, Multiple Instance Learning, Temporal Action Localization
Published	2018-03-22
URL	http://arxiv.org/abs/1803.08460v1
PDF	http://arxiv.org/pdf/1803.08460v1.pdf
PWC	https://paperswithcode.com/paper/towards-universal-representation-for-unseen
Repo
Framework

A Robust Color Edge Detection Algorithm Based on Quaternion Hardy Filter


Title	A Robust Color Edge Detection Algorithm Based on Quaternion Hardy Filter
Authors	Wenshan Bi, Dong Cheng, Kit Ian Kou
Abstract	This paper presents a robust filter called the quaternion Hardy filter (QHF) for color image edge detection. The QHF is capable of color edge feature enhancement and noise resistance. It can be used flexibly by selecting suitable parameters to handle different levels of noise. In particular, the quaternion analytic signal, which is an effective tool in color image processing, can also be produced by quaternion Hardy filtering with specific parameters. Based on the QHF and the improved Di Zenzo gradient operator, a novel color edge detection algorithm is proposed. Importantly, it can be efficiently implemented by using the fast discrete quaternion Fourier transform technique. The experiments demonstrate that the proposed algorithm outperforms several state-of-the-art algorithms.
Tasks	Edge Detection
Published	2018-07-17
URL	https://arxiv.org/abs/1807.10586v2
PDF	https://arxiv.org/pdf/1807.10586v2.pdf
PWC	https://paperswithcode.com/paper/a-robust-color-edge-detection-algorithm-based
Repo
Framework
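
The abstract above builds on the Di Zenzo gradient operator. As a rough illustration only, the sketch below computes the classic Di Zenzo edge strength for an RGB image in plain Python. It is not the paper's improved operator and does not include the quaternion Hardy filter; the function name `di_zenzo_edge_strength`, the central-difference derivatives, the border clamping, and the two tiny test images are all assumptions made for this example.

```python
import math


def di_zenzo_edge_strength(image):
    """Return a 2-D list of edge strengths for an RGB image.

    `image[y][x]` is an (r, g, b) tuple. Derivatives use central differences,
    with coordinates clamped at the border (an assumption of this sketch).
    """
    height, width = len(image), len(image[0])

    def pixel(y, x):
        # Clamp out-of-range coordinates to the nearest valid pixel.
        return image[min(max(y, 0), height - 1)][min(max(x, 0), width - 1)]

    strength = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            gxx = gyy = gxy = 0.0
            for c in range(3):
                dx = (pixel(y, x + 1)[c] - pixel(y, x - 1)[c]) / 2.0
                dy = (pixel(y + 1, x)[c] - pixel(y - 1, x)[c]) / 2.0
                gxx += dx * dx
                gyy += dy * dy
                gxy += dx * dy
            # Largest eigenvalue of the 2x2 tensor [[gxx, gxy], [gxy, gyy]].
            lam = 0.5 * (gxx + gyy + math.sqrt((gxx - gyy) ** 2 + 4.0 * gxy * gxy))
            strength[y][x] = math.sqrt(lam)
    return strength


# Hypothetical 3x4 test images: a uniform gray patch and a vertical black/white edge.
flat = [[(0.5, 0.5, 0.5)] * 4 for _ in range(3)]
edge = [[(0, 0, 0), (0, 0, 0), (1, 1, 1), (1, 1, 1)] for _ in range(3)]

flat_strength = di_zenzo_edge_strength(flat)
edge_strength = di_zenzo_edge_strength(edge)
print(edge_strength[1])
```

The response is zero on the uniform patch and peaks in the two columns that straddle the color change, which is the behavior an edge detector built on this operator relies on.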

On The Equivalence of Tries and Dendrograms - Efficient Hierarchical Clustering of Traffic Data


Title	On The Equivalence of Tries and Dendrograms - Efficient Hierarchical Clustering of Traffic Data
Authors	Chia-Tung Kuo, Ian Davidson
Abstract	The widespread use of GPS-enabled devices generates voluminous and continuous amounts of traffic data, but analyzing such data for interpretable and actionable insights poses challenges. A hierarchical clustering of the trips has many uses such as discovering shortest paths, common routes and often traversed areas. However, hierarchical clustering typically has time complexity of $O(n^2 \log n)$ where $n$ is the number of instances, and is difficult to scale to large data sets associated with GPS data. Furthermore, incremental hierarchical clustering is still a developing area. Prefix trees (also called tries) can be efficiently constructed and updated in linear time (in $n$). We show how a specially constructed trie can compactly store the trips and further show this trie is equivalent to a dendrogram that would have been built by classic agglomerative hierarchical algorithms using a specific distance metric. This allows creating hierarchical clusterings of GPS trip data and updating this hierarchy in linear time. We demonstrate the usefulness of our proposed approach on a real-world data set of half a million taxis’ GPS traces, well beyond the capabilities of agglomerative clustering methods. Our work is not limited to trip data and can be used with other data with a string representation.
Tasks
Published	2018-10-12
URL	http://arxiv.org/abs/1810.05357v1
PDF	http://arxiv.org/pdf/1810.05357v1.pdf
PWC	https://paperswithcode.com/paper/on-the-equivalence-of-tries-and-dendrograms
Repo
Framework
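
To make the trie idea in the abstract above concrete, here is a minimal sketch of a generic prefix tree over trips encoded as strings, where cutting the tree at a given depth yields one level of a hierarchy. The abstract does not spell out the paper's specific trie construction or distance metric, so the class `TrieNode`, the helpers `build_trie` and `clusters_at_depth`, and the sample trips are illustrative assumptions rather than the authors' method.

```python
from collections import defaultdict


class TrieNode:
    """One node of a prefix tree; `count` is the number of trips passing through."""

    def __init__(self):
        self.children = {}
        self.count = 0


def build_trie(trips):
    """Insert each trip (a sequence of symbols) into a trie, one pass per trip."""
    root = TrieNode()
    for trip in trips:
        node = root
        node.count += 1
        for symbol in trip:
            node = node.children.setdefault(symbol, TrieNode())
            node.count += 1
    return root


def clusters_at_depth(trips, depth):
    """Group trips by their first `depth` symbols, i.e. cut the trie at that level."""
    groups = defaultdict(list)
    for trip in trips:
        groups[tuple(trip[:depth])].append(trip)
    return dict(groups)


# Hypothetical trips, each written as a string of region identifiers.
trips = ["ABCD", "ABCE", "ABX", "QRS"]
root = build_trie(trips)
print(root.children["A"].children["B"].count)  # number of trips sharing the prefix "AB"
print(clusters_at_depth(trips, 2))
```

Each insertion touches one node per symbol, so building the structure is linear in the total input length, and adding a new trip later only walks a single root-to-leaf path.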

Scaling Up Cartesian Genetic Programming through Preferential Selection of Larger Solutions


Title	Scaling Up Cartesian Genetic Programming through Preferential Selection of Larger Solutions
Authors	Nicola Milano, Stefano Nolfi
Abstract	We demonstrate how the efficiency of the Cartesian Genetic Programming method can be scaled up through the preferential selection of phenotypically larger solutions, i.e. through the preferential selection of larger solutions among equally good solutions. The advantage of the preferential selection of larger solutions is validated on the six-, seven- and eight-bit parity problems, on a dynamically varying problem involving the classification of binary patterns, and on the Paige regression problem. In all cases, the preferential selection of larger solutions provides an advantage in terms of the performance of the evolved solutions and in terms of speed, i.e. the number of evaluations required to evolve optimal or high-quality solutions. The advantage provided by the preferential selection of larger solutions can be further extended by self-adapting the mutation rate through the one-fifth success rule. Finally, for problems like the Paige regression in which neutrality plays a minor role, the advantage of the preferential selection of larger solutions can be extended by preferring larger solutions also among quasi-neutral alternative candidate solutions, i.e. solutions achieving slightly different performance.
Tasks
Published	2018-10-22
URL	http://arxiv.org/abs/1810.09485v1
PDF	http://arxiv.org/pdf/1810.09485v1.pdf
PWC	https://paperswithcode.com/paper/scaling-up-cartesian-genetic-programming
Repo
Framework
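
The selection rule described in the abstract above can be sketched in a few lines. The following is a hedged illustration, not the authors' implementation: the `Candidate` dataclass, the `(1 + lambda)`-style `select_parent` function, and the adaptation factor of 1.5 in `adapt_mutation_rate` are all assumptions, and no actual Cartesian Genetic Programming genome is decoded or evaluated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    fitness: float  # higher is better
    size: int       # phenotype size, e.g. number of active nodes


def select_parent(parent, offspring):
    """Pick the next parent: best fitness first, larger size among equal fitness.

    Offspring are listed before the parent, and max() returns the first maximal
    element, so an offspring that ties on both fitness and size replaces the parent.
    """
    return max(list(offspring) + [parent], key=lambda c: (c.fitness, c.size))


def adapt_mutation_rate(rate, success_rate, factor=1.5):
    """One-fifth success rule: widen the search when more than 1/5 of mutations
    succeed, narrow it when fewer do. The factor 1.5 is an assumed constant."""
    target = 0.2
    if success_rate > target:
        return rate * factor
    if success_rate < target:
        return rate / factor
    return rate


parent = Candidate(fitness=1.0, size=3)
offspring = [Candidate(1.0, 5), Candidate(0.9, 9), Candidate(1.0, 4)]
chosen = select_parent(parent, offspring)
print(chosen)
print(adapt_mutation_rate(0.1, success_rate=0.5))
```

Sorting on the pair `(fitness, size)` keeps fitness as the primary criterion, so a larger solution is never preferred over a strictly fitter one.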