October 17, 2019

3039 words 15 mins read

Paper Group ANR 844

DeepProteomics: Protein family classification using Shallow and Deep Networks. End-to-End Streaming Keyword Spotting. Tattoo Image Search at Scale: Joint Detection and Compact Representation Learning. Color naming guided intrinsic image decomposition. Deep Appearance Models for Face Rendering. Class label autoencoder for zero-shot learning. Groundi …

DeepProteomics: Protein family classification using Shallow and Deep Networks

Title DeepProteomics: Protein family classification using Shallow and Deep Networks
Authors Anu Vazhayil, Vinayakumar R, Soman KP
Abstract Knowledge of protein function is necessary because it gives a clear picture of biological processes. Nevertheless, many protein sequences have been found and added to the databases but lack functional annotation, and laboratory annotation of sequences takes a considerable amount of time. This gives rise to the need for computational techniques to classify proteins based on their functions. In our work, we collected data from Swiss-Prot containing 40433 proteins grouped into 30 families. We pass the sequences to recurrent neural network (RNN), long short-term memory (LSTM) and gated recurrent unit (GRU) models, and compare them with deep and shallow neural networks trained on trigram features of the same dataset. Through this approach, we achieve a maximum of around 78% accuracy for the classification of protein families.
Tasks
Published 2018-09-11
URL http://arxiv.org/abs/1809.04461v1
PDF http://arxiv.org/pdf/1809.04461v1.pdf
PWC https://paperswithcode.com/paper/deepproteomics-protein-family-classification
Repo
Framework
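
A minimal sketch of the kind of pipeline the abstract describes: character trigrams over amino-acid strings fed to a shallow feed-forward classifier. This is an illustrative assumption, not the authors' code; the sequences, labels, and layer sizes below are placeholders, and the RNN/LSTM/GRU models compared in the paper are not shown.

```python
# Trigram bag-of-words over protein sequences + a shallow neural classifier (sketch).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neural_network import MLPClassifier

# Toy sequences and made-up family labels (placeholders, not Swiss-Prot data).
sequences = ["MKVLAAGIVLLL", "MKVLATGIVALL", "GHHEELLKKAQA", "GHHEALLKKSQA"]
families = ["family_A", "family_A", "family_B", "family_B"]

# Character trigrams play the role of protein "words".
vectorizer = CountVectorizer(analyzer="char", ngram_range=(3, 3), lowercase=False)
X = vectorizer.fit_transform(sequences)

# A single hidden layer stands in for the shallow network.
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
clf.fit(X, families)
print(clf.predict(vectorizer.transform(["MKVLAAGIVALL"])))
```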

End-to-End Streaming Keyword Spotting

Title End-to-End Streaming Keyword Spotting
Authors Alvarez Raziel, Park Hyun-Jin
Abstract We present a system for keyword spotting that, except for a frontend component for feature generation, is entirely contained in a deep neural network (DNN) model trained “end-to-end” to predict the presence of the keyword in a stream of audio. The main contributions of this work are, first, an efficient memoized neural network topology that aims to make better use of the parameters and associated computations in the DNN by holding a memory of previous activations distributed over the depth of the DNN, and second, a method to train the DNN, end-to-end, to produce the keyword spotting score. This system significantly outperforms previous approaches in terms of both detection quality and model size and computation.
Tasks Keyword Spotting
Published 2018-12-06
URL http://arxiv.org/abs/1812.02802v2
PDF http://arxiv.org/pdf/1812.02802v2.pdf
PWC https://paperswithcode.com/paper/end-to-end-streaming-keyword-spotting
Repo
Framework
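
A rough sketch of the "memory of previous activations" idea in a streaming setting: a layer keeps a ring buffer of its last few per-frame activations so each new audio frame only triggers incremental computation. This is an assumption about the general mechanism, not the paper's topology; the layer name, dimensions, and aggregation below are invented for illustration.

```python
import torch
import torch.nn as nn

class MemoizedLayer(nn.Module):
    def __init__(self, in_dim, out_dim, memory=8):
        super().__init__()
        self.frame_proj = nn.Linear(in_dim, out_dim)        # per-frame feature
        self.time_proj = nn.Linear(memory, 1, bias=False)   # combine memory window
        self.register_buffer("memory_buf", torch.zeros(memory, out_dim))

    def forward(self, frame):
        # frame: (in_dim,) features of the newest audio frame.
        act = torch.relu(self.frame_proj(frame))
        # Shift the memory and append the new activation (the "memoization").
        self.memory_buf = torch.cat([self.memory_buf[1:], act.unsqueeze(0)], dim=0)
        # Aggregate over the stored window to produce the streaming output.
        return self.time_proj(self.memory_buf.t()).squeeze(-1)

layer = MemoizedLayer(in_dim=40, out_dim=16)
for _ in range(5):                       # simulate a stream of 40-dim frames
    out = layer(torch.randn(40))
print(out.shape)                         # torch.Size([16])
```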

Tattoo Image Search at Scale: Joint Detection and Compact Representation Learning

Title Tattoo Image Search at Scale: Joint Detection and Compact Representation Learning
Authors Hu Han, Jie Li, Anil K. Jain, Shiguang Shan, Xilin Chen
Abstract The explosive growth of digital images in video surveillance and social media has led to the significant need for efficient search of persons of interest in law enforcement and forensic applications. Despite tremendous progress in primary biometric traits (e.g., face and fingerprint) based person identification, a single biometric trait alone cannot meet the desired recognition accuracy in forensic scenarios. Tattoos, as one of the important soft biometric traits, have been found to be valuable for assisting in person identification. However, tattoo search in a large collection of unconstrained images remains a difficult problem, and existing tattoo search methods mainly focus on matching cropped tattoos, which is different from real application scenarios. To close the gap, we propose an efficient tattoo search approach that is able to learn tattoo detection and compact representation jointly in a single convolutional neural network (CNN) via multi-task learning. While the features in the backbone network are shared by both tattoo detection and compact representation learning, individual latent layers of each sub-network optimize the shared features toward the detection and feature learning tasks, respectively. We resolve the small batch size issue inside the joint tattoo detection and compact representation learning network via random image stitch and preceding feature buffering. We evaluate the proposed tattoo search system using multiple public-domain tattoo benchmarks, and a gallery set with about 300K distracter tattoo images compiled from these datasets and images from the Internet. In addition, we also introduce a tattoo sketch dataset containing 300 tattoos for sketch-based tattoo search. Experimental results show that the proposed approach has superior performance in tattoo detection and tattoo search at scale compared to several state-of-the-art tattoo retrieval algorithms.
Tasks Image Retrieval, Multi-Task Learning, Person Identification, Representation Learning
Published 2018-11-01
URL http://arxiv.org/abs/1811.00218v1
PDF http://arxiv.org/pdf/1811.00218v1.pdf
PWC https://paperswithcode.com/paper/tattoo-image-search-at-scale-joint-detection
Repo
Framework
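
An illustrative sketch of the joint design described above: a shared backbone whose features feed both a detection head and a compact-embedding head. This is an assumption about the multi-task structure only; the real system uses a full detection sub-network and the random-image-stitch / feature-buffering tricks, none of which are shown here.

```python
import torch
import torch.nn as nn

class JointTattooNet(nn.Module):
    def __init__(self, embed_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(             # stand-in for the shared CNN
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.det_head = nn.Linear(32, 5)           # tattoo score + 4 box coordinates
        self.embed_head = nn.Sequential(           # compact representation for search
            nn.Linear(32, embed_dim), nn.Tanh())

    def forward(self, images):
        feats = self.backbone(images)              # features shared by both tasks
        return self.det_head(feats), self.embed_head(feats)

net = JointTattooNet()
det, emb = net(torch.randn(2, 3, 64, 64))
print(det.shape, emb.shape)   # torch.Size([2, 5]) torch.Size([2, 128])
```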

Color naming guided intrinsic image decomposition

Title Color naming guided intrinsic image decomposition
Authors Yuanliu Liu, Zejian Yuan
Abstract Intrinsic image decomposition is a severely under-constrained problem. User interactions can help to reduce the ambiguity of the decomposition considerably. The traditional form of user interaction is to draw scribbles that indicate regions with constant reflectance or shading. However, the effective scope of each scribble is quite limited, so dozens of scribbles are often needed to rectify the whole decomposition, which is time consuming. In this paper we propose an efficient form of user interaction in which users need only annotate the color composition of the image. Color composition reveals the global distribution of reflectance, so it can help to adapt the whole decomposition directly. We build a generative model of the process by which the albedo of the material produces both the reflectance, through imaging, and the color labels, through color naming. Our model effectively fuses the physical properties of image formation and the top-down information from human color perception. Experimental results show that color naming can improve the performance of intrinsic image decomposition, especially in cleaning up the shadows left in reflectance and solving the color constancy problem.
Tasks Color Constancy, Intrinsic Image Decomposition
Published 2018-10-23
URL http://arxiv.org/abs/1810.09720v1
PDF http://arxiv.org/pdf/1810.09720v1.pdf
PWC https://paperswithcode.com/paper/color-naming-guided-intrinsic-image
Repo
Framework

Deep Appearance Models for Face Rendering

Title Deep Appearance Models for Face Rendering
Authors Stephen Lombardi, Jason Saragih, Tomas Simon, Yaser Sheikh
Abstract We introduce a deep appearance model for rendering the human face. Inspired by Active Appearance Models, we develop a data-driven rendering pipeline that learns a joint representation of facial geometry and appearance from a multiview capture setup. Vertex positions and view-specific textures are modeled using a deep variational autoencoder that captures complex nonlinear effects while producing a smooth and compact latent representation. View-specific texture enables the modeling of view-dependent effects such as specularity. It can also correct for imperfect geometry stemming from biased or low resolution estimates. This is a significant departure from the traditional graphics pipeline, which requires highly accurate geometry as well as all elements of the shading model to achieve realism through physically-inspired light transport. Acquiring such a high level of accuracy is difficult in practice, especially for complex and intricate parts of the face, such as eyelashes and the oral cavity. These are handled naturally by our approach, which does not rely on precise estimates of geometry. Instead, the shading model accommodates deficiencies in geometry through the flexibility afforded by the neural network employed. At inference time, we condition the decoding network on the viewpoint of the camera in order to generate the appropriate texture for rendering. The resulting system can be implemented simply using existing rendering engines through dynamic textures with flat lighting. This representation, together with a novel unsupervised technique for mapping images to facial states, results in a system that is naturally suited to real-time interactive settings such as Virtual Reality (VR).
Tasks
Published 2018-08-01
URL http://arxiv.org/abs/1808.00362v1
PDF http://arxiv.org/pdf/1808.00362v1.pdf
PWC https://paperswithcode.com/paper/deep-appearance-models-for-face-rendering
Repo
Framework
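
A minimal sketch of the decoding side of the idea: a latent face code, conditioned on camera viewpoint, is decoded into vertex positions and a view-specific texture. This is an assumption for illustration only; the class name, layer sizes, and tensor shapes are invented, and the encoder, VAE training, and rendering are not shown.

```python
import torch
import torch.nn as nn

class ViewConditionedDecoder(nn.Module):
    def __init__(self, latent_dim=128, view_dim=3, n_verts=100, tex_size=32):
        super().__init__()
        self.geom = nn.Linear(latent_dim, n_verts * 3)       # vertex positions
        self.tex = nn.Sequential(                             # view-specific texture
            nn.Linear(latent_dim + view_dim, 256), nn.ReLU(),
            nn.Linear(256, 3 * tex_size * tex_size))
        self.n_verts, self.tex_size = n_verts, tex_size

    def forward(self, z, view_dir):
        verts = self.geom(z).view(-1, self.n_verts, 3)
        tex = self.tex(torch.cat([z, view_dir], dim=-1))      # condition on viewpoint
        return verts, tex.view(-1, 3, self.tex_size, self.tex_size)

dec = ViewConditionedDecoder()
verts, tex = dec(torch.randn(1, 128), torch.randn(1, 3))
print(verts.shape, tex.shape)  # torch.Size([1, 100, 3]) torch.Size([1, 3, 32, 32])
```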

Class label autoencoder for zero-shot learning

Title Class label autoencoder for zero-shot learning
Authors Guangfeng Lin, Caixia Fan, Wanjun Chen, Yajun Chen, Fan Zhao
Abstract Existing zero-shot learning (ZSL) methods usually learn a projection function between a feature space and a semantic embedding space (text or attribute space) on the training seen classes or testing unseen classes. However, such a projection function cannot be used between the feature space and multiple semantic embedding spaces, which have the diversity needed to describe different semantic information about the same class. To deal with this issue, we present a novel method for ZSL based on learning a class label autoencoder (CLA). CLA can not only build a uniform framework for adapting to multiple semantic embedding spaces, but also construct an encoder-decoder mechanism that constrains the bidirectional projection between the feature space and the class label space. Moreover, CLA can jointly consider the relationship of the feature classes and the relevance of the semantic classes to improve zero-shot classification. The CLA solution can provide both unseen class labels and the relation between the different class representations (feature or semantic information), which can encode the intrinsic structure of classes. Extensive experiments demonstrate that CLA outperforms state-of-the-art methods on four benchmark datasets: AwA, CUB, Dogs and ImNet-2.
Tasks Zero-Shot Learning
Published 2018-01-25
URL http://arxiv.org/abs/1801.08301v1
PDF http://arxiv.org/pdf/1801.08301v1.pdf
PWC https://paperswithcode.com/paper/class-label-autoencoder-for-zero-shot
Repo
Framework
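
A toy sketch of the encoder-decoder constraint the abstract describes: a tied-weight linear map is trained to project visual features into a class semantic space and back, and unseen classes are then labeled by the nearest semantic prototype. All data, dimensions, and the training recipe are invented assumptions, not the authors' CLA formulation.

```python
import torch

torch.manual_seed(0)
d_feat, d_sem, n = 50, 10, 200
X = torch.randn(n, d_feat)                        # seen-class visual features (toy)
S = torch.randn(n, d_sem)                         # their class semantic vectors (toy)
W = torch.zeros(d_feat, d_sem, requires_grad=True)
opt = torch.optim.Adam([W], lr=0.05)

for _ in range(300):
    opt.zero_grad()
    enc = X @ W                                   # feature -> class label/semantic space
    dec = enc @ W.t()                             # semantic -> feature (tied weights)
    loss = ((enc - S) ** 2).mean() + ((dec - X) ** 2).mean()
    loss.backward()
    opt.step()

# Zero-shot step: project a test feature and pick the closest unseen prototype.
unseen_protos = torch.randn(5, d_sem)             # semantic vectors of unseen classes
x_test = torch.randn(1, d_feat)
scores = torch.cosine_similarity(x_test @ W, unseen_protos)
print("predicted unseen class:", scores.argmax().item())
```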

Grounding the Experience of a Visual Field through Sensorimotor Contingencies

Title Grounding the Experience of a Visual Field through Sensorimotor Contingencies
Authors Alban Laflaquière
Abstract Artificial perception is traditionally handled by hand-designing task-specific algorithms. However, a truly autonomous robot should develop perceptive abilities on its own, by interacting with its environment and adapting to new situations. The sensorimotor contingencies theory proposes to ground the development of those perceptive abilities in the way the agent can actively transform its sensory inputs. We propose a sensorimotor approach, inspired by this theory, in which the agent explores the world and discovers its properties by capturing the sensorimotor regularities they induce. This work presents an application of this approach to the discovery of a so-called visual field as the set of regularities that a visual sensor imposes on a naive agent’s experience. A formalism is proposed to describe how those regularities can be captured in a sensorimotor predictive model. Finally, the approach is evaluated on a simulated system coarsely inspired by the human retina.
Tasks
Published 2018-10-03
URL http://arxiv.org/abs/1810.01871v1
PDF http://arxiv.org/pdf/1810.01871v1.pdf
PWC https://paperswithcode.com/paper/grounding-the-experience-of-a-visual-field
Repo
Framework

On the practice of classification learning for clinical diagnosis and therapy advice in oncology

Title On the practice of classification learning for clinical diagnosis and therapy advice in oncology
Authors Flavio S Correa da Silva, Frederico P Costa, Antonio F Iemma
Abstract Artificial intelligence and medicine have a longstanding and proficuous relationship. In the present work we develop a brief assessment of this relationship with specific focus on machine learning, in which we highlight some critical points which may hinder the use of machine learning techniques for clinical diagnosis and therapy advice in practice. We then suggest a conceptual framework to build successful systems to aid clinical diagnosis and therapy advice, grounded on a novel concept we have coined drifting domains. We focus on oncology to build our arguments, as this area of medicine furnishes strong evidence for the critical points we take into account here.
Tasks
Published 2018-11-12
URL http://arxiv.org/abs/1811.04854v1
PDF http://arxiv.org/pdf/1811.04854v1.pdf
PWC https://paperswithcode.com/paper/on-the-practice-of-classification-learning
Repo
Framework

Attack RMSE Leaderboard: An Introduction and Case Study

Title Attack RMSE Leaderboard: An Introduction and Case Study
Authors Cong Xie
Abstract In this manuscript, we briefly introduce several tricks to climb the leaderboards which use RMSE for evaluation without exploiting any training data.
Tasks
Published 2018-02-14
URL http://arxiv.org/abs/1802.04947v1
PDF http://arxiv.org/pdf/1802.04947v1.pdf
PWC https://paperswithcode.com/paper/attack-rmse-leaderboard-an-introduction-and
Repo
Framework
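
The manuscript's specific tricks are not reproduced here, but the classic example of this family of attacks is recovering label statistics from constant-valued probes, since RMSE(c)² = E[y²] − 2c·E[y] + c². The simulation below (an assumption about the kind of trick meant) shows that two constant submissions suffice to recover the mean of the hidden test labels without touching any training data.

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(3.0, 2.0, size=10_000)            # hidden test labels (simulated)

def leaderboard_rmse(pred):
    """Stand-in for the leaderboard: returns RMSE against the hidden labels."""
    return float(np.sqrt(np.mean((y - pred) ** 2)))

r0 = leaderboard_rmse(np.zeros_like(y))          # probe 1: all zeros -> sqrt(E[y^2])
r1 = leaderboard_rmse(np.ones_like(y))           # probe 2: all ones
# With c = 1: RMSE(1)^2 = E[y^2] - 2*E[y] + 1, so
mean_estimate = (r0 ** 2 + 1.0 - r1 ** 2) / 2.0
print(mean_estimate, y.mean())                   # both close to 3.0
```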

Explainable Black-Box Attacks Against Model-based Authentication

Title Explainable Black-Box Attacks Against Model-based Authentication
Authors Washington Garcia, Joseph I. Choi, Suman K. Adari, Somesh Jha, Kevin R. B. Butler
Abstract Establishing unique identities for both humans and end systems has been an active research problem in the security community, giving rise to innovative machine learning-based authentication techniques. Although such techniques offer an automated method to establish identity, they have not been vetted against sophisticated attacks that target their core machine learning technique. This paper demonstrates that mimicking the unique signatures generated by host fingerprinting and biometric authentication systems is possible. We expose the ineffectiveness of underlying machine learning classification models by constructing a blind attack based around the query synthesis framework and utilizing Explainable-AI (XAI) techniques. We launch an attack in under 130 queries on a state-of-the-art face authentication system, and under 100 queries on a host authentication system. We examine how these attacks can be defended against and explore their limitations. XAI provides an effective means for adversaries to infer decision boundaries and provides a new way forward in constructing attacks against systems using machine learning models for authentication.
Tasks
Published 2018-09-28
URL http://arxiv.org/abs/1810.00024v1
PDF http://arxiv.org/pdf/1810.00024v1.pdf
PWC https://paperswithcode.com/paper/explainable-black-box-attacks-against-model
Repo
Framework

Cyberbullying Detection – Technical Report 2/2018, Department of Computer Science AGH, University of Science and Technology

Title Cyberbullying Detection – Technical Report 2/2018, Department of Computer Science AGH, University of Science and Technology
Authors Michał Ptaszyński, Gniewosz Leliwa, Mateusz Piech, Aleksander Smywiński-Pohl
Abstract The research described in this paper concerns automatic cyberbullying detection in social media. There are two goals: building a gold-standard cyberbullying detection dataset and measuring the performance of the Samurai cyberbullying detection system. The Formspring dataset provided in a Kaggle competition was re-annotated as part of the research. The annotation procedure is described in detail and, unlike many other recent data annotation initiatives, did not use Mechanical Turk to find people willing to perform the annotation. The new annotation appears to be more coherent than the old one, since all tested cyberbullying detection systems performed better on it. The performance of the Samurai system is compared with five commercial systems and one well-known machine learning algorithm used for classifying textual content, namely fastText. It turns out that Samurai scores best on all measures (accuracy, precision and recall), while fastText is the second-best performing algorithm.
Tasks
Published 2018-08-02
URL http://arxiv.org/abs/1808.00926v1
PDF http://arxiv.org/pdf/1808.00926v1.pdf
PWC https://paperswithcode.com/paper/cyberbullying-detection-technical-report
Repo
Framework
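
Samurai and the commercial systems compared in the report are not publicly available, and the authors' exact setup is not shown here; the sketch below is just a generic fastText-like bag-of-n-grams linear baseline for this kind of text classification, with toy placeholder messages and labels.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["you are awesome", "nobody likes you, loser",
         "great game yesterday", "shut up, you idiot"]   # toy placeholder data
labels = [0, 1, 0, 1]                                    # 1 = cyberbullying

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),   # word uni-/bi-grams, fastText-style features
    LogisticRegression())
model.fit(texts, labels)
print(model.predict(["you are a loser"]))  # classify a new message
```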

Noisy Expectation-Maximization: Applications and Generalizations

Title Noisy Expectation-Maximization: Applications and Generalizations
Authors Osonde Osoba, Bart Kosko
Abstract We present a noise-injected version of the Expectation-Maximization (EM) algorithm: the Noisy Expectation-Maximization (NEM) algorithm. The NEM algorithm uses noise to speed up the convergence of the EM algorithm. The NEM theorem shows that injected noise speeds up the average convergence of the EM algorithm to a local maximum of the likelihood surface if a positivity condition holds. The generalized form of the NEM algorithm allows for arbitrary modes of noise injection, including additive and multiplicative noise applied to the data. We demonstrate these noise benefits on EM algorithms for the Gaussian mixture model (GMM) with both additive and multiplicative NEM noise injection. A separate theorem (not presented here) shows that the noise benefit for independent identically distributed additive noise decreases with sample size in mixture models. This theorem implies that the noise benefit is most pronounced if the data is sparse. Injecting blind noise only slowed convergence.
Tasks
Published 2018-01-12
URL http://arxiv.org/abs/1801.04053v1
PDF http://arxiv.org/pdf/1801.04053v1.pdf
PWC https://paperswithcode.com/paper/noisy-expectation-maximization-applications
Repo
Framework
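
A rough, simplified sketch of additive NEM for a one-dimensional two-component Gaussian mixture: candidate noise is added to a sample only when it passes the GMM-NEM screening condition n·(n − 2(μⱼ − y)) ≤ 0 for every component j, and the noise scale is annealed toward zero. The data, initialization, and annealing schedule are assumptions for illustration, not the authors' experimental setup.

```python
import numpy as np

rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(-2, 1, 500), rng.normal(3, 1, 500)])  # toy data

mu = np.array([-1.0, 1.0])
sigma = np.array([1.0, 1.0])
pi = np.array([0.5, 0.5])

def gauss(x, m, s):
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

for t in range(50):
    # --- NEM noise injection: annealed additive noise, screened per sample ---
    scale = 0.5 / (t + 1)                        # decaying noise level
    n = rng.normal(0.0, scale, size=y.shape)
    ok = np.all(n[:, None] * (n[:, None] - 2 * (mu[None, :] - y[:, None])) <= 0, axis=1)
    y_noisy = y + np.where(ok, n, 0.0)

    # --- E-step on the noisy data ---
    resp = pi * gauss(y_noisy[:, None], mu, sigma)
    resp /= resp.sum(axis=1, keepdims=True)

    # --- M-step ---
    nk = resp.sum(axis=0)
    mu = (resp * y_noisy[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((resp * (y_noisy[:, None] - mu) ** 2).sum(axis=0) / nk)
    pi = nk / len(y)

print(mu, sigma, pi)   # means should approach roughly -2 and 3
```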

Diffusion Scattering Transforms on Graphs

Title Diffusion Scattering Transforms on Graphs
Authors Fernando Gama, Alejandro Ribeiro, Joan Bruna
Abstract Stability is a key aspect of data analysis. In many applications, the natural notion of stability is geometric, as illustrated for example in computer vision. Scattering transforms construct deep convolutional representations which are certified stable to input deformations. This stability to deformations can be interpreted as stability with respect to changes in the metric structure of the domain. In this work, we show that scattering transforms can be generalized to non-Euclidean domains using diffusion wavelets, while preserving a notion of stability with respect to metric changes in the domain, measured with diffusion maps. The resulting representation is stable to metric perturbations of the domain while being able to capture “high-frequency” information, akin to the Euclidean Scattering.
Tasks
Published 2018-06-22
URL http://arxiv.org/abs/1806.08829v2
PDF http://arxiv.org/pdf/1806.08829v2.pdf
PWC https://paperswithcode.com/paper/diffusion-scattering-transforms-on-graphs
Repo
Framework
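
A minimal, simplified sketch of a diffusion scattering representation on a graph: diffusion wavelets ψⱼ = T^(2^(j-1)) − T^(2^j) built from the lazy diffusion operator T, a modulus nonlinearity, and low-pass averaging. The random graph, the use of a plain mean as the low-pass operator, and the unrestricted second-order cascade are simplifying assumptions, not the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
A = (rng.random((n, n)) < 0.2).astype(float)
A = np.triu(A, 1)
A = A + A.T                                          # random undirected graph
d = np.maximum(A.sum(axis=1), 1.0)
T = 0.5 * (np.eye(n) + A / np.sqrt(np.outer(d, d)))  # lazy diffusion operator

def wavelets(J):
    powers = {0: np.eye(n)}
    for k in range(1, 2 ** J + 1):
        powers[k] = powers[k - 1] @ T
    return [powers[2 ** (j - 1)] - powers[2 ** j] for j in range(1, J + 1)]

x = rng.standard_normal(n)                           # graph signal
psi = wavelets(J=3)
avg = lambda v: float(v.mean())                      # simple low-pass average

order0 = [avg(x)]
order1 = [avg(np.abs(W @ x)) for W in psi]
order2 = [avg(np.abs(W2 @ np.abs(W1 @ x))) for W1 in psi for W2 in psi]
print(len(order0) + len(order1) + len(order2), "scattering coefficients")
```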

Image Generation from Layout

Title Image Generation from Layout
Authors Bo Zhao, Lili Meng, Weidong Yin, Leonid Sigal
Abstract Despite significant recent progress on generative models, controlled generation of images depicting multiple, complex object layouts is still a difficult problem. Among the core challenges are the diversity of appearance a given object may possess and, as a result, the exponentially large set of images consistent with a specified layout. To address these challenges, we propose a novel approach for layout-based image generation; we call it Layout2Im. Given a coarse spatial layout (bounding boxes + object categories), our model can generate a set of realistic images which have the correct objects in the desired locations. The representation of each object is disentangled into a specified/certain part (category) and an unspecified/uncertain part (appearance). The category is encoded using a word embedding and the appearance is distilled into a low-dimensional vector sampled from a normal distribution. Individual object representations are composed together using a convolutional LSTM to obtain an encoding of the complete layout, which is then decoded to an image. Several loss terms are introduced to encourage accurate and diverse generation. The proposed Layout2Im model significantly outperforms the previous state of the art, boosting the best reported inception score by 24.66% and 28.57% on the very challenging COCO-Stuff and Visual Genome datasets, respectively. Extensive experiments also demonstrate our method’s ability to generate complex and diverse images with multiple objects.
Tasks Layout-to-Image Generation
Published 2018-11-28
URL https://arxiv.org/abs/1811.11389v3
PDF https://arxiv.org/pdf/1811.11389v3.pdf
PWC https://paperswithcode.com/paper/image-generation-from-layout
Repo
Framework
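
A heavily simplified sketch of the object encoding described above: each object is a category word embedding plus an appearance vector sampled from a normal distribution, painted into its bounding box, and the composed layout map is decoded into an image. The paper composes object representations with a convolutional LSTM and trains with GAN-style losses; plain summation and an untrained toy decoder are used here purely for illustration.

```python
import torch
import torch.nn as nn

n_categories, emb_dim, z_dim, H, W = 10, 16, 8, 64, 64
embed = nn.Embedding(n_categories, emb_dim)
decoder = nn.Sequential(
    nn.Conv2d(emb_dim + z_dim, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1), nn.Tanh())      # toy image decoder

def compose(objects):
    canvas = torch.zeros(1, emb_dim + z_dim, H, W)
    for category, (x0, y0, x1, y1) in objects:
        code = torch.cat([embed(torch.tensor(category)),   # what the object is
                          torch.randn(z_dim)])             # sampled appearance
        canvas[0, :, y0:y1, x0:x1] += code[:, None, None]  # paint into its box
    return canvas

layout = [(3, (5, 5, 30, 40)), (7, (35, 20, 60, 60))]      # (category, bbox)
image = decoder(compose(layout))
print(image.shape)                                         # torch.Size([1, 3, 64, 64])
```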

Safe Motion Planning in Unknown Environments: Optimality Benchmarks and Tractable Policies

Title Safe Motion Planning in Unknown Environments: Optimality Benchmarks and Tractable Policies
Authors Lucas Janson, Tommy Hu, Marco Pavone
Abstract This paper addresses the problem of planning a safe (i.e., collision-free) trajectory from an initial state to a goal region when the obstacle space is a-priori unknown and is incrementally revealed online, e.g., through line-of-sight perception. Despite its ubiquitous nature, this formulation of motion planning has received relatively little theoretical investigation, as opposed to the setup where the environment is assumed known. A fundamental challenge is that, unlike motion planning with known obstacles, it is not even clear what an optimal policy to strive for is. Our contribution is threefold. First, we present a notion of optimality for safe planning in unknown environments in the spirit of comparative (as opposed to competitive) analysis, with the goal of obtaining a benchmark that is, at least conceptually, attainable. Second, by leveraging this theoretical benchmark, we derive a pseudo-optimal class of policies that can seamlessly incorporate any amount of prior or learned information while still guaranteeing the robot never collides. Finally, we demonstrate the practicality of our algorithmic approach in numerical experiments using a range of environment types and dynamics, including a comparison with a state of the art method. A key aspect of our framework is that it automatically and implicitly weighs exploration versus exploitation in a way that is optimal with respect to the information available.
Tasks Motion Planning
Published 2018-04-16
URL http://arxiv.org/abs/1804.05804v1
PDF http://arxiv.org/pdf/1804.05804v1.pdf
PWC https://paperswithcode.com/paper/safe-motion-planning-in-unknown-environments
Repo
Framework
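
A toy sketch in the same spirit as the "never collides" guarantee discussed above, not the paper's policy class: in one dimension, only commit to a velocity whose worst-case stopping distance fits inside the currently perceived free space, whatever the unrevealed part of the environment contains. The deceleration limit and distances below are invented example values.

```python
def max_safe_speed(free_distance, a_max=2.0):
    """Largest speed whose stopping distance v^2 / (2*a_max) fits in free_distance."""
    return (2.0 * a_max * max(free_distance, 0.0)) ** 0.5

def safe_velocity_command(desired_speed, sensed_free_distance):
    # Clip the nominal command so a full brake always ends inside known-free space.
    return min(desired_speed, max_safe_speed(sensed_free_distance))

print(safe_velocity_command(desired_speed=5.0, sensed_free_distance=3.0))  # ~3.46
```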