October 21, 2019

2993 words 15 mins read

Paper Group AWR 141

FDFNet : A Secure Cancelable Deep Finger Dorsal Template Generation Network Secured via. Bio-Hashing

Title FDFNet : A Secure Cancelable Deep Finger Dorsal Template Generation Network Secured via. Bio-Hashing
Authors Avantika Singh, Ashish Arora, Shreya Hasmukh Patel, Gaurav Jaswal, Aditya Nigam
Abstract The present online and digital world consistently poses challenging problems and scenarios. As in the physical world, personal identity management is crucial for any secure online system. The last decade has seen extensive work in this area using biometrics such as face, fingerprint and iris. Several vulnerabilities remain, however, and the problem of compromised biometrics must be taken seriously, since biometric traits cannot be easily modified once compromised. In this work, we propose a secure cancelable finger dorsal template generation network (learning domain-specific features) secured via Bio-Hashing. The proposed system effectively protects the original finger dorsal images by revoking a compromised template and reassigning a new one. A novel Finger-Dorsal Feature Extraction Net (FDFNet) is proposed for extracting discriminative features. This network is trained exclusively on trait-specific features without using any pre-trained architecture. Bio-Hashing, a technique based on assigning a tokenized random number to each user, is then used to hash the features extracted from FDFNet. To test the performance of the proposed architecture, we evaluate it on two public benchmark finger knuckle datasets: PolyU FKP and PolyU Contactless FKI. The experimental results show the effectiveness of the proposed system in terms of security and accuracy.
Tasks
Published 2018-12-13
URL http://arxiv.org/abs/1812.05308v1
PDF http://arxiv.org/pdf/1812.05308v1.pdf
PWC https://paperswithcode.com/paper/fdfnet-a-secure-cancelable-deep-finger-dorsal
Repo https://github.com/ashisharora010/FDFNet
Framework tf
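
The Bio-Hashing step described in the abstract is, at its core, a user-keyed random projection followed by binarization. Below is a minimal sketch of that generic idea in NumPy; the token-seeded basis, threshold, and dimensions are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def bio_hash(features, user_token, code_bits=64, threshold=0.0):
    """Generic Bio-Hashing sketch: project a real-valued feature vector
    onto a user-specific random basis and binarize the result."""
    rng = np.random.default_rng(user_token)          # tokenized random number seeds the basis
    basis = rng.standard_normal((code_bits, features.shape[0]))
    basis, _ = np.linalg.qr(basis.T)                 # orthonormalize the projection directions
    projected = features @ basis[:, :code_bits]      # user-keyed random projection
    return (projected > threshold).astype(np.uint8)  # cancelable binary template

# Example: hash a hypothetical 512-d FDFNet feature vector for user token 1234
template = bio_hash(np.random.rand(512), user_token=1234)
```

Revoking a compromised template then amounts to issuing the user a new token, which induces a new projection basis.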

End-to-End Incremental Learning

Title End-to-End Incremental Learning
Authors Francisco M. Castro, Manuel J. Marín-Jiménez, Nicolás Guil, Cordelia Schmid, Karteek Alahari
Abstract Although deep learning approaches have stood out in recent years due to their state-of-the-art results, they continue to suffer from catastrophic forgetting, a dramatic decrease in overall performance when training with new classes added incrementally. This is due to current neural network architectures requiring the entire dataset, consisting of all the samples from the old as well as the new classes, to update the model, a requirement that becomes easily unsustainable as the number of classes grows. We address this issue with our approach to learn deep neural networks incrementally, using new data and only a small exemplar set corresponding to samples from the old classes. This is based on a loss composed of a distillation measure to retain the knowledge acquired from the old classes, and a cross-entropy loss to learn the new classes. Our incremental training is achieved while keeping the entire framework end-to-end, i.e., learning the data representation and the classifier jointly, unlike recent methods with no such guarantees. We evaluate our method extensively on the CIFAR-100 and ImageNet (ILSVRC 2012) image classification datasets, and show state-of-the-art performance.
Tasks Image Classification
Published 2018-07-25
URL http://arxiv.org/abs/1807.09536v2
PDF http://arxiv.org/pdf/1807.09536v2.pdf
PWC https://paperswithcode.com/paper/end-to-end-incremental-learning
Repo https://github.com/kibok90/iccv2019-inc
Framework pytorch
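
The loss described above combines a distillation term on the old-class outputs with a cross-entropy term over all classes. A minimal PyTorch sketch of one common way to combine them is below; the temperature, weighting, and distillation formulation are illustrative assumptions rather than the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def incremental_loss(new_logits, old_logits, targets, n_old_classes, T=2.0, alpha=1.0):
    """Cross-entropy on all classes plus distillation on the old classes."""
    # Cross-entropy over old + new classes for the current labels
    ce = F.cross_entropy(new_logits, targets)
    # Distillation: match the softened old-class outputs of the previous model
    p_old = F.log_softmax(new_logits[:, :n_old_classes] / T, dim=1)
    q_old = F.softmax(old_logits[:, :n_old_classes] / T, dim=1)
    kd = F.kl_div(p_old, q_old, reduction="batchmean") * (T * T)
    return ce + alpha * kd
```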

Superpixel-enhanced Pairwise Conditional Random Field for Semantic Segmentation

Title Superpixel-enhanced Pairwise Conditional Random Field for Semantic Segmentation
Authors Li Sulimowicz, Ishfaq Ahmad, Alexander Aved
Abstract Superpixel-based Higher-order Conditional Random Fields (CRFs) are effective in enforcing long-range consistency in pixel-wise labeling problems, such as semantic segmentation. However, their major shortcoming is the considerably longer time needed to learn higher-order potentials, along with extra hyperparameters and/or weights, compared with pairwise models. This paper proposes a superpixel-enhanced pairwise CRF framework that consists of the conventional pairwise as well as our proposed superpixel-enhanced pairwise (SP-Pairwise) potentials. SP-Pairwise potentials incorporate the superpixel-based higher-order cues by conditioning on a segment-filtered image and share the same set of parameters as the conventional pairwise potentials. Therefore, the proposed superpixel-enhanced pairwise CRF has a lower time complexity in parameter learning and at the same time outperforms the higher-order CRF in terms of inference accuracy. Moreover, the new scheme takes advantage of pre-trained pairwise models by reusing their parameters and/or weights, which provides a significant accuracy boost on the basis of CRF-RNN even without training. Experiments on the MSRC-21 and PASCAL VOC 2012 datasets confirm the effectiveness of our method.
Tasks Semantic Segmentation
Published 2018-05-29
URL http://arxiv.org/abs/1805.11737v1
PDF http://arxiv.org/pdf/1805.11737v1.pdf
PWC https://paperswithcode.com/paper/superpixel-enhanced-pairwise-conditional
Repo https://github.com/liyin2015/superpixel_crfasrnn
Framework tf
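
The SP-Pairwise potential conditions on a "segment-filtered" image, i.e., an image smoothed within superpixels. A rough sketch of producing such an image with SLIC superpixels is shown below; the use of scikit-image and per-superpixel mean filtering is an illustrative assumption, not necessarily the authors' exact preprocessing.

```python
import numpy as np
from skimage.segmentation import slic

def segment_filtered_image(image, n_segments=400):
    """Replace every RGB pixel by the mean color of its superpixel."""
    segments = slic(image, n_segments=n_segments)
    filtered = np.zeros_like(image, dtype=np.float64)
    for label in np.unique(segments):
        mask = segments == label
        filtered[mask] = image[mask].mean(axis=0)   # mean color within the superpixel
    return filtered
```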

Characterising epithelial tissues using persistent entropy

Title Characterising epithelial tissues using persistent entropy
Authors N. Atienza, L. M. Escudero, M. J. Jimenez, M. Soriano-Trigueros
Abstract In this paper, we apply persistent entropy, a novel topological statistic, to the characterization of images of epithelial tissues. We found that persistent entropy is able to summarize the topological and geometric information encoded by α-complexes and persistent homology. Using statistical tests, we confirm the existence of significant differences between the studied tissues.
Tasks
Published 2018-10-13
URL http://arxiv.org/abs/1810.05835v1
PDF http://arxiv.org/pdf/1810.05835v1.pdf
PWC https://paperswithcode.com/paper/characterising-epithelial-tissues-using
Repo https://github.com/Cimagroup/Persistent-Entropy-and-Epithelial-Tissues
Framework none
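
Persistent entropy is the Shannon entropy of the normalized lifetimes of the bars in a persistence barcode. A minimal sketch, assuming finite (birth, death) intervals:

```python
import numpy as np

def persistent_entropy(intervals):
    """Shannon entropy of normalized bar lengths of a persistence barcode.

    intervals: iterable of finite (birth, death) pairs.
    """
    lengths = np.array([d - b for b, d in intervals], dtype=np.float64)
    p = lengths / lengths.sum()            # probability assigned to each bar
    return float(-(p * np.log(p)).sum())   # H = -sum p_i log p_i

# Example: two bars of equal length give maximal entropy log(2)
print(persistent_entropy([(0.0, 1.0), (0.5, 1.5)]))  # ~0.693
```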

On Multi-Layer Basis Pursuit, Efficient Algorithms and Convolutional Neural Networks

Title On Multi-Layer Basis Pursuit, Efficient Algorithms and Convolutional Neural Networks
Authors Jeremias Sulam, Aviad Aberdam, Amir Beck, Michael Elad
Abstract Parsimonious representations are ubiquitous in modeling and processing information. Motivated by the recent Multi-Layer Convolutional Sparse Coding (ML-CSC) model, we herein generalize the traditional Basis Pursuit problem to a multi-layer setting, introducing similar sparsity-enforcing penalties at different representation layers in a symbiotic relation between synthesis and analysis sparse priors. We explore different iterative methods to solve this new problem in practice, and we propose a new Multi-Layer Iterative Soft Thresholding Algorithm (ML-ISTA), as well as a fast version (ML-FISTA). We show that these nested first-order algorithms converge, in the sense that the function value of near-fixed points can get arbitrarily close to the solution of the original problem. We further show how these algorithms effectively implement particular recurrent convolutional neural networks (CNNs) that generalize feed-forward ones without introducing any parameters. We present and analyze different architectures resulting from unfolding the iterations of the proposed pursuit algorithms, including a new Learned ML-ISTA, providing a principled way to construct deep recurrent CNNs. Unlike other similar constructions, these architectures unfold a global pursuit holistically for the entire network. We demonstrate the emerging constructions in a supervised learning setting, consistently improving the performance of classical CNNs while keeping the number of parameters constant.
Tasks
Published 2018-06-02
URL http://arxiv.org/abs/1806.00701v5
PDF http://arxiv.org/pdf/1806.00701v5.pdf
PWC https://paperswithcode.com/paper/on-multi-layer-basis-pursuit-efficient
Repo https://github.com/jsulam/ml-ista
Framework pytorch
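
ML-ISTA builds on the classical ISTA update, whose key ingredient is the soft-thresholding operator. The sketch below implements plain single-layer ISTA for basis pursuit denoising as a reference for that building block; the step size, iteration count, and problem sizes are illustrative, and the paper's multi-layer nesting is not reproduced here.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def ista(D, y, lam=0.1, n_iter=200):
    """Single-layer ISTA for min_x 0.5*||y - D x||^2 + lam*||x||_1."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ x - y)
        x = soft_threshold(x - grad / L, lam / L)
    return x
```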

The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale

Title The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale
Authors Alina Kuznetsova, Hassan Rom, Neil Alldrin, Jasper Uijlings, Ivan Krasin, Jordi Pont-Tuset, Shahab Kamali, Stefan Popov, Matteo Malloci, Alexander Kolesnikov, Tom Duerig, Vittorio Ferrari
Abstract We present Open Images V4, a dataset of 9.2M images with unified annotations for image classification, object detection and visual relationship detection. The images have a Creative Commons Attribution license that allows sharing and adapting the material, and they have been collected from Flickr without a predefined list of class names or tags, leading to natural class statistics and avoiding an initial design bias. Open Images V4 offers large scale across several dimensions: 30.1M image-level labels for 19.8k concepts, 15.4M bounding boxes for 600 object classes, and 375k visual relationship annotations involving 57 classes. For object detection in particular, we provide 15x more bounding boxes than the next largest datasets (15.4M boxes on 1.9M images). The images often show complex scenes with several objects (8 annotated objects per image on average). We annotated visual relationships between them, which support visual relationship detection, an emerging task that requires structured reasoning. We provide in-depth comprehensive statistics about the dataset, we validate the quality of the annotations, we study how the performance of several modern models evolves with increasing amounts of training data, and we demonstrate two applications made possible by having unified annotations of multiple types coexisting in the same images. We hope that the scale, quality, and variety of Open Images V4 will foster further research and innovation even beyond the areas of image classification, object detection, and visual relationship detection.
Tasks Image Classification, Object Detection
Published 2018-11-02
URL https://arxiv.org/abs/1811.00982v2
PDF https://arxiv.org/pdf/1811.00982v2.pdf
PWC https://paperswithcode.com/paper/the-open-images-dataset-v4-unified-image
Repo https://github.com/ccc013/DeepLearning_Notes
Framework tf

IntroVAE: Introspective Variational Autoencoders for Photographic Image Synthesis

Title IntroVAE: Introspective Variational Autoencoders for Photographic Image Synthesis
Authors Huaibo Huang, Zhihang Li, Ran He, Zhenan Sun, Tieniu Tan
Abstract We present a novel introspective variational autoencoder (IntroVAE) model for synthesizing high-resolution photographic images. IntroVAE is capable of self-evaluating the quality of its generated samples and improving itself accordingly. Its inference and generator models are jointly trained in an introspective way. On one hand, the generator is required to reconstruct the input images from the noisy outputs of the inference model, as in normal VAEs. On the other hand, the inference model is encouraged to classify between the generated and real samples while the generator tries to fool it, as in GANs. These two famous generative frameworks are integrated in a simple yet efficient single-stream architecture that can be trained in a single stage. IntroVAE preserves the advantages of VAEs, such as stable training and a nice latent manifold. Unlike most other hybrid models of VAEs and GANs, IntroVAE requires no extra discriminators, because the inference model itself serves as a discriminator to distinguish between the generated and real samples. Experiments demonstrate that our method produces high-resolution photo-realistic images (e.g., CELEBA images at 1024^2 resolution), which are comparable to or better than the state-of-the-art GANs.
Tasks Image Generation
Published 2018-07-17
URL http://arxiv.org/abs/1807.06358v2
PDF http://arxiv.org/pdf/1807.06358v2.pdf
PWC https://paperswithcode.com/paper/introvae-introspective-variational
Repo https://github.com/bbeatrix/introvae
Framework tf
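
The introspective training uses the KL term itself as the adversarial signal: the inference model pushes the KL of reconstructed or generated samples above a margin, while the generator pulls it back down. Below is a rough, simplified sketch of those two objectives in PyTorch; the margin, weights, and exact combination of terms are illustrative assumptions, not the paper's precise formulation.

```python
import torch
import torch.nn.functional as F

def kl_gauss(mu, logvar):
    """KL(q(z|x) || N(0, I)) per batch element, summed over latent dims."""
    return 0.5 * torch.sum(logvar.exp() + mu.pow(2) - 1.0 - logvar, dim=1)

def encoder_loss(mu_real, lv_real, mu_rec, lv_rec, x, x_rec, m=10.0, alpha=0.25, beta=1.0):
    # Regularize real samples; push reconstructed samples above the margin m
    adv = F.relu(m - kl_gauss(mu_rec, lv_rec)).mean()
    return kl_gauss(mu_real, lv_real).mean() + alpha * adv + beta * F.mse_loss(x_rec, x)

def generator_loss(mu_rec, lv_rec, x, x_rec, alpha=0.25, beta=1.0):
    # The generator tries to make its samples look "real" to the inference model
    return alpha * kl_gauss(mu_rec, lv_rec).mean() + beta * F.mse_loss(x_rec, x)
```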

Cascaded Mutual Modulation for Visual Reasoning

Title Cascaded Mutual Modulation for Visual Reasoning
Authors Yiqun Yao, Jiaming Xu, Feng Wang, Bo Xu
Abstract Visual reasoning is a special visual question answering problem that is multi-step and compositional by nature, and also requires intensive text-vision interactions. We propose CMM: Cascaded Mutual Modulation, a novel end-to-end visual reasoning model. CMM includes a multi-step comprehension process for both question and image. In each step, we use a Feature-wise Linear Modulation (FiLM) technique to enable the textual and visual pipelines to mutually control each other. Experiments show that CMM significantly outperforms most related models, and reaches state-of-the-art results on two visual reasoning benchmarks, CLEVR and NLVR, covering both synthetic and natural language. Ablation studies confirm that both our multi-step framework and our visually-guided language modulation are critical to the task. Our code is available at https://github.com/FlamingHorizon/CMM-VR.
Tasks Question Answering, Visual Question Answering, Visual Reasoning
Published 2018-09-06
URL http://arxiv.org/abs/1809.01943v1
PDF http://arxiv.org/pdf/1809.01943v1.pdf
PWC https://paperswithcode.com/paper/cascaded-mutual-modulation-for-visual
Repo https://github.com/FlamingHorizon/CMM-VR
Framework pytorch
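
The mutual control in CMM is built on FiLM layers: a conditioning vector predicts a per-channel scale and shift applied to the other modality's features. A minimal PyTorch sketch of a FiLM layer, with illustrative dimensions:

```python
import torch
import torch.nn as nn

class FiLM(nn.Module):
    """Feature-wise Linear Modulation: y = gamma(c) * x + beta(c)."""
    def __init__(self, cond_dim, num_channels):
        super().__init__()
        self.to_gamma_beta = nn.Linear(cond_dim, 2 * num_channels)

    def forward(self, x, cond):
        # x: (B, C, H, W) visual features, cond: (B, cond_dim) textual state
        gamma, beta = self.to_gamma_beta(cond).chunk(2, dim=1)
        return gamma[:, :, None, None] * x + beta[:, :, None, None]

film = FiLM(cond_dim=512, num_channels=128)
out = film(torch.randn(2, 128, 14, 14), torch.randn(2, 512))
```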

sCAKE: Semantic Connectivity Aware Keyword Extraction

Title sCAKE: Semantic Connectivity Aware Keyword Extraction
Authors Swagata Duari, Vasudha Bhatnagar
Abstract Keyword extraction is an important task in several text analysis endeavors. In this paper, we present a critical discussion of the issues and challenges in graph-based keyword extraction methods, along with a comprehensive empirical analysis. We propose a parameterless method for constructing a graph of text that captures the contextual relation between words. A novel word scoring method is also proposed based on the connection between concepts. We demonstrate that both proposals are individually superior to those followed by the state-of-the-art graph-based keyword extraction algorithms. Combining the proposed graph construction and scoring methods leads to a novel, parameterless keyword extraction method (sCAKE) based on the semantic connectivity of words in the document. Motivated by the limited availability of NLP tools for several languages, we also design and present a language-agnostic keyword extraction (LAKE) method. We eliminate the need for NLP tools by using a statistical filter to identify candidate keywords before constructing the graph. We show that the resulting method is a competent solution for extracting keywords from documents of languages lacking sophisticated NLP support.
Tasks graph construction, Keyword Extraction
Published 2018-11-27
URL http://arxiv.org/abs/1811.10831v1
PDF http://arxiv.org/pdf/1811.10831v1.pdf
PWC https://paperswithcode.com/paper/181110831
Repo https://github.com/SDuari/sCAKE-and-LAKE
Framework none
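
As a generic illustration of the graph-based pipeline discussed above (not the sCAKE scoring itself), the sketch below builds a word co-occurrence graph within a sliding window and ranks words with PageRank; the window size, crude tokenization, and use of networkx are assumptions.

```python
import re
import networkx as nx

def keyword_candidates(text, window=3, top_k=5):
    """Toy graph-based keyword extraction: co-occurrence graph + PageRank."""
    words = re.findall(r"[a-zA-Z]{3,}", text.lower())   # crude candidate filter
    g = nx.Graph()
    for i, w in enumerate(words):
        for u in words[i + 1 : i + window]:              # link words co-occurring in a window
            if u != w:
                g.add_edge(w, u)
    scores = nx.pagerank(g)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

print(keyword_candidates("graph based keyword extraction builds a graph of words "
                         "and scores words by their connectivity in the graph"))
```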

Variational Autoencoder with Arbitrary Conditioning

Title Variational Autoencoder with Arbitrary Conditioning
Authors Oleg Ivanov, Michael Figurnov, Dmitry Vetrov
Abstract We propose a single neural probabilistic model based on variational autoencoder that can be conditioned on an arbitrary subset of observed features and then sample the remaining features in “one shot”. The features may be both real-valued and categorical. Training of the model is performed by stochastic variational Bayes. The experimental evaluation on synthetic data, as well as feature imputation and image inpainting problems, shows the effectiveness of the proposed approach and diversity of the generated samples.
Tasks Image Inpainting, Imputation
Published 2018-06-06
URL https://arxiv.org/abs/1806.02382v3
PDF https://arxiv.org/pdf/1806.02382v3.pdf
PWC https://paperswithcode.com/paper/variational-autoencoder-with-arbitrary
Repo https://github.com/tigvarts/vaeac
Framework pytorch
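
The arbitrary conditioning is realized by feeding the network only the observed features together with a binary mask indicating which features are unobserved. A tiny sketch of building such an input; the concatenation format is an assumption for illustration.

```python
import numpy as np

def conditioning_input(x, observed_mask):
    """Zero out unobserved features and append the 'unobserved' mask."""
    observed_mask = observed_mask.astype(np.float32)
    return np.concatenate([x * observed_mask, 1.0 - observed_mask], axis=-1)

x = np.array([[0.3, 1.2, -0.7]], dtype=np.float32)
mask = np.array([[1, 0, 1]])           # the second feature is to be imputed
print(conditioning_input(x, mask))     # [[ 0.3  0.  -0.7  0.   1.   0. ]]
```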

Bayesian Inference with Anchored Ensembles of Neural Networks, and Application to Exploration in Reinforcement Learning

Title Bayesian Inference with Anchored Ensembles of Neural Networks, and Application to Exploration in Reinforcement Learning
Authors Tim Pearce, Nicolas Anastassacos, Mohamed Zaki, Andy Neely
Abstract The use of ensembles of neural networks (NNs) for the quantification of predictive uncertainty is widespread. However, the current justification is intuitive rather than analytical. This work proposes one minor modification to the normal ensembling methodology, which we prove allows the ensemble to perform Bayesian inference, hence converging to the corresponding Gaussian Process as both the total number of NNs and the size of each tend to infinity. This working paper provides early-stage results in a reinforcement learning setting, analysing the practicality of the technique for ensembles of a small, finite number of NNs. Using the uncertainty estimates produced by anchored ensembles to govern the exploration-exploitation process results in steadier, more stable learning.
Tasks Bayesian Inference
Published 2018-05-29
URL http://arxiv.org/abs/1805.11324v3
PDF http://arxiv.org/pdf/1805.11324v3.pdf
PWC https://paperswithcode.com/paper/bayesian-inference-with-anchored-ensembles-of
Repo https://github.com/TeaPearce/Anchored_Ens_RL_Explore
Framework tf
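
The "one minor modification" is anchoring: instead of regularizing each network's weights toward zero, each ensemble member is regularized toward its own fixed draw from the prior. A minimal PyTorch sketch of the anchored regularizer; the prior scale and loss weighting are illustrative assumptions.

```python
import torch
import torch.nn as nn

def make_anchors(model, prior_std=1.0):
    """Store a fixed prior draw (the anchor) for every parameter of one member."""
    return [(p, (torch.randn_like(p) * prior_std).detach()) for p in model.parameters()]

def anchored_loss(pred, target, anchors, lam=1e-3):
    """Data fit plus a penalty pulling weights toward the anchor instead of zero."""
    data = nn.functional.mse_loss(pred, target)
    reg = sum(((p - a) ** 2).sum() for p, a in anchors)
    return data + lam * reg

# Each of the N ensemble members gets its own anchors via make_anchors(model_i)
```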

Explanations of model predictions with live and breakDown packages

Title Explanations of model predictions with live and breakDown packages
Authors Mateusz Staniak, Przemyslaw Biecek
Abstract Complex models are commonly used in predictive modeling. In this paper we present R packages that can be used to explain predictions from complex black box models and attribute parts of these predictions to input features. We introduce two new approaches and corresponding packages for such attribution, namely live and breakDown. We also compare their results with existing implementations of state of the art solutions, namely lime that implements Locally Interpretable Model-agnostic Explanations and ShapleyR that implements Shapley values.
Tasks
Published 2018-04-05
URL http://arxiv.org/abs/1804.01955v2
PDF http://arxiv.org/pdf/1804.01955v2.pdf
PWC https://paperswithcode.com/paper/explanations-of-model-predictions-with-live
Repo https://github.com/cran/live
Framework none
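
The breakDown idea can be illustrated (outside R) as sequential attribution: fix the explained instance's features one at a time in a background dataset and record how the average prediction moves. The Python sketch below is a generic illustration of that idea, not the live/breakDown package API; the feature ordering and model interface are assumptions.

```python
import numpy as np

def breakdown_attribution(predict, X_background, x_explained, feature_order):
    """Sequentially fix features to the explained values; the change in the
    mean prediction after each step is that feature's contribution."""
    X = X_background.copy().astype(np.float64)
    baseline = predict(X).mean()
    contributions, prev = {}, baseline
    for j in feature_order:
        X[:, j] = x_explained[j]          # condition on this feature's value
        current = predict(X).mean()
        contributions[j] = current - prev
        prev = current
    return baseline, contributions        # baseline + sum(contributions) == f(x_explained)
```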

Evolving Event-driven Programs with SignalGP

Title Evolving Event-driven Programs with SignalGP
Authors Alexander Lalejini, Charles Ofria
Abstract We present SignalGP, a new genetic programming (GP) technique designed to incorporate the event-driven programming paradigm into computational evolution’s toolbox. Event-driven programming is a software design philosophy that simplifies the development of reactive programs by automatically triggering program modules (event-handlers) in response to external events, such as signals from the environment or messages from other programs. SignalGP incorporates these concepts by extending existing tag-based referencing techniques into an event-driven context. Both events and functions are labeled with evolvable tags; when an event occurs, the function with the closest matching tag is triggered. In this work, we apply SignalGP in the context of linear GP. We demonstrate the value of the event-driven paradigm using two distinct test problems (an environment coordination problem and a distributed leader election problem) by comparing SignalGP to variants that are otherwise identical, but must actively use sensors to process events or messages. In each of these problems, rapid interaction with the environment or other agents is critical for maximizing fitness. We also discuss ways in which SignalGP can be generalized beyond our linear GP implementation.
Tasks
Published 2018-04-15
URL http://arxiv.org/abs/1804.05445v1
PDF http://arxiv.org/pdf/1804.05445v1.pdf
PWC https://paperswithcode.com/paper/evolving-event-driven-programs-with-signalgp
Repo https://github.com/amlalejini/GECCO-2018-Evolving-Event-driven-Programs-with-SignalGP
Framework none
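
The tag-based dispatch described above can be sketched as follows: both events and program modules carry bit-string tags, and an event triggers the module whose tag is most similar (here, by fraction of matching bits). The representation and similarity measure are illustrative assumptions in the spirit of the paper, not its exact implementation.

```python
import numpy as np

def similarity(tag_a, tag_b):
    """Fraction of matching bits between two tags (higher is closer)."""
    return float(np.mean(tag_a == tag_b))

def dispatch(event_tag, modules):
    """Trigger the module whose evolvable tag best matches the event's tag."""
    return max(modules, key=lambda m: similarity(event_tag, m["tag"]))

rng = np.random.default_rng(0)
modules = [{"name": f"handler_{i}", "tag": rng.integers(0, 2, 16)} for i in range(4)]
event_tag = rng.integers(0, 2, 16)
print(dispatch(event_tag, modules)["name"])
```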

LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts

Title LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts
Authors Shuming Ma, Lei Cui, Damai Dai, Furu Wei, Xu Sun
Abstract We introduce the task of automatic live commenting. Live commenting, also called 'video barrage', is an emerging feature on online video sites that allows real-time comments from viewers to fly across the screen like bullets or roll at the right side of the screen. The live comments are a mixture of opinions about the video and chit-chat with other commenters. Automatic live commenting requires AI agents to comprehend the videos and interact with human viewers who also make comments, so it is a good testbed of an AI agent's ability to deal with both dynamic vision and language. In this work, we construct a large-scale live comment dataset with 2,361 videos and 895,929 live comments. Then, we introduce two neural models to generate live comments based on the visual and textual contexts, which achieve better performance than previous neural baselines such as the sequence-to-sequence model. Finally, we provide a retrieval-based evaluation protocol for automatic live commenting, where the model is asked to sort a set of candidate comments based on the log-likelihood score, and is evaluated on metrics such as mean reciprocal rank. Putting it all together, we demonstrate the first 'LiveBot'.
Tasks
Published 2018-09-13
URL http://arxiv.org/abs/1809.04938v2
PDF http://arxiv.org/pdf/1809.04938v2.pdf
PWC https://paperswithcode.com/paper/livebot-generating-live-video-comments-based
Repo https://github.com/lancopku/livebot
Framework pytorch
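
The retrieval-based evaluation ranks a candidate set by the model's log-likelihood and scores how highly the human comment is placed, e.g., with mean reciprocal rank. A small sketch of that metric; the candidate scores below are illustrative.

```python
def mean_reciprocal_rank(batches):
    """batches: list of (scores, gold_index), where scores[i] is the model's
    log-likelihood of candidate i; higher scores rank first."""
    total = 0.0
    for scores, gold in batches:
        ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        total += 1.0 / (ranking.index(gold) + 1)
    return total / len(batches)

# Gold comment ranked 2nd then 1st -> MRR = (1/2 + 1) / 2 = 0.75
print(mean_reciprocal_rank([([-2.1, -0.5, -3.0], 0), ([-0.2, -1.4], 0)]))
```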

STFT spectral loss for training a neural speech waveform model

Title STFT spectral loss for training a neural speech waveform model
Authors Shinji Takaki, Toru Nakashika, Xin Wang, Junichi Yamagishi
Abstract This paper proposes a new loss using short-time Fourier transform (STFT) spectra with the aim of training a high-performance neural speech waveform model that predicts raw continuous speech waveform samples directly. Not only amplitude spectra but also phase spectra obtained from generated speech waveforms are used to calculate the proposed loss. We also show mathematically that training the waveform model on the basis of the proposed loss can be interpreted as maximum likelihood training that assumes the amplitude and phase spectra of generated speech waveforms follow Gaussian and von Mises distributions, respectively. Furthermore, this paper presents a simple network architecture as the speech waveform model, which is composed of uni-directional long short-term memories (LSTMs) and an auto-regressive structure. Experimental results showed that the proposed neural model synthesized high-quality speech waveforms.
Tasks
Published 2018-10-29
URL http://arxiv.org/abs/1810.11945v2
PDF http://arxiv.org/pdf/1810.11945v2.pdf
PWC https://paperswithcode.com/paper/stft-spectral-loss-for-training-a-neural
Repo https://github.com/nii-yamagishilab/TSNetVocoder
Framework none
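
The proposed loss compares short-time Fourier spectra of the generated and reference waveforms in both amplitude and phase. The sketch below computes one plausible such loss with SciPy (log-amplitude L2 plus a cosine phase distance); the window settings and exact distances are illustrative assumptions, not the paper's precise Gaussian/von Mises formulation.

```python
import numpy as np
from scipy.signal import stft

def stft_spectral_loss(pred, target, fs=16000, nperseg=512):
    """Amplitude + phase spectral distance between two waveforms."""
    _, _, P = stft(pred, fs=fs, nperseg=nperseg)
    _, _, T = stft(target, fs=fs, nperseg=nperseg)
    eps = 1e-7
    amp_loss = np.mean((np.log(np.abs(P) + eps) - np.log(np.abs(T) + eps)) ** 2)
    phase_loss = np.mean(1.0 - np.cos(np.angle(P) - np.angle(T)))  # insensitive to 2*pi wraps
    return amp_loss + phase_loss
```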