Paper Group ANR 338
A Siamese Deep Forest
Title | A Siamese Deep Forest |
Authors | Lev V. Utkin, Mikhail A. Ryabinin |
Abstract | A Siamese Deep Forest (SDF) is proposed in the paper. It is based on the Deep Forest or gcForest proposed by Zhou and Feng and can be viewed as a gcForest modification. It can also be regarded as an alternative to the well-known Siamese neural networks. The SDF uses a modified training set consisting of concatenated pairs of vectors. Moreover, it defines the class distributions in the deep forest as the weighted sum of the tree class probabilities such that the weights are determined in order to reduce distances between similar pairs and to increase them between dissimilar pairs. We show that the weights can be obtained by solving a quadratic optimization problem. The SDF aims to prevent overfitting which takes place in neural networks when only limited training data are available. The numerical experiments illustrate the proposed distance metric method. |
Tasks | |
Published | 2017-04-27 |
URL | http://arxiv.org/abs/1704.08715v1 |
http://arxiv.org/pdf/1704.08715v1.pdf | |
PWC | https://paperswithcode.com/paper/a-siamese-deep-forest |
Repo | |
Framework | |
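The two ingredients named in the abstract above, a training set of concatenated pairs and a class distribution formed as a weighted sum of tree class probabilities, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the 0/1 labelling of similar and dissimilar pairs, and the fixed example weights are assumptions. In the paper the weights come from a quadratic optimization problem, which is not reproduced here.

```python
from itertools import combinations


def make_pairs(vectors, labels):
    """Concatenate every pair of vectors into one training example.

    Assumed convention (hypothetical): pair label 0 means the two vectors
    share a class (similar), 1 means they do not (dissimilar).
    """
    pairs = []
    for i, j in combinations(range(len(vectors)), 2):
        concatenated = list(vectors[i]) + list(vectors[j])
        pair_label = 0 if labels[i] == labels[j] else 1
        pairs.append((concatenated, pair_label))
    return pairs


def weighted_class_distribution(tree_probs, weights):
    """Combine per-tree class probability vectors as a weighted sum.

    The weights are assumed to be non-negative and to sum to 1; here they
    are supplied by the caller rather than optimized.
    """
    n_classes = len(tree_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, tree_probs))
        for c in range(n_classes)
    ]


# Three toy vectors: the first two belong to class "a", the third to "b".
vectors = [[0.0, 1.0], [0.1, 0.9], [5.0, 5.0]]
labels = ["a", "a", "b"]
pairs = make_pairs(vectors, labels)

# Two trees voting on (similar, dissimilar), combined with equal weights.
distribution = weighted_class_distribution([[0.8, 0.2], [0.4, 0.6]], [0.5, 0.5])
```

With equal weights the combination reduces to the plain average used in an ordinary forest; unequal weights are what would let the trees be re-balanced toward the pairwise objective.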
A Large-scale Dataset and Benchmark for Similar Trademark Retrieval
Title | A Large-scale Dataset and Benchmark for Similar Trademark Retrieval |
Authors | Osman Tursun, Cemal Aker, Sinan Kalkan |
Abstract | Trademark retrieval (TR) has become an important yet challenging problem due to an ever-increasing trend in trademark applications and infringement incidents. There have been many promising attempts at the TR problem, which, however, proved impracticable since they were evaluated with limited and mostly trivial datasets. In this paper, we provide a large-scale dataset with benchmark queries with which different TR approaches can be evaluated systematically. Moreover, we provide a baseline on this benchmark using the widely-used methods applied to TR in the literature. Furthermore, we identify and correct two important issues in TR approaches that were not addressed before: reversal of contrast and the presence of irrelevant text in trademarks, both of which severely affect TR methods. Lastly, we applied deep learning, namely, several popular Convolutional Neural Network models, to the TR problem. To the best of the authors' knowledge, this is the first attempt to do so. |
Tasks | Trademark Retrieval |
Published | 2017-01-20 |
URL | http://arxiv.org/abs/1701.05766v2 |
http://arxiv.org/pdf/1701.05766v2.pdf | |
PWC | https://paperswithcode.com/paper/a-large-scale-dataset-and-benchmark-for |
Repo | |
Framework | |
Generalizing the Convolution Operator in Convolutional Neural Networks
Title | Generalizing the Convolution Operator in Convolutional Neural Networks |
Authors | Kamaledin Ghiasi-Shirazi |
Abstract | Convolutional neural networks have become a main tool for solving many machine vision and machine learning problems. A major element of these networks is the convolution operator which essentially computes the inner product between a weight vector and the vectorized image patches extracted by sliding a window in the image planes of the previous layer. In this paper, we propose two classes of surrogate functions for the inner product operation inherent in the convolution operator and so attain two generalizations of the convolution operator. The first one is the class of positive definite kernel functions where their application is justified by the kernel trick. The second one is the class of similarity measures defined based on a distance function. We justify this by tracing back to the basic idea behind the neocognitron which is the ancestor of CNNs. Both methods are then further generalized by allowing a monotonically increasing function to be applied subsequently. Like any trainable parameter in a neural network, the template pattern and the parameters of the kernel/distance function are trained with the back-propagation algorithm. As an aside, we use the proposed framework to justify the use of sine activation function in CNNs. Our experiments on the MNIST dataset show that the performance of ordinary CNNs can be achieved by generalized CNNs based on weighted L1/L2 distances, proving the applicability of the proposed generalization of the convolutional neural networks. |
Tasks | |
Published | 2017-07-14 |
URL | http://arxiv.org/abs/1707.09864v1 |
http://arxiv.org/pdf/1707.09864v1.pdf | |
PWC | https://paperswithcode.com/paper/generalizing-the-convolution-operator-in |
Repo | |
Framework | |
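The idea in the abstract above, swapping the inner product inside a convolution for another response function such as a distance-based similarity, can be illustrated on a 1D signal. This is a sketch under stated assumptions: the function names are hypothetical, and using the negated weighted L2 distance as the similarity is one simple choice made here, not necessarily the exact form used in the paper.

```python
import math


def sliding_patches(signal, width):
    """Return every contiguous window of the given width."""
    return [signal[i:i + width] for i in range(len(signal) - width + 1)]


def inner_product_response(patch, template):
    """Ordinary convolution response: inner product of patch and template."""
    return sum(p * t for p, t in zip(patch, template))


def l2_similarity_response(patch, template, feature_weights):
    """Assumed surrogate: negated weighted L2 distance (larger means closer)."""
    squared = sum(
        w * (p - t) ** 2 for p, t, w in zip(patch, template, feature_weights)
    )
    return -math.sqrt(squared)


def generalized_conv1d(signal, template, response):
    """Slide the template over the signal and apply the chosen response."""
    return [
        response(patch, template)
        for patch in sliding_patches(signal, len(template))
    ]


signal = [0.0, 1.0, 2.0, 1.0, 0.0]
template = [1.0, 2.0, 1.0]

ordinary = generalized_conv1d(signal, template, inner_product_response)
distance_based = generalized_conv1d(
    signal,
    template,
    lambda patch, tmpl: l2_similarity_response(patch, tmpl, [1.0, 1.0, 1.0]),
)
```

Both responses peak at the window that matches the template, but the distance-based one is bounded above by zero and does not grow with the magnitude of the patch. In a trainable layer the template and the per-feature weights would be learned by back-propagation, as the abstract describes.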
FearNet: Brain-Inspired Model for Incremental Learning
Title | FearNet: Brain-Inspired Model for Incremental Learning |
Authors | Ronald Kemker, Christopher Kanan |
Abstract | Incremental class learning involves sequentially learning classes in bursts of examples from the same class. This violates the assumptions that underlie methods for training standard deep neural networks, and will cause them to suffer from catastrophic forgetting. Arguably, the best method for incremental class learning is iCaRL, but it requires storing training examples for each class, making it challenging to scale. Here, we propose FearNet for incremental class learning. FearNet is a generative model that does not store previous examples, making it memory efficient. FearNet uses a brain-inspired dual-memory system in which new memories are consolidated from a network for recent memories inspired by the mammalian hippocampal complex to a network for long-term storage inspired by medial prefrontal cortex. Memory consolidation is inspired by mechanisms that occur during sleep. FearNet also uses a module inspired by the basolateral amygdala for determining which memory system to use for recall. FearNet achieves state-of-the-art performance at incremental class learning on image (CIFAR-100, CUB-200) and audio classification (AudioSet) benchmarks. |
Tasks | Audio Classification |
Published | 2017-11-28 |
URL | http://arxiv.org/abs/1711.10563v2 |
http://arxiv.org/pdf/1711.10563v2.pdf | |
PWC | https://paperswithcode.com/paper/fearnet-brain-inspired-model-for-incremental |
Repo | |
Framework | |
Topic Independent Identification of Agreement and Disagreement in Social Media Dialogue
Title | Topic Independent Identification of Agreement and Disagreement in Social Media Dialogue |
Authors | Amita Misra, Marilyn Walker |
Abstract | Research on the structure of dialogue has been hampered for years because large dialogue corpora have not been available. This has impacted the dialogue research community’s ability to develop better theories, as well as good off-the-shelf tools for dialogue processing. Happily, an increasing amount of information and opinion exchange occurs in natural dialogue in online forums, where people share their opinions about a vast range of topics. In particular we are interested in rejection in dialogue, also called disagreement and denial, where the size of available dialogue corpora, for the first time, offers an opportunity to empirically test theoretical accounts of the expression and inference of rejection in dialogue. In this paper, we test whether topic-independent features motivated by theoretical predictions can be used to recognize rejection in online forums in a topic-independent way. Our results show that our theoretically motivated features achieve 66% accuracy, an improvement over a unigram baseline of an absolute 6%. |
Tasks | |
Published | 2017-09-03 |
URL | http://arxiv.org/abs/1709.00661v1 |
http://arxiv.org/pdf/1709.00661v1.pdf | |
PWC | https://paperswithcode.com/paper/topic-independent-identification-of-agreement |
Repo | |
Framework | |
Transfer learning for multi-center classification of chronic obstructive pulmonary disease
Title | Transfer learning for multi-center classification of chronic obstructive pulmonary disease |
Authors | Veronika Cheplygina, Isabel Pino Peña, Jesper Holst Pedersen, David A. Lynch, Lauge Sørensen, Marleen de Bruijne |
Abstract | Chronic obstructive pulmonary disease (COPD) is a lung disease which can be quantified using chest computed tomography (CT) scans. Recent studies have shown that COPD can be automatically diagnosed using weakly supervised learning of intensity and texture distributions. However, until now such classifiers have only been evaluated on scans from a single domain, and it is unclear whether they would generalize across domains, such as different scanners or scanning protocols. To address this problem, we investigate classification of COPD in a multi-center dataset with a total of 803 scans from three different centers, four different scanners, with heterogeneous subject distributions. Our method is based on Gaussian texture features, and a weighted logistic classifier, which increases the weights of samples similar to the test data. We show that Gaussian texture features outperform intensity features previously used in multi-center classification tasks. We also show that a weighting strategy based on a classifier that is trained to discriminate between scans from different domains can further improve the results. To encourage further research into transfer learning methods for classification of COPD, upon acceptance of the paper we will release two feature datasets used in this study on http://bigr.nl/research/projects/copd |
Tasks | Computed Tomography (CT), Transfer Learning |
Published | 2017-01-18 |
URL | http://arxiv.org/abs/1701.05013v2 |
http://arxiv.org/pdf/1701.05013v2.pdf | |
PWC | https://paperswithcode.com/paper/transfer-learning-for-multi-center |
Repo | |
Framework | |
Geometry Processing of Conventionally Produced Mouse Brain Slice Images
Title | Geometry Processing of Conventionally Produced Mouse Brain Slice Images |
Authors | Nitin Agarwal, Xiangmin Xu, Gopi Meenakshisundaram |
Abstract | Brain mapping research in most neuroanatomical laboratories relies on conventional processing techniques, which often introduce histological artifacts such as tissue tears and tissue loss. In this paper we present techniques and algorithms for automatic registration and 3D reconstruction of conventionally produced mouse brain slices in a standardized atlas space. This is achieved first by constructing a virtual 3D mouse brain model from annotated slices of the Allen Reference Atlas (ARA). Virtual re-slicing of the reconstructed model generates ARA-based slice images corresponding to the microscopic images of histological brain sections. These image pairs are aligned using a geometric approach through contour images. Histological artifacts in the microscopic images are detected and removed using Constrained Delaunay Triangulation before performing global alignment. Finally, non-linear registration is performed by solving Laplace’s equation with Dirichlet boundary conditions. Our methods provide significant improvements over previously reported registration techniques for the tested slices in 3D space, especially on slices with significant histological artifacts. Further, as an application we count the number of neurons in various anatomical regions using a dataset of 51 microscopic slices from a single mouse brain. This work represents a significant contribution to this subfield of neuroscience as it provides tools to neuroanatomists for analyzing and processing histological data. |
Tasks | 3D Reconstruction |
Published | 2017-12-27 |
URL | http://arxiv.org/abs/1712.09684v1 |
http://arxiv.org/pdf/1712.09684v1.pdf | |
PWC | https://paperswithcode.com/paper/geometry-processing-of-conventionally |
Repo | |
Framework | |
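The abstract above mentions solving Laplace's equation with Dirichlet boundary conditions as the final registration step. The numerical core of such a solve can be sketched with Jacobi relaxation on a small rectangular grid. This shows only the generic technique under simplifying assumptions: the function name, the rectangular domain, and the iteration count are illustrative choices, and the paper's actual domain (brain slice contours) and solver are not described in the abstract.

```python
def solve_laplace_dirichlet(grid, iterations=500):
    """Jacobi relaxation for Laplace's equation on a rectangular grid.

    The outermost ring of cells holds the fixed Dirichlet boundary values.
    Each interior cell is repeatedly replaced by the mean of its four
    neighbours, which is the standard 5-point discretisation.
    """
    rows, cols = len(grid), len(grid[0])
    current = [row[:] for row in grid]
    for _ in range(iterations):
        updated = [row[:] for row in current]
        for r in range(1, rows - 1):
            for c in range(1, cols - 1):
                updated[r][c] = 0.25 * (
                    current[r - 1][c]
                    + current[r + 1][c]
                    + current[r][c - 1]
                    + current[r][c + 1]
                )
        current = updated
    return current


# 5x5 grid: top edge held at 1.0, the other three edges held at 0.0.
grid = [[0.0] * 5 for _ in range(5)]
grid[0] = [1.0] * 5
solution = solve_laplace_dirichlet(grid)
center = solution[2][2]
```

On this symmetric example the centre value converges to 0.25, since the four rotated copies of the problem add up to a constant field of 1. In a registration setting the boundary values would instead encode known displacements on matched contours, and the harmonic interior gives a smooth interpolation between them.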
FERA 2017 - Addressing Head Pose in the Third Facial Expression Recognition and Analysis Challenge
Title | FERA 2017 - Addressing Head Pose in the Third Facial Expression Recognition and Analysis Challenge |
Authors | Michel F. Valstar, Enrique Sánchez-Lozano, Jeffrey F. Cohn, László A. Jeni, Jeffrey M. Girard, Zheng Zhang, Lijun Yin, Maja Pantic |
Abstract | The field of Automatic Facial Expression Analysis has grown rapidly in recent years. However, despite progress in new approaches as well as benchmarking efforts, most evaluations still focus on either posed expressions, near-frontal recordings, or both. This makes it hard to tell how existing expression recognition approaches perform under conditions where faces appear in a wide range of poses (or camera views), displaying ecologically valid expressions. The main obstacle for assessing this is the availability of suitable data, and the challenge proposed here addresses this limitation. The FG 2017 Facial Expression Recognition and Analysis challenge (FERA 2017) extends FERA 2015 to the estimation of Action Units occurrence and intensity under different camera views. In this paper we present the third challenge in automatic recognition of facial expressions, to be held in conjunction with the 12th IEEE conference on Face and Gesture Recognition, May 2017, in Washington, United States. Two sub-challenges are defined: the detection of AU occurrence, and the estimation of AU intensity. In this work we outline the evaluation protocol, the data used, and the results of a baseline method for both sub-challenges. |
Tasks | Facial Action Unit Detection, Facial Expression Recognition, Gesture Recognition |
Published | 2017-02-14 |
URL | http://arxiv.org/abs/1702.04174v1 |
http://arxiv.org/pdf/1702.04174v1.pdf | |
PWC | https://paperswithcode.com/paper/fera-2017-addressing-head-pose-in-the-third |
Repo | |
Framework | |
Learning from Synthetic Data: Addressing Domain Shift for Semantic Segmentation
Title | Learning from Synthetic Data: Addressing Domain Shift for Semantic Segmentation |
Authors | Swami Sankaranarayanan, Yogesh Balaji, Arpit Jain, Ser Nam Lim, Rama Chellappa |
Abstract | Visual Domain Adaptation is a problem of immense importance in computer vision. Previous approaches showcase the inability of even deep neural networks to learn informative representations across domain shift. This problem is more severe for tasks where acquiring hand labeled data is extremely hard and tedious. In this work, we focus on adapting the representations learned by segmentation networks across synthetic and real domains. Contrary to previous approaches that use a simple adversarial objective or superpixel information to aid the process, we propose an approach based on Generative Adversarial Networks (GANs) that brings the embeddings closer in the learned feature space. To showcase the generality and scalability of our approach, we show that we can achieve state of the art results on two challenging scenarios of synthetic to real domain adaptation. Additional exploratory experiments show that our approach: (1) generalizes to unseen domains and (2) results in improved alignment of source and target distributions. |
Tasks | Domain Adaptation, Semantic Segmentation |
Published | 2017-11-19 |
URL | http://arxiv.org/abs/1711.06969v2 |
http://arxiv.org/pdf/1711.06969v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-from-synthetic-data-addressing |
Repo | |
Framework | |
On Ensuring that Intelligent Machines Are Well-Behaved
Title | On Ensuring that Intelligent Machines Are Well-Behaved |
Authors | Philip S. Thomas, Bruno Castro da Silva, Andrew G. Barto, Emma Brunskill |
Abstract | Machine learning algorithms are everywhere, ranging from simple data analysis and pattern recognition tools used across the sciences to complex systems that achieve super-human performance on various tasks. Ensuring that they are well-behaved—that they do not, for example, cause harm to humans or act in a racist or sexist way—is therefore not a hypothetical problem to be dealt with in the future, but a pressing one that we address here. We propose a new framework for designing machine learning algorithms that simplifies the problem of specifying and regulating undesirable behaviors. To show the viability of this new framework, we use it to create new machine learning algorithms that preclude the sexist and harmful behaviors exhibited by standard machine learning algorithms in our experiments. Our framework for designing machine learning algorithms simplifies the safe and responsible application of machine learning. |
Tasks | |
Published | 2017-08-17 |
URL | http://arxiv.org/abs/1708.05448v1 |
http://arxiv.org/pdf/1708.05448v1.pdf | |
PWC | https://paperswithcode.com/paper/on-ensuring-that-intelligent-machines-are |
Repo | |
Framework | |
StackSeq2Seq: Dual Encoder Seq2Seq Recurrent Networks
Title | StackSeq2Seq: Dual Encoder Seq2Seq Recurrent Networks |
Authors | Alessandro Bay, Biswa Sengupta |
Abstract | A widely studied non-deterministic polynomial time (NP) hard problem lies in finding a route between two nodes of a graph. Often, meta-heuristic algorithms such as $A^{*}$ are employed on graphs with a large number of nodes. Here, we propose a deep recurrent neural network architecture based on the Sequence-2-Sequence (Seq2Seq) model, widely used, for instance, in text translation. In particular, we illustrate that utilising a context vector that has been learned from two different recurrent networks enables increased accuracies in learning the shortest route of a graph. Additionally, we show that one can boost the performance of the Seq2Seq network by smoothing the loss function using a homotopy continuation of the decoder’s loss function. |
Tasks | |
Published | 2017-10-11 |
URL | http://arxiv.org/abs/1710.04211v2 |
http://arxiv.org/pdf/1710.04211v2.pdf | |
PWC | https://paperswithcode.com/paper/stackseq2seq-dual-encoder-seq2seq-recurrent |
Repo | |
Framework | |
Highly Efficient Hierarchical Online Nonlinear Regression Using Second Order Methods
Title | Highly Efficient Hierarchical Online Nonlinear Regression Using Second Order Methods |
Authors | Burak C. Civek, Ibrahim Delibalta, Suleyman S. Kozat |
Abstract | We introduce highly efficient online nonlinear regression algorithms that are suitable for real-life applications. We process the data in a truly online manner such that no storage is needed, i.e., the data is discarded after being used. For nonlinear modeling we use a hierarchical piecewise linear approach based on the notion of decision trees where the space of the regressor vectors is adaptively partitioned based on the performance. For the first time in the literature, we learn both the piecewise linear partitioning of the regressor space as well as the linear models in each region using highly effective second order methods, i.e., Newton-Raphson methods. Hence, we avoid the well-known overfitting issues by using piecewise linear models; however, since both the region boundaries as well as the linear models in each region are trained using the second order methods, we achieve substantial performance gains compared to the state of the art. We demonstrate our gains over the well-known benchmark data sets and provide performance results in an individual sequence manner guaranteed to hold without any statistical assumptions. Hence, the introduced algorithms address computational complexity issues widely encountered in real-life applications while providing superior guaranteed performance in a strong deterministic sense. |
Tasks | |
Published | 2017-01-18 |
URL | http://arxiv.org/abs/1701.05053v1 |
http://arxiv.org/pdf/1701.05053v1.pdf | |
PWC | https://paperswithcode.com/paper/highly-efficient-hierarchical-online |
Repo | |
Framework | |
Using Synthetic Data to Train Neural Networks is Model-Based Reasoning
Title | Using Synthetic Data to Train Neural Networks is Model-Based Reasoning |
Authors | Tuan Anh Le, Atilim Gunes Baydin, Robert Zinkov, Frank Wood |
Abstract | We draw a formal connection between using synthetic training data to optimize neural network parameters and approximate, Bayesian, model-based reasoning. In particular, training a neural network using synthetic data can be viewed as learning a proposal distribution generator for approximate inference in the synthetic-data generative model. We demonstrate this connection in a recognition task where we develop a novel Captcha-breaking architecture and train it using synthetic data, demonstrating both state-of-the-art performance and a way of computing task-specific posterior uncertainty. Using a neural network trained this way, we also demonstrate successful breaking of real-world Captchas currently used by Facebook and Wikipedia. Reasoning from these empirical results and drawing connections with Bayesian modeling, we discuss the robustness of synthetic data results and suggest important considerations for ensuring good neural network generalization when training with synthetic data. |
Tasks | |
Published | 2017-03-02 |
URL | http://arxiv.org/abs/1703.00868v1 |
http://arxiv.org/pdf/1703.00868v1.pdf | |
PWC | https://paperswithcode.com/paper/using-synthetic-data-to-train-neural-networks |
Repo | |
Framework | |
PCM-TV-TFV: A Novel Two Stage Framework for Image Reconstruction from Fourier Data
Title | PCM-TV-TFV: A Novel Two Stage Framework for Image Reconstruction from Fourier Data |
Authors | Weihong Guo, Guohui Song, Yue Zhang |
Abstract | We propose in this paper a novel two-stage Projection Correction Modeling (PCM) framework for image reconstruction from (non-uniform) Fourier measurements. PCM consists of a projection stage (P-stage) motivated by the multi-scale Galerkin method and a correction stage (C-stage) with an edge guided regularity fusing together the advantages of total variation (TV) and total fractional variation (TFV). The P-stage allows for continuous modeling of the underlying image of interest. The given measurements are projected onto a space in which the image is well represented. We then enhance the reconstruction result at the C-stage that minimizes an energy functional consisting of a fidelity in the transformed domain and a novel edge guided regularity. We further develop efficient proximal algorithms to solve the corresponding optimization problem. Various numerical results in both 1D signals and 2D images have also been presented to demonstrate the superior performance of the proposed two-stage method to other classical one-stage methods. |
Tasks | Image Reconstruction |
Published | 2017-05-29 |
URL | http://arxiv.org/abs/1705.10784v1 |
http://arxiv.org/pdf/1705.10784v1.pdf | |
PWC | https://paperswithcode.com/paper/pcm-tv-tfv-a-novel-two-stage-framework-for |
Repo | |
Framework | |
Cascade Region Proposal and Global Context for Deep Object Detection
Title | Cascade Region Proposal and Global Context for Deep Object Detection |
Authors | Qiaoyong Zhong, Chao Li, Yingying Zhang, Di Xie, Shicai Yang, Shiliang Pu |
Abstract | A deep region-based object detector consists of a region proposal step and a deep object recognition step. In this paper, we make significant improvements on both steps. For region proposal we propose a novel lightweight cascade structure which can effectively improve RPN proposal quality. For object recognition we re-implement global context modeling with a few modifications and obtain a performance boost (4.2% mAP gain on the ILSVRC 2016 validation set). Besides, we apply the idea of pre-training extensively and show its importance in both steps. Together with common training and testing tricks, we improve the Faster R-CNN baseline by a large margin. In particular, we obtain 87.9% mAP on the PASCAL VOC 2012 test set, 65.3% on the ILSVRC 2016 test set and 36.8% on the COCO test-std set. |
Tasks | Object Detection, Object Recognition |
Published | 2017-10-30 |
URL | http://arxiv.org/abs/1710.10749v1 |
http://arxiv.org/pdf/1710.10749v1.pdf | |
PWC | https://paperswithcode.com/paper/cascade-region-proposal-and-global-context |
Repo | |
Framework | |