Paper Group ANR 653
Sparse Coding on Stereo Video for Object Detection
Title | Sparse Coding on Stereo Video for Object Detection |
Authors | Sheng Y. Lundquist, Melanie Mitchell, Garrett T. Kenyon |
Abstract | Deep Convolutional Neural Networks (DCNN) require millions of labeled training examples for image classification and object detection tasks, which restricts these models to domains where such datasets are available. In this paper, we explore the use of unsupervised sparse coding applied to stereo-video data to help alleviate the need for large amounts of labeled data. We show that replacing a typical supervised convolutional layer with an unsupervised sparse-coding layer within a DCNN allows for better performance on a car detection task when only a limited number of labeled training examples is available. Furthermore, the network that incorporates sparse coding allows for more consistent performance over varying initializations and orderings of training examples when compared to a fully supervised DCNN. Finally, we compare activations between the unsupervised sparse-coding layer and the supervised convolutional layer, and show that the sparse representation exhibits an encoding that is depth selective, whereas encodings from the convolutional layer do not exhibit such selectivity. These results indicate promise for using unsupervised sparse-coding approaches in real-world computer vision tasks in domains with limited labeled training data. |
Tasks | Image Classification, Object Detection |
Published | 2017-05-19 |
URL | http://arxiv.org/abs/1705.07144v2 |
http://arxiv.org/pdf/1705.07144v2.pdf | |
PWC | https://paperswithcode.com/paper/sparse-coding-on-stereo-video-for-object |
Repo | |
Framework | |
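The paper's sparse-coding layer is unsupervised and operates on stereo video inside a DCNN; the snippet below is only a patch-level sketch, assuming a fixed random dictionary and solving the standard sparse-coding objective with ISTA in NumPy. Dictionary size, sparsity weight, and iteration count are illustrative choices, not the authors' settings.

```python
import numpy as np

def soft_threshold(x, t):
    """Element-wise soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista_sparse_code(x, D, lam=0.1, n_iters=200):
    """Sparse-code a signal x with dictionary D by minimizing
    0.5*||x - D a||^2 + lam*||a||_1 via ISTA."""
    L = np.linalg.norm(D, 2) ** 2          # step size from the Lipschitz constant
    a = np.zeros(D.shape[1])
    for _ in range(n_iters):
        grad = D.T @ (D @ a - x)
        a = soft_threshold(a - grad / L, lam / L)
    return a

# Toy example: a random normalized dictionary and a random "stereo patch" vector.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128))         # 64-dim patch, 128 dictionary atoms
D /= np.linalg.norm(D, axis=0, keepdims=True)
x = rng.standard_normal(64)
code = ista_sparse_code(x, D)
print("non-zero coefficients:", np.count_nonzero(code))
```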
A multi-branch convolutional neural network for detecting double JPEG compression
Title | A multi-branch convolutional neural network for detecting double JPEG compression |
Authors | Bin Li, Hu Luo, Haoxin Zhang, Shunquan Tan, Zhongzhou Ji |
Abstract | Detection of double JPEG compression is important to forensic analysis. A few methods based on convolutional neural networks (CNNs) have been proposed, but these methods only accept inputs from pre-processed data, such as histogram features and/or decompressed images. In this paper, we present a CNN solution that uses raw DCT (discrete cosine transformation) coefficients from JPEG images as input. Considering the DCT sub-band nature of JPEG, a multi-branch CNN structure has been designed to reveal whether a JPEG image has been doubly compressed. Compared with previous methods, the proposed method provides end-to-end detection capability. Extensive experiments have been carried out to demonstrate the effectiveness of the proposed network. |
Tasks | |
Published | 2017-10-16 |
URL | http://arxiv.org/abs/1710.05477v1 |
http://arxiv.org/pdf/1710.05477v1.pdf | |
PWC | https://paperswithcode.com/paper/a-multi-branch-convolutional-neural-network |
Repo | |
Framework | |
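As a rough illustration of the multi-branch idea, the PyTorch sketch below runs separate convolutional branches over groups of DCT-coefficient planes and concatenates them before a two-class (single vs. double compression) head. The grouping of sub-bands, the layer sizes, and the input layout are assumptions for illustration and do not reproduce the authors' architecture.

```python
import torch
import torch.nn as nn

class MultiBranchDCTNet(nn.Module):
    """Toy multi-branch CNN over groups of per-sub-band DCT coefficient planes."""
    def __init__(self, bands_per_branch=(21, 21, 22)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(c, 16, kernel_size=3, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
            for c in bands_per_branch
        ])
        self.classifier = nn.Linear(32 * len(bands_per_branch), 2)  # single vs. double JPEG

    def forward(self, branch_inputs):
        feats = [b(x).flatten(1) for b, x in zip(self.branches, branch_inputs)]
        return self.classifier(torch.cat(feats, dim=1))

# 64 DCT sub-bands split into three groups of coefficient planes
# (8x8 blocks over a 64x64 image give 8x8 planes per sub-band).
inputs = [torch.randn(4, c, 8, 8) for c in (21, 21, 22)]
logits = MultiBranchDCTNet()(inputs)
print(logits.shape)  # torch.Size([4, 2])
```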
Towards Statistical Reasoning in Description Logics over Finite Domains (Full Version)
Title | Towards Statistical Reasoning in Description Logics over Finite Domains (Full Version) |
Authors | Rafael Peñaloza, Nico Potyka |
Abstract | We present a probabilistic extension of the description logic $\mathcal{ALC}$ for reasoning about statistical knowledge. We consider conditional statements over proportions of the domain and are interested in the probabilistic-logical consequences of these proportions. After introducing some general reasoning problems and analyzing their properties, we present first algorithms and complexity results for reasoning in some fragments of Statistical $\mathcal{ALC}$. |
Tasks | |
Published | 2017-06-10 |
URL | http://arxiv.org/abs/1706.03207v1 |
http://arxiv.org/pdf/1706.03207v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-statistical-reasoning-in-description |
Repo | |
Framework | |
Opinion Polarization by Learning from Social Feedback
Title | Opinion Polarization by Learning from Social Feedback |
Authors | Sven Banisch, Eckehard Olbrich |
Abstract | We explore a new mechanism to explain polarization phenomena in opinion dynamics in which agents evaluate alternative views on the basis of the social feedback obtained on expressing them. High support for the favored opinion in the social environment is treated as positive feedback that reinforces the value associated with this opinion. In connected networks of sufficiently high modularity, different groups of agents can form strong convictions of competing opinions. Linking the social feedback process to standard equilibrium concepts, we analytically characterize sufficient conditions for the stability of bi-polarization. While previous models have emphasized the polarization effects of deliberative argument-based communication, our model highlights an affective experience-based route to polarization, without assumptions about negative influence or bounded confidence. |
Tasks | |
Published | 2017-04-07 |
URL | http://arxiv.org/abs/1704.02890v3 |
http://arxiv.org/pdf/1704.02890v3.pdf | |
PWC | https://paperswithcode.com/paper/opinion-polarization-by-learning-from-social |
Repo | |
Framework | |
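A toy agent-based sketch of the feedback mechanism: each agent keeps a value for each of two opinions, expresses the one it values more, and reinforces that value when a randomly sampled neighbor agrees. The network, learning rate, and update rule are illustrative assumptions, not the paper's exact model or its equilibrium analysis.

```python
import random

def simulate(adjacency, n_steps=10000, alpha=0.05, seed=1):
    """Toy social-feedback learning: an agent reinforces the opinion it expresses
    in proportion to the (dis)approval of a randomly sampled neighbor."""
    random.seed(seed)
    n = len(adjacency)
    # Each agent holds a value for opinion -1 and opinion +1.
    values = [{-1: random.uniform(-0.1, 0.1), +1: random.uniform(-0.1, 0.1)} for _ in range(n)]
    for _ in range(n_steps):
        i = random.randrange(n)
        expressed = max(values[i], key=values[i].get)      # express the favored opinion
        j = random.choice(adjacency[i])                    # a neighbor gives feedback
        feedback = 1.0 if max(values[j], key=values[j].get) == expressed else -1.0
        values[i][expressed] += alpha * (feedback - values[i][expressed])
    return [max(v, key=v.get) for v in values]

# Two densely connected communities joined by a single weak tie (high modularity).
group_a, group_b = range(0, 10), range(10, 20)
adjacency = {i: [j for j in group_a if j != i] for i in group_a}
adjacency.update({i: [j for j in group_b if j != i] for i in group_b})
adjacency[0] = adjacency[0] + [10]
adjacency[10] = adjacency[10] + [0]
print(simulate(adjacency))  # typically one opinion per community, i.e. bi-polarization
```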
Multilevel Modeling with Structured Penalties for Classification from Imaging Genetics data
Title | Multilevel Modeling with Structured Penalties for Classification from Imaging Genetics data |
Authors | Pascal Lu, Olivier Colliot |
Abstract | In this paper, we propose a framework for automatic classification of patients from multimodal genetic and brain imaging data by optimally combining them. Additive models with unadapted penalties (such as the classical group lasso penalty or $L_1$-multiple kernel learning) treat all modalities in the same manner and can result in undesirable elimination of specific modalities when their contributions are unbalanced. To overcome this limitation, we introduce a multilevel model that combines imaging and genetics and that considers joint effects between these two modalities for diagnosis prediction. Furthermore, we propose a framework that allows combining several penalties that take into account the structure of the different types of data, such as a group lasso penalty over the genetic modality and an $L_2$-penalty on imaging modalities. Finally, we propose a fast optimization algorithm based on a proximal gradient method. The model has been evaluated on genetic (single nucleotide polymorphism, SNP) and imaging (anatomical MRI measures) data from the ADNI database, and compared to additive models. It exhibits good performance in AD diagnosis and, at the same time, reveals relationships between genes, brain regions and the disease status. |
Tasks | |
Published | 2017-10-10 |
URL | http://arxiv.org/abs/1710.03627v1 |
http://arxiv.org/pdf/1710.03627v1.pdf | |
PWC | https://paperswithcode.com/paper/multilevel-modeling-with-structured-penalties |
Repo | |
Framework | |
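As a hedged illustration of the optimization ingredient, the sketch below runs a proximal-gradient loop for logistic loss with a group-lasso proximal step on a "genetic" block and a ridge ($L_2$) shrinkage on an "imaging" block. The loss, grouping, step size, and synthetic data are assumptions; this is not the authors' multilevel model.

```python
import numpy as np

def prox_group_lasso(w, groups, t):
    """Block soft-thresholding: prox of t * sum_g ||w_g||_2."""
    w = w.copy()
    for g in groups:
        norm = np.linalg.norm(w[g])
        w[g] = 0.0 if norm == 0 else max(0.0, 1.0 - t / norm) * w[g]
    return w

def proximal_gradient(X_gen, X_img, y, groups, lam_grp=0.1, lam_l2=0.1, n_iters=500):
    """Logistic loss + group lasso on the genetic block + L2 on the imaging block."""
    X = np.hstack([X_gen, X_img])
    p_gen = X_gen.shape[1]
    w = np.zeros(X.shape[1])
    step = 4.0 / (np.linalg.norm(X, 2) ** 2)        # crude Lipschitz-based step size
    for _ in range(n_iters):
        p = 1.0 / (1.0 + np.exp(-X @ w))            # predicted probabilities
        w = w - step * (X.T @ (p - y))              # gradient step on the smooth loss
        w[:p_gen] = prox_group_lasso(w[:p_gen], groups, step * lam_grp)
        w[p_gen:] = w[p_gen:] / (1.0 + step * lam_l2)   # prox of (lam_l2/2)*||w||^2
    return w

# Toy data: 20 SNP-like features in 4 groups, 10 imaging-like features.
rng = np.random.default_rng(0)
X_gen, X_img = rng.standard_normal((100, 20)), rng.standard_normal((100, 10))
y = (rng.random(100) < 0.5).astype(float)
groups = [slice(i, i + 5) for i in range(0, 20, 5)]
print(np.round(proximal_gradient(X_gen, X_img, y, groups), 3))
```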
Dominance Move: A Measure of Comparing Solution Sets in Multiobjective Optimization
Title | Dominance Move: A Measure of Comparing Solution Sets in Multiobjective Optimization |
Authors | Miqing Li, Xin Yao |
Abstract | One of the most common approaches for multiobjective optimization is to generate a solution set that well approximates the whole Pareto-optimal frontier to facilitate the later decision-making process. However, how to evaluate and compare the quality of different solution sets remains challenging. Existing measures typically require additional problem knowledge and information, such as a reference point or a substituted set of the Pareto-optimal frontier. In this paper, we propose a quality measure, called dominance move (DoM), to compare solution sets generated by multiobjective optimizers. Given two solution sets, DoM measures the minimum sum of move distances for one set to weakly Pareto dominate the other set. DoM can be seen as a natural reflection of the difference between two solution sets: it captures all aspects of solution set quality, is compliant with Pareto dominance, and does not need any additional problem knowledge or parameters. We present an exact method to calculate the DoM in the biobjective case. We show the necessary condition for constructing the optimal partition for a solution set's minimum move, and accordingly propose an efficient algorithm to recursively calculate the DoM. Finally, DoM is evaluated on several groups of artificial and real test cases as well as by a comparison with two well-established quality measures. |
Tasks | Decision Making, Multiobjective Optimization |
Published | 2017-02-01 |
URL | http://arxiv.org/abs/1702.00477v1 |
http://arxiv.org/pdf/1702.00477v1.pdf | |
PWC | https://paperswithcode.com/paper/dominance-move-a-measure-of-comparing |
Repo | |
Framework | |
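The paper gives an exact biobjective algorithm; the sketch below only computes a naive upper bound on DoM for minimization problems by charging each point of Q the cheapest move (measured here with the L1 distance, an assumption) that would make some point of P weakly dominate it. The optimal joint assignment of moves can be cheaper, so this illustrates the definition rather than the paper's method.

```python
def dom_upper_bound(P, Q):
    """Crude upper bound on the dominance move DoM(P, Q) for minimization:
    each q in Q is charged the cheapest L1 move that makes some p in P
    weakly dominate it (p_d <= q_d in every objective d)."""
    total = 0.0
    for q in Q:
        total += min(sum(max(0.0, p_d - q_d) for p_d, q_d in zip(p, q)) for p in P)
    return total

# Biobjective toy sets (both objectives to be minimized).
P = [(1.0, 4.0), (3.0, 2.0)]
Q = [(1.0, 3.0), (2.0, 2.0), (4.0, 1.0)]
print(dom_upper_bound(P, Q))  # 0.0 would mean P already weakly dominates Q
```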
Spatiotemporal Networks for Video Emotion Recognition
Title | Spatiotemporal Networks for Video Emotion Recognition |
Authors | Lijie Fan, Yunjie Ke |
Abstract | Our experiment adapts several popular deep learning methods as well as some traditional methods to the problem of video emotion recognition. In our experiment, we use the CNN-LSTM architecture for visual information extraction and classification, and utilize traditional methods for audio feature classification. For multimodal fusion, we use the traditional Support Vector Machine. Our experiment yields a good result on the AFEW 6.0 Dataset. |
Tasks | Emotion Recognition, Video Emotion Recognition |
Published | 2017-04-03 |
URL | http://arxiv.org/abs/1704.00570v3 |
http://arxiv.org/pdf/1704.00570v3.pdf | |
PWC | https://paperswithcode.com/paper/spatiotemporal-networks-for-video-emotion |
Repo | |
Framework | |
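A minimal PyTorch sketch of the visual branch: per-frame CNN features feed an LSTM, and the final hidden state is classified into seven emotion categories (the AFEW label set). The tiny backbone and the feature and hidden sizes are illustrative assumptions rather than the authors' configuration, and the audio branch and SVM fusion are omitted.

```python
import torch
import torch.nn as nn

class CNNLSTMEmotion(nn.Module):
    """Per-frame CNN features -> LSTM over time -> emotion logits."""
    def __init__(self, n_classes=7, feat_dim=128, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(                     # tiny stand-in for a pretrained backbone
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim), nn.ReLU(),
        )
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, clips):                          # clips: (batch, time, 3, H, W)
        b, t = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1)).view(b, t, -1)
        _, (h_n, _) = self.lstm(feats)
        return self.head(h_n[-1])                      # classify the last hidden state

logits = CNNLSTMEmotion()(torch.randn(2, 16, 3, 64, 64))
print(logits.shape)  # torch.Size([2, 7])
```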
Visual Discovery at Pinterest
Title | Visual Discovery at Pinterest |
Authors | Andrew Zhai, Dmitry Kislyuk, Yushi Jing, Michael Feng, Eric Tzeng, Jeff Donahue, Yue Li Du, Trevor Darrell |
Abstract | Over the past three years Pinterest has experimented with several visual search and recommendation services, including Related Pins (2014), Similar Looks (2015), Flashlight (2016) and Lens (2017). This paper presents an overview of our visual discovery engine powering these services, and shares the rationales behind our technical and product decisions such as the use of object detection and interactive user interfaces. We conclude that this visual discovery engine significantly improves engagement in both search and recommendation tasks. |
Tasks | Object Detection |
Published | 2017-02-15 |
URL | http://arxiv.org/abs/1702.04680v2 |
http://arxiv.org/pdf/1702.04680v2.pdf | |
PWC | https://paperswithcode.com/paper/visual-discovery-at-pinterest |
Repo | |
Framework | |
RGB-D Salient Object Detection Based on Discriminative Cross-modal Transfer Learning
Title | RGB-D Salient Object Detection Based on Discriminative Cross-modal Transfer Learning |
Authors | Hao Chen, Y. F. Li, Dan Su |
Abstract | In this work, we propose to utilize Convolutional Neural Networks to boost the performance of depth-induced salient object detection by capturing the high-level representative features of the depth modality. We formulate depth-induced saliency detection as a CNN-based cross-modal transfer problem to bridge the gap between the “data-hungry” nature of CNNs and the unavailability of sufficient labeled training data in the depth modality. In the proposed approach, we leverage auxiliary data from the source modality effectively by training the RGB saliency detection network to obtain task-specific pre-understanding layers for the target modality. Meanwhile, we exploit depth-specific information by pre-training a modality classification network that encourages modality-specific representations during optimization. This encourages the feature representations of the RGB and depth modalities to be as discriminative as possible. These two modules are pre-trained independently and then stitched together to initialize and optimize the eventual depth-induced saliency detection model. Experiments demonstrate the effectiveness of the proposed novel pre-training strategy as well as the significant and consistent improvements of the proposed approach over other state-of-the-art methods. |
Tasks | Object Detection, Saliency Detection, Salient Object Detection, Transfer Learning |
Published | 2017-03-01 |
URL | http://arxiv.org/abs/1703.00122v2 |
http://arxiv.org/pdf/1703.00122v2.pdf | |
PWC | https://paperswithcode.com/paper/rgb-d-salient-object-detection-based-on |
Repo | |
Framework | |
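A schematic PyTorch sketch of the "pre-train and stitch" strategy described above: the lower (encoder) layers of a depth-saliency model come from a modality-classification network, while the task head comes from the RGB saliency network. The layer shapes, the split point, and the use of PyTorch are illustrative assumptions.

```python
import torch
import torch.nn as nn

def make_encoder(in_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    )

# Stage 1 (pre-training, assumed): an RGB saliency net and a modality classifier.
rgb_saliency = nn.Sequential(make_encoder(3), nn.Conv2d(32, 1, 1))        # encoder + saliency head
modality_net = nn.Sequential(make_encoder(1), nn.AdaptiveAvgPool2d(1),
                             nn.Flatten(), nn.Linear(32, 2))              # RGB-vs-depth classifier
# ... pre-train rgb_saliency on RGB saliency data and modality_net on modality labels ...

# Stage 2: stitch -- depth-specific encoder from the modality net, task head from the RGB net.
depth_saliency = nn.Sequential(modality_net[0], rgb_saliency[1])
out = depth_saliency(torch.randn(2, 1, 64, 64))
print(out.shape)  # torch.Size([2, 1, 64, 64]) saliency map per depth input
```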
Viewpoint Selection for Photographing Architectures
Title | Viewpoint Selection for Photographing Architectures |
Authors | Jingwu He, Linbo Wang, Wenzhe Zhou, Hongjie Zhang, Xiufen Cui, Yanwen Guo |
Abstract | This paper studies the problem of how to choose good viewpoints for taking photographs of architecture. We achieve this by learning from professional photographs of world-famous landmarks that are available on the Internet. Unlike previous efforts devoted to photo quality assessment, which mainly rely on 2D image features, we show in this paper that combining 2D image features extracted from images with 3D geometric features computed on the 3D models can result in more reliable evaluation of viewpoint quality. Specifically, we collect a set of photographs for each of 15 world-famous architectural landmarks as well as their 3D models from the Internet. Viewpoint recovery for images is carried out through an image-model registration process, after which a newly proposed viewpoint clustering strategy is exploited to validate users' viewpoint preferences when photographing landmarks. Finally, we extract a number of 2D and 3D features for each image based on multiple visual and geometric cues and perform viewpoint recommendation by learning from both 2D and 3D features using a specifically designed SVM-2K multi-view learner, achieving superior performance over using solely 2D or 3D features. We show the effectiveness of the proposed approach through extensive experiments. The experiments also demonstrate that our system can be used to recommend viewpoints for rendering textured 3D models of buildings for use in architectural design, in addition to evaluating viewpoints of photographs and recommending viewpoints for photographing architecture in practice. |
Tasks | |
Published | 2017-03-06 |
URL | http://arxiv.org/abs/1703.01702v1 |
http://arxiv.org/pdf/1703.01702v1.pdf | |
PWC | https://paperswithcode.com/paper/viewpoint-selection-for-photographing |
Repo | |
Framework | |
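As a simplified stand-in for the SVM-2K multi-view learner, the sketch below just concatenates per-viewpoint 2D and 3D feature vectors and trains an ordinary SVM on synthetic data; it illustrates the two-view input, not the paper's learner, its features, or its results.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for per-viewpoint descriptors.
rng = np.random.default_rng(0)
feat_2d = rng.standard_normal((300, 40))      # image-based cues (illustrative dimension)
feat_3d = rng.standard_normal((300, 20))      # geometry-based cues (illustrative dimension)
labels = (rng.random(300) < 0.5).astype(int)  # 1 = good viewpoint, 0 = poor viewpoint

X = np.hstack([feat_2d, feat_3d])             # early fusion of the two views
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.25, random_state=0)
clf = SVC(kernel="rbf", C=1.0).fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```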
Generalization in Deep Learning
Title | Generalization in Deep Learning |
Authors | Kenji Kawaguchi, Leslie Pack Kaelbling, Yoshua Bengio |
Abstract | This paper provides non-vacuous and numerically-tight generalization guarantees for deep learning, as well as theoretical insights into why and how deep learning can generalize well, despite its large capacity, complexity, possible algorithmic instability, nonrobustness, and sharp minima, responding to an open question in the literature. We also propose new open problems and discuss the limitations of our results. |
Tasks | |
Published | 2017-10-16 |
URL | https://arxiv.org/abs/1710.05468v5 |
https://arxiv.org/pdf/1710.05468v5.pdf | |
PWC | https://paperswithcode.com/paper/generalization-in-deep-learning |
Repo | |
Framework | |
Algorithmically probable mutations reproduce aspects of evolution such as convergence rate, genetic memory, and modularity
Title | Algorithmically probable mutations reproduce aspects of evolution such as convergence rate, genetic memory, and modularity |
Authors | Santiago Hernández-Orozco, Narsis A. Kiani, Hector Zenil |
Abstract | Natural selection explains how life has evolved over millions of years from more primitive forms. The speed at which this happens, however, has sometimes defied formal explanations when based on random (uniformly distributed) mutations. Here we investigate the application of a simplicity bias based on a natural but algorithmic distribution of mutations (no recombination) in various examples, particularly binary matrices, in order to compare evolutionary convergence rates. Results on both synthetic and small biological examples indicate an accelerated rate when mutations are not statistically uniform but \textit{algorithmically uniform}. We show that algorithmic distributions can evolve modularity and genetic memory by preserving structures when they first occur, sometimes leading to an accelerated production of diversity but also to population extinctions, possibly explaining naturally occurring phenomena such as diversity explosions (e.g. the Cambrian) and massive extinctions (e.g. the End Triassic) whose causes are currently debated. The natural approach introduced here appears to be a better approximation to biological evolution than models based exclusively on random uniform mutations, and it also approaches a formal version of open-ended evolution based on previous formal results. These results support the suggestion that computation may be an equally important driver of evolution. We also show that applying the method in optimization settings, such as genetic algorithms, has the potential to accelerate the convergence of artificial evolutionary algorithms. |
Tasks | |
Published | 2017-09-01 |
URL | http://arxiv.org/abs/1709.00268v8 |
http://arxiv.org/pdf/1709.00268v8.pdf | |
PWC | https://paperswithcode.com/paper/algorithmically-probable-mutations-reproduce |
Repo | |
Framework | |
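A crude sketch of the contrast the paper studies: flip one cell of a binary matrix either uniformly at random or with a simplicity bias, here approximated by weighting candidate offspring by 2^(-K) with K estimated by zlib-compressed length. The compressor is only a stand-in; the paper works with an algorithmic (universal-distribution-based) mutation distribution rather than general-purpose compression.

```python
import random
import zlib

import numpy as np

random.seed(0)

def compressed_len(m):
    """Crude complexity proxy: zlib-compressed length of the flattened matrix."""
    return len(zlib.compress(m.astype(np.uint8).tobytes()))

def mutate(m, algorithmic=True):
    """Flip one cell: uniformly at random, or biased toward simpler offspring."""
    cells = [(i, j) for i in range(m.shape[0]) for j in range(m.shape[1])]
    if algorithmic:
        # Weight each candidate flip by 2^(-K), with K approximated by zlib.
        weights = []
        for i, j in cells:
            child = m.copy()
            child[i, j] ^= 1
            weights.append(2.0 ** (-compressed_len(child)))
        i, j = random.choices(cells, weights=weights, k=1)[0]
    else:
        i, j = random.choice(cells)
    child = m.copy()
    child[i, j] ^= 1
    return child

m = np.zeros((8, 8), dtype=np.uint8)
m[2:6, 2:6] = 1   # a simple structured "genome"
print(compressed_len(m), compressed_len(mutate(m)), compressed_len(mutate(m, algorithmic=False)))
```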
Improving Max-Sum through Decimation to Solve Loopy Distributed Constraint Optimization Problems
Title | Improving Max-Sum through Decimation to Solve Loopy Distributed Constraint Optimization Problems |
Authors | Jesús Cerquides, Rémi Emonet, Gauthier Picard, Juan A. Rodríguez-Aguilar |
Abstract | In the context of solving large distributed constraint optimization problems (DCOP), belief-propagation and approximate inference algorithms are candidates of choice. However, in general, when the factor graph is very loopy (i.e. cyclic), these solution methods suffer from poor performance due to non-convergence and the large number of exchanged messages. To improve the performance of the Max-Sum inference algorithm when solving loopy constraint optimization problems, we propose here to take inspiration from the belief-propagation-guided decimation used to solve sparse random graph problems (k-satisfiability). We propose the novel DeciMaxSum method, which is parameterized in terms of policies to decide when to trigger decimation, which variables to decimate, and which values to assign to decimated variables. Based on an empirical evaluation on a classical BP benchmark (the Ising model), some of these combinations of policies exhibit better performance than state-of-the-art competitors. |
Tasks | |
Published | 2017-06-07 |
URL | http://arxiv.org/abs/1706.02209v1 |
http://arxiv.org/pdf/1706.02209v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-max-sum-through-decimation-to-solve |
Repo | |
Framework | |
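A schematic, runnable sketch of the decimation control loop, with the three policies (when to decimate, which variable, which value) shown inline and the Max-Sum message passing replaced by a dummy stand-in that returns random beliefs. It shows only the structure of DeciMaxSum-style decimation, not the authors' algorithm or benchmarks.

```python
import random

random.seed(0)

def run_max_sum(variables, domains, fixed):
    """Stand-in for Max-Sum message passing on the factor graph: here it just
    returns random 'beliefs' so the decimation control loop below can run."""
    return {v: {d: random.random() for d in domains[v]} for v in variables if v not in fixed}

def deci_max_sum(variables, domains, trigger_every=1):
    """Schematic DeciMaxSum loop: alternately run Max-Sum and decimate (fix) a variable."""
    fixed = {}
    rounds = 0
    while len(fixed) < len(variables):
        beliefs = run_max_sum(variables, domains, fixed)
        rounds += 1
        if rounds % trigger_every == 0:                    # policy 1: when to decimate
            def gap(v):                                    # policy 2: which variable
                vals = sorted(beliefs[v].values(), reverse=True)
                return vals[0] - (vals[1] if len(vals) > 1 else 0.0)
            v = max(beliefs, key=gap)
            fixed[v] = max(beliefs[v], key=beliefs[v].get) # policy 3: which value
    return fixed

variables = ["x1", "x2", "x3"]
domains = {v: [0, 1] for v in variables}
print(deci_max_sum(variables, domains))
```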
Predicting Role Relevance with Minimal Domain Expertise in a Financial Domain
Title | Predicting Role Relevance with Minimal Domain Expertise in a Financial Domain |
Authors | Mayank Kejriwal |
Abstract | Word embeddings have made enormous inroads in recent years in a wide variety of text mining applications. In this paper, we explore a word embedding-based architecture for predicting the relevance of a role between two financial entities within the context of natural language sentences. In this extended abstract, we propose a pooled approach that uses a collection of sentences to train word embeddings using the skip-gram word2vec architecture. We use the word embeddings to obtain context vectors that are assigned one or more labels based on manual annotations. We train a machine learning classifier using the labeled context vectors, and use the trained classifier to predict contextual role relevance on test data. Our approach serves as a good minimal-expertise baseline for the task as it is simple and intuitive, uses open-source modules, requires little feature crafting effort and performs well across roles. |
Tasks | Word Embeddings |
Published | 2017-04-19 |
URL | http://arxiv.org/abs/1704.05571v1 |
http://arxiv.org/pdf/1704.05571v1.pdf | |
PWC | https://paperswithcode.com/paper/predicting-role-relevance-with-minimal-domain |
Repo | |
Framework | |
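A hedged sketch of the pooled pipeline using gensim and scikit-learn (library choices assumed, not stated in the abstract): train skip-gram word2vec on tokenized sentences, build a context vector per sentence by averaging its word vectors (one simple pooling choice), and fit an off-the-shelf classifier on labeled context vectors.

```python
import numpy as np
from gensim.models import Word2Vec
from sklearn.linear_model import LogisticRegression

# Tiny toy corpus of tokenized sentences mentioning pairs of financial entities.
sentences = [
    "acme corp acquired beta bank last year".split(),
    "beta bank appointed a new chief executive".split(),
    "acme corp reported quarterly earnings growth".split(),
    "gamma fund sold its stake in beta bank".split(),
]
labels = [1, 0, 0, 1]   # toy annotation: 1 = role relevant, 0 = not relevant

# Skip-gram word2vec (sg=1) trained on the pooled sentence collection.
w2v = Word2Vec(sentences, vector_size=32, sg=1, window=3, min_count=1, seed=0, epochs=50)

def context_vector(tokens):
    """Average the word vectors of a sentence (one simple pooling choice)."""
    return np.mean([w2v.wv[t] for t in tokens], axis=0)

X = np.stack([context_vector(s) for s in sentences])
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict(X))
```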
Entanglement Entropy of Target Functions for Image Classification and Convolutional Neural Network
Title | Entanglement Entropy of Target Functions for Image Classification and Convolutional Neural Network |
Authors | Ya-Hui Zhang |
Abstract | The success of deep convolutional neural networks (CNNs) in computer vision, especially on image classification problems, calls for a new information theory for functions of images rather than for images themselves. In this article, after establishing a deep mathematical connection between the image classification problem and the quantum spin model, we propose to use entanglement entropy, a generalization of the classical Boltzmann-Shannon entropy, as a powerful tool to characterize the information needed to represent a general function of an image. We prove that there is a sub-volume-law bound on the entanglement entropy of target functions of reasonable image classification problems. Therefore, target functions of image classification only occupy a small subspace of the whole Hilbert space. As a result, a neural network with a polynomial number of parameters can efficiently represent such target functions of images. The concept of entanglement entropy can also be useful for characterizing the expressive power of different neural networks. For example, we show that, to maintain the same expressive power, the number of channels $D$ in a convolutional neural network should scale with the number of convolution layers $n_c$ as $D\sim D_0^{\frac{1}{n_c}}$. Therefore, a deeper CNN with large $n_c$ is more efficient than a shallow one. |
Tasks | Image Classification |
Published | 2017-10-16 |
URL | http://arxiv.org/abs/1710.05520v1 |
http://arxiv.org/pdf/1710.05520v1.pdf | |
PWC | https://paperswithcode.com/paper/entanglement-entropy-of-target-functions-for |
Repo | |
Framework | |
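A quick numerical reading of the claimed scaling $D\sim D_0^{\frac{1}{n_c}}$: with an illustrative $D_0 = 4096$, spreading the computation over more convolution layers sharply reduces the number of channels needed per layer.

```python
# Toy check of the claimed channel scaling D ~ D0**(1/n_c).
D0 = 4096                      # channels a 1-layer net would need (illustrative value)
for n_c in (1, 2, 4, 6):
    D = D0 ** (1.0 / n_c)
    print(f"n_c = {n_c}: D ~ {D:.1f} channels per layer")
```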