July 27, 2019


Paper Group ANR 653

Sparse Coding on Stereo Video for Object Detection. A multi-branch convolutional neural network for detecting double JPEG compression. Towards Statistical Reasoning in Description Logics over Finite Domains (Full Version). Opinion Polarization by Learning from Social Feedback. Multilevel Modeling with Structured Penalties for Classification from Im …

Sparse Coding on Stereo Video for Object Detection


Title	Sparse Coding on Stereo Video for Object Detection
Authors	Sheng Y. Lundquist, Melanie Mitchell, Garrett T. Kenyon
Abstract	Deep Convolutional Neural Networks (DCNN) require millions of labeled training examples for image classification and object detection tasks, which restricts these models to domains where such datasets are available. In this paper, we explore the use of unsupervised sparse coding applied to stereo-video data to help alleviate the need for large amounts of labeled data. We show that replacing a typical supervised convolutional layer with an unsupervised sparse-coding layer within a DCNN allows for better performance on a car detection task when only a limited number of labeled training examples is available. Furthermore, the network that incorporates sparse coding allows for more consistent performance over varying initializations and ordering of training examples when compared to a fully supervised DCNN. Finally, we compare activations between the unsupervised sparse-coding layer and the supervised convolutional layer, and show that the sparse representation exhibits an encoding that is depth selective, whereas encodings from the convolutional layer do not exhibit such selectivity. These results indicate promise for using unsupervised sparse-coding approaches in real-world computer vision tasks in domains with limited labeled training data.
Tasks	Image Classification, Object Detection
Published	2017-05-19
URL	http://arxiv.org/abs/1705.07144v2
PDF	http://arxiv.org/pdf/1705.07144v2.pdf
PWC	https://paperswithcode.com/paper/sparse-coding-on-stereo-video-for-object
Repo
Framework

A multi-branch convolutional neural network for detecting double JPEG compression


Title	A multi-branch convolutional neural network for detecting double JPEG compression
Authors	Bin Li, Hu Luo, Haoxin Zhang, Shunquan Tan, Zhongzhou Ji
Abstract	Detection of double JPEG compression is important to forensic analysis. A few methods based on convolutional neural networks (CNNs) have been proposed. These methods only accept inputs from pre-processed data, such as histogram features and/or decompressed images. In this paper, we present a CNN solution that uses raw DCT (discrete cosine transform) coefficients from JPEG images as input. Considering the DCT sub-band nature of JPEG, a multiple-branch CNN structure has been designed to reveal whether a JPEG format image has been doubly compressed. Compared with previous methods, the proposed method provides end-to-end detection capability. Extensive experiments have been carried out to demonstrate the effectiveness of the proposed network.
Tasks
Published	2017-10-16
URL	http://arxiv.org/abs/1710.05477v1
PDF	http://arxiv.org/pdf/1710.05477v1.pdf
PWC	https://paperswithcode.com/paper/a-multi-branch-convolutional-neural-network
Repo
Framework

Towards Statistical Reasoning in Description Logics over Finite Domains (Full Version)


Title	Towards Statistical Reasoning in Description Logics over Finite Domains (Full Version)
Authors	Rafael Peñaloza, Nico Potyka
Abstract	We present a probabilistic extension of the description logic $\mathcal{ALC}$ for reasoning about statistical knowledge. We consider conditional statements over proportions of the domain and are interested in the probabilistic-logical consequences of these proportions. After introducing some general reasoning problems and analyzing their properties, we present first algorithms and complexity results for reasoning in some fragments of Statistical $\mathcal{ALC}$.
Tasks
Published	2017-06-10
URL	http://arxiv.org/abs/1706.03207v1
PDF	http://arxiv.org/pdf/1706.03207v1.pdf
PWC	https://paperswithcode.com/paper/towards-statistical-reasoning-in-description
Repo
Framework

Opinion Polarization by Learning from Social Feedback


Title	Opinion Polarization by Learning from Social Feedback
Authors	Sven Banisch, Eckehard Olbrich
Abstract	We explore a new mechanism to explain polarization phenomena in opinion dynamics in which agents evaluate alternative views on the basis of the social feedback obtained on expressing them. High support of the favored opinion in the social environment is treated as positive feedback, which reinforces the value associated with this opinion. In connected networks of sufficiently high modularity, different groups of agents can form strong convictions of competing opinions. Linking the social feedback process to standard equilibrium concepts, we analytically characterize sufficient conditions for the stability of bi-polarization. While previous models have emphasized the polarization effects of deliberative argument-based communication, our model highlights an affective experience-based route to polarization, without assumptions about negative influence or bounded confidence.
Tasks
Published	2017-04-07
URL	http://arxiv.org/abs/1704.02890v3
PDF	http://arxiv.org/pdf/1704.02890v3.pdf
PWC	https://paperswithcode.com/paper/opinion-polarization-by-learning-from-social
Repo
Framework
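
The abstract above describes agents that reinforce the value of an opinion when expressing it draws support. A minimal sketch of such a value update follows. The incremental update rule, the rewards of +1 and -1, and the learning rate `alpha` are assumptions made for illustration; the abstract does not state the model's exact equations.

```python
def update_value(values, opinion, reward, alpha=0.1):
    """Move the value of the expressed opinion toward the received reward.

    Assumed rule (not taken from the abstract): an exponential moving
    average with learning rate alpha.
    """
    values[opinion] += alpha * (reward - values[opinion])
    return values


def favored(values):
    """Return the opinion with the highest current value."""
    return max(values, key=values.get)


# Hypothetical agent with two opinions, both initially valued at zero.
values = {"A": 0.0, "B": 0.0}
update_value(values, "A", reward=1.0)   # expressing A met agreement
update_value(values, "B", reward=-1.0)  # expressing B met disagreement
choice = favored(values)                # the agent now favors A
```

In a full simulation, each agent would repeat this with neighbors on a network, so that densely connected groups can reinforce different opinions.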

Multilevel Modeling with Structured Penalties for Classification from Imaging Genetics data


Title	Multilevel Modeling with Structured Penalties for Classification from Imaging Genetics data
Authors	Pascal Lu, Olivier Colliot
Abstract	In this paper, we propose a framework for automatic classification of patients from multimodal genetic and brain imaging data by optimally combining them. Additive models with unadapted penalties (such as the classical group lasso penalty or $L_1$-multiple kernel learning) treat all modalities in the same manner and can result in undesirable elimination of specific modalities when their contributions are unbalanced. To overcome this limitation, we introduce a multilevel model that combines imaging and genetics and that considers joint effects between these two modalities for diagnosis prediction. Furthermore, we propose a framework that allows several penalties to be combined, taking into account the structure of the different types of data, such as a group lasso penalty over the genetic modality and an $L_2$-penalty on imaging modalities. Finally, we propose a fast optimization algorithm based on a proximal gradient method. The model has been evaluated on genetic (single nucleotide polymorphisms, SNP) and imaging (anatomical MRI measures) data from the ADNI database, and compared to additive models. It exhibits good performance in AD diagnosis and, at the same time, reveals relationships between genes, brain regions and the disease status.
Tasks
Published	2017-10-10
URL	http://arxiv.org/abs/1710.03627v1
PDF	http://arxiv.org/pdf/1710.03627v1.pdf
PWC	https://paperswithcode.com/paper/multilevel-modeling-with-structured-penalties
Repo
Framework
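
The abstract above mentions a group lasso penalty optimized with a proximal gradient method. The sketch below shows the standard proximal operator of the group lasso (block soft-thresholding) and one generic proximal gradient step. It is not the authors' algorithm: the grouping, the toy coefficients, and the function names are hypothetical.

```python
import math


def prox_group_lasso(v, groups, threshold):
    """Block soft-thresholding: shrink each group's L2 norm by `threshold`.

    Groups whose norm is below the threshold are set to zero as a whole;
    indices that belong to no group are left unchanged.
    """
    out = list(v)
    for idx in groups:
        norm = math.sqrt(sum(v[i] ** 2 for i in idx))
        scale = max(0.0, 1.0 - threshold / norm) if norm > 0 else 0.0
        for i in idx:
            out[i] = scale * v[i]
    return out


def proximal_gradient_step(w, grad, step, groups, lam):
    """Gradient step on the smooth loss, then the prox of the penalty."""
    moved = [wi - step * gi for wi, gi in zip(w, grad)]
    return prox_group_lasso(moved, groups, step * lam)


# Hypothetical coefficients in two groups (e.g. SNPs grouped by gene).
w = [3.0, 4.0, 0.1, -0.1]
groups = [[0, 1], [2, 3]]
shrunk = prox_group_lasso(w, groups, threshold=1.0)
# The first group (norm 5) is scaled by 0.8; the second group is zeroed.
```

Zeroing whole groups at once is what lets a group lasso penalty select or discard a set of related features together.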

Dominance Move: A Measure of Comparing Solution Sets in Multiobjective Optimization


Title	Dominance Move: A Measure of Comparing Solution Sets in Multiobjective Optimization
Authors	Miqing Li, Xin Yao
Abstract	One of the most common approaches for multiobjective optimization is to generate a solution set that approximates the whole Pareto-optimal frontier well, to facilitate the later decision-making process. However, how to evaluate and compare the quality of different solution sets remains challenging. Existing measures typically require additional problem knowledge and information, such as a reference point or a substituted set of the Pareto-optimal frontier. In this paper, we propose a quality measure, called dominance move (DoM), to compare solution sets generated by multiobjective optimizers. Given two solution sets, DoM measures the minimum sum of move distances for one set to weakly Pareto dominate the other set. DoM can be seen as a natural reflection of the difference between two solutions: it captures all aspects of solution sets’ quality, is compliant with Pareto dominance, and does not need any additional problem knowledge or parameters. We present an exact method to calculate the DoM in the biobjective case. We show the necessary condition of constructing the optimal partition for a solution set’s minimum move, and accordingly propose an efficient algorithm to recursively calculate the DoM. Finally, DoM is evaluated on several groups of artificial and real test cases as well as by a comparison with two well-established quality measures.
Tasks	Decision Making, Multiobjective Optimization
Published	2017-02-01
URL	http://arxiv.org/abs/1702.00477v1
PDF	http://arxiv.org/pdf/1702.00477v1.pdf
PWC	https://paperswithcode.com/paper/dominance-move-a-measure-of-comparing
Repo
Framework
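
To make the idea of a "move" toward weak Pareto dominance concrete, the sketch below checks weak dominance and computes how far one point must move to weakly dominate another. It assumes minimization of every objective and an L1 (Manhattan) move distance, neither of which is stated in the abstract. The last function is only a naive upper bound on the total move, not the exact biobjective algorithm the paper proposes.

```python
def weakly_dominates(p, q):
    """True if p is no worse than q in every objective (minimization)."""
    return all(pi <= qi for pi, qi in zip(p, q))


def move_distance(p, q):
    """Smallest L1 move that makes p weakly dominate q (assumed metric)."""
    return sum(max(pi - qi, 0.0) for pi, qi in zip(p, q))


def set_weakly_dominates(P, Q):
    """True if every point of Q is weakly dominated by some point of P."""
    return all(any(weakly_dominates(p, q) for p in P) for q in Q)


def naive_move_upper_bound(P, Q):
    """Cover each q with its cheapest p independently.

    This ignores that one moved point may cover several q at lower cost,
    so it can overestimate the minimum total move.
    """
    return sum(min(move_distance(p, q) for p in P) for q in Q)


# Hypothetical biobjective solution sets.
P = [(1, 4), (3, 2)]
Q = [(2, 3), (4, 1)]
already_dominates = set_weakly_dominates(P, Q)
bound = naive_move_upper_bound(P, Q)
```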

Spatiotemporal Networks for Video Emotion Recognition


Title	Spatiotemporal Networks for Video Emotion Recognition
Authors	Lijie Fan, Yunjie Ke
Abstract	Our experiment adapts several popular deep learning methods, as well as some traditional methods, to the problem of video emotion recognition. In our experiment, we use the CNN-LSTM architecture for visual information extraction and classification, and utilize traditional methods for audio feature classification. For multimodal fusion, we use the traditional Support Vector Machine. Our experiment yields a good result on the AFEW 6.0 Dataset.
Tasks	Emotion Recognition, Video Emotion Recognition
Published	2017-04-03
URL	http://arxiv.org/abs/1704.00570v3
PDF	http://arxiv.org/pdf/1704.00570v3.pdf
PWC	https://paperswithcode.com/paper/spatiotemporal-networks-for-video-emotion
Repo
Framework

Visual Discovery at Pinterest


Title	Visual Discovery at Pinterest
Authors	Andrew Zhai, Dmitry Kislyuk, Yushi Jing, Michael Feng, Eric Tzeng, Jeff Donahue, Yue Li Du, Trevor Darrell
Abstract	Over the past three years Pinterest has experimented with several visual search and recommendation services, including Related Pins (2014), Similar Looks (2015), Flashlight (2016) and Lens (2017). This paper presents an overview of our visual discovery engine powering these services, and shares the rationales behind our technical and product decisions such as the use of object detection and interactive user interfaces. We conclude that this visual discovery engine significantly improves engagement in both search and recommendation tasks.
Tasks	Object Detection
Published	2017-02-15
URL	http://arxiv.org/abs/1702.04680v2
PDF	http://arxiv.org/pdf/1702.04680v2.pdf
PWC	https://paperswithcode.com/paper/visual-discovery-at-pinterest
Repo
Framework

RGB-D Salient Object Detection Based on Discriminative Cross-modal Transfer Learning


Title	RGB-D Salient Object Detection Based on Discriminative Cross-modal Transfer Learning
Authors	Hao Chen, Y. F. Li, Dan Su
Abstract	In this work, we propose to utilize Convolutional Neural Networks to boost the performance of depth-induced salient object detection by capturing the high-level representative features for depth modality. We formulate the depth-induced saliency detection as a CNN-based cross-modal transfer problem to bridge the gap between the “data-hungry” nature of CNNs and the unavailability of sufficient labeled training data in depth modality. In the proposed approach, we leverage the auxiliary data from the source modality effectively by training the RGB saliency detection network to obtain the task-specific pre-understanding layers for the target modality. Meanwhile, we exploit the depth-specific information by pre-training a modality classification network that encourages modal-specific representations during the optimizing course. Thus, it could make the feature representations of the RGB and depth modalities as discriminative as possible. These two modules are pre-trained independently and then stitched to initialize and optimize the eventual depth-induced saliency detection model. Experiments demonstrate the effectiveness of the proposed novel pre-training strategy as well as the significant and consistent improvements of the proposed approach over other state-of-the-art methods.
Tasks	Object Detection, Saliency Detection, Salient Object Detection, Transfer Learning
Published	2017-03-01
URL	http://arxiv.org/abs/1703.00122v2
PDF	http://arxiv.org/pdf/1703.00122v2.pdf
PWC	https://paperswithcode.com/paper/rgb-d-salient-object-detection-based-on
Repo
Framework

Viewpoint Selection for Photographing Architectures


Title	Viewpoint Selection for Photographing Architectures
Authors	Jingwu He, Linbo Wang, Wenzhe Zhou, Hongjie Zhang, Xiufen Cui, Yanwen Guo
Abstract	This paper studies the problem of how to choose good viewpoints for taking photographs of architectures. We achieve this by learning from professional photographs of world famous landmarks that are available on the Internet. Unlike previous efforts devoted to photo quality assessment, which mainly rely on 2D image features, we show in this paper that combining 2D image features extracted from images with 3D geometric features computed on the 3D models can result in a more reliable evaluation of viewpoint quality. Specifically, we collect a set of photographs for each of 15 world famous architectures as well as their 3D models from the Internet. Viewpoint recovery for images is carried out through an image-model registration process, after which a newly proposed viewpoint clustering strategy is exploited to validate users’ viewpoint preferences when photographing landmarks. Finally, we extract a number of 2D and 3D features for each image based on multiple visual and geometric cues and perform viewpoint recommendation by learning from both 2D and 3D features using a specifically designed SVM-2K multi-view learner, achieving superior performance over using solely 2D or 3D features. We show the effectiveness of the proposed approach through extensive experiments. The experiments also demonstrate that our system can be used to recommend viewpoints for rendering textured 3D models of buildings for the use of architectural design, in addition to viewpoint evaluation of photographs and recommendation of viewpoints for photographing architectures in practice.
Tasks
Published	2017-03-06
URL	http://arxiv.org/abs/1703.01702v1
PDF	http://arxiv.org/pdf/1703.01702v1.pdf
PWC	https://paperswithcode.com/paper/viewpoint-selection-for-photographing
Repo
Framework

Generalization in Deep Learning


Title	Generalization in Deep Learning
Authors	Kenji Kawaguchi, Leslie Pack Kaelbling, Yoshua Bengio
Abstract	This paper provides non-vacuous and numerically-tight generalization guarantees for deep learning, as well as theoretical insights into why and how deep learning can generalize well, despite its large capacity, complexity, possible algorithmic instability, nonrobustness, and sharp minima, responding to an open question in the literature. We also propose new open problems and discuss the limitations of our results.
Tasks
Published	2017-10-16
URL	https://arxiv.org/abs/1710.05468v5
PDF	https://arxiv.org/pdf/1710.05468v5.pdf
PWC	https://paperswithcode.com/paper/generalization-in-deep-learning
Repo
Framework

Algorithmically probable mutations reproduce aspects of evolution such as convergence rate, genetic memory, and modularity


Title	Algorithmically probable mutations reproduce aspects of evolution such as convergence rate, genetic memory, and modularity
Authors	Santiago Hernández-Orozco, Narsis A. Kiani, Hector Zenil
Abstract	Natural selection explains how life has evolved over millions of years from more primitive forms. The speed at which this happens, however, has sometimes defied formal explanations when based on random (uniformly distributed) mutations. Here we investigate the application of a simplicity bias based on a natural but algorithmic distribution of mutations (no recombination) in various examples, particularly binary matrices, in order to compare evolutionary convergence rates. Results on both synthetic and small biological examples indicate an accelerated rate when mutations are not statistically uniform but "algorithmically uniform". We show that algorithmic distributions can evolve modularity and genetic memory by preservation of structures when they first occur, sometimes leading to an accelerated production of diversity but also to population extinctions, possibly explaining naturally occurring phenomena such as diversity explosions (e.g. the Cambrian) and massive extinctions (e.g. the End Triassic) whose causes are currently debated. The natural approach introduced here appears to be a better approximation to biological evolution than models based exclusively upon random uniform mutations, and it also approaches a formal version of open-ended evolution based on previous formal results. These results validate some suggestions in the direction that computation may be an equally important driver of evolution. We also show that applying the method to optimization problems, such as genetic algorithms, has the potential to accelerate convergence of artificial evolutionary algorithms.
Tasks
Published	2017-09-01
URL	http://arxiv.org/abs/1709.00268v8
PDF	http://arxiv.org/pdf/1709.00268v8.pdf
PWC	https://paperswithcode.com/paper/algorithmically-probable-mutations-reproduce
Repo
Framework

Improving Max-Sum through Decimation to Solve Loopy Distributed Constraint Optimization Problems


Title	Improving Max-Sum through Decimation to Solve Loopy Distributed Constraint Optimization Problems
Authors	Jesús Cerquides, Rémi Emonet, Gauthier Picard, Juan A. Rodríguez-Aguilar
Abstract	In the context of solving large distributed constraint optimization problems (DCOP), belief-propagation and approximate inference algorithms are candidates of choice. However, in general, when the factor graph is very loopy (i.e. cyclic), these solution methods suffer from bad performance, due to non-convergence and many exchanged messages. To improve the performance of the Max-Sum inference algorithm when solving loopy constraint optimization problems, we propose here to take inspiration from the belief-propagation-guided decimation used to solve sparse random graphs (k-satisfiability). We propose the novel DeciMaxSum method, which is parameterized in terms of policies to decide when to trigger decimation, which variables to decimate, and which values to assign to decimated variables. Based on an empirical evaluation on a classical BP benchmark (the Ising model), some of these combinations of policies exhibit better performance than state-of-the-art competitors.
Tasks
Published	2017-06-07
URL	http://arxiv.org/abs/1706.02209v1
PDF	http://arxiv.org/pdf/1706.02209v1.pdf
PWC	https://paperswithcode.com/paper/improving-max-sum-through-decimation-to-solve
Repo
Framework
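
The abstract above parameterizes decimation by three policies: when to trigger it, which variable to fix, and which value to assign. The sketch below illustrates one possible choice for the last two only. The "largest gap between best and second-best belief" rule, the belief table layout, and the function name are hypothetical, and the message passing that would normally run between decimations is omitted.

```python
def decimate_once(beliefs, fixed):
    """Fix the undecided variable whose belief is most decisive.

    beliefs: dict mapping variable -> dict mapping value -> score.
    fixed:   dict of already decimated variables, updated in place.
    Returns the variable that was fixed, or None if none is left.
    """
    best_var, best_gap = None, -1.0
    for var, scores in beliefs.items():
        if var in fixed:
            continue
        ranked = sorted(scores.values(), reverse=True)
        gap = ranked[0] - ranked[1] if len(ranked) > 1 else float("inf")
        if gap > best_gap:
            best_var, best_gap = var, gap
    if best_var is not None:
        scores = beliefs[best_var]
        # Value policy (assumed): assign the highest-scoring value.
        fixed[best_var] = max(scores, key=scores.get)
    return best_var


# Hypothetical beliefs for two binary variables.
beliefs = {"x1": {0: 0.2, 1: 0.9}, "x2": {0: 0.5, 1: 0.6}}
fixed = {}
first = decimate_once(beliefs, fixed)   # x1 has the larger gap
second = decimate_once(beliefs, fixed)  # x2 is the only variable left
```

In a complete solver, beliefs would be recomputed after each fixed variable, since fixing one variable changes the messages its neighbors receive.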

Predicting Role Relevance with Minimal Domain Expertise in a Financial Domain


Title	Predicting Role Relevance with Minimal Domain Expertise in a Financial Domain
Authors	Mayank Kejriwal
Abstract	Word embeddings have made enormous inroads in recent years in a wide variety of text mining applications. In this paper, we explore a word embedding-based architecture for predicting the relevance of a role between two financial entities within the context of natural language sentences. In this extended abstract, we propose a pooled approach that uses a collection of sentences to train word embeddings using the skip-gram word2vec architecture. We use the word embeddings to obtain context vectors that are assigned one or more labels based on manual annotations. We train a machine learning classifier using the labeled context vectors, and use the trained classifier to predict contextual role relevance on test data. Our approach serves as a good minimal-expertise baseline for the task as it is simple and intuitive, uses open-source modules, requires little feature crafting effort and performs well across roles.
Tasks	Word Embeddings
Published	2017-04-19
URL	http://arxiv.org/abs/1704.05571v1
PDF	http://arxiv.org/pdf/1704.05571v1.pdf
PWC	https://paperswithcode.com/paper/predicting-role-relevance-with-minimal-domain
Repo
Framework
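
The skip-gram architecture mentioned in the abstract above trains embeddings by predicting the words around each center word. The sketch below only generates the (center, context) training pairs for a fixed window. The example sentence and the window size are hypothetical, and the embedding training and the downstream classifier are not shown.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token within `window`."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical sentence mentioning two financial entities.
sentence = "acme acquired beta corp".split()
pairs = skipgram_pairs(sentence, window=1)
```

A larger window yields more pairs per word and tends to capture broader topical context, while a small window emphasizes immediate neighbors.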

Entanglement Entropy of Target Functions for Image Classification and Convolutional Neural Network


Title	Entanglement Entropy of Target Functions for Image Classification and Convolutional Neural Network
Authors	Ya-Hui Zhang
Abstract	The success of deep convolutional neural networks (CNNs) in computer vision, especially in image classification problems, calls for a new information theory for functions of images, instead of the images themselves. In this article, after establishing a deep mathematical connection between the image classification problem and the quantum spin model, we propose to use entanglement entropy, a generalization of classical Boltzmann-Shannon entropy, as a powerful tool to characterize the information needed to represent a general function of an image. We prove that there is a sub-volume-law bound for the entanglement entropy of target functions of reasonable image classification problems. Therefore, target functions of image classification only occupy a small subspace of the whole Hilbert space. As a result, a neural network with a polynomial number of parameters is efficient for representing such target functions. The concept of entanglement entropy can also be useful to characterize the expressive power of different neural networks. For example, we show that to maintain the same expressive power, the number of channels $D$ in a convolutional neural network should scale with the number of convolution layers $n_c$ as $D\sim D_0^{\frac{1}{n_c}}$. Therefore, a deeper CNN with large $n_c$ is more efficient than shallow ones.
Tasks	Image Classification
Published	2017-10-16
URL	http://arxiv.org/abs/1710.05520v1
PDF	http://arxiv.org/pdf/1710.05520v1.pdf
PWC	https://paperswithcode.com/paper/entanglement-entropy-of-target-functions-for
Repo
Framework
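
As a worked illustration of the scaling $D\sim D_0^{\frac{1}{n_c}}$ quoted in the abstract above, the block below plugs in a hypothetical $D_0 = 4096 = 2^{12}$ and treats the proportionality constant as 1. Both choices are assumptions made only to show how quickly the required channel count falls with depth.

```latex
\begin{align*}
n_c = 1 &: \quad D \sim 4096^{1/1} = 4096 \\
n_c = 2 &: \quad D \sim 4096^{1/2} = 64 \\
n_c = 3 &: \quad D \sim 4096^{1/3} = 16 \\
n_c = 4 &: \quad D \sim 4096^{1/4} = 8
\end{align*}
```

Under these assumptions, going from one convolution layer to four reduces the channel count needed for the same expressive power from 4096 to 8.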