Paper Group ANR 1035
Deep Neural Networks Learn Non-Smooth Functions Effectively. Dynamic Assortment Optimization with Changing Contextual Information. Stable Geodesic Update on Hyperbolic Space and its Application to Poincare Embeddings. Causal Explanation Analysis on Social Media. Joint Facade Registration and Segmentation for Urban Localization. Stochastic Variance- …
Deep Neural Networks Learn Non-Smooth Functions Effectively
Title | Deep Neural Networks Learn Non-Smooth Functions Effectively |
Authors | Masaaki Imaizumi, Kenji Fukumizu |
Abstract | We theoretically discuss why deep neural networks (DNNs) perform better than other models in some cases by investigating statistical properties of DNNs for non-smooth functions. While DNNs have empirically shown higher performance than other standard methods, understanding their mechanism is still a challenging problem. From the viewpoint of statistical theory, it is known that many standard methods attain the optimal rate of generalization errors for smooth functions in large sample asymptotics, and thus it has not been straightforward to find theoretical advantages of DNNs. This paper fills this gap by considering learning of a certain class of non-smooth functions, which was not covered by the previous theory. We derive the generalization error of estimators by DNNs with a ReLU activation, and show that convergence rates of the generalization by DNNs are almost optimal to estimate the non-smooth functions, while some of the popular models do not attain the optimal rate. In addition, our theoretical result provides guidelines for selecting an appropriate number of layers and edges of DNNs. We provide numerical experiments to support the theoretical results. |
Tasks | |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04474v2 |
http://arxiv.org/pdf/1802.04474v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-neural-networks-learn-non-smooth |
Repo | |
Framework | |
Dynamic Assortment Optimization with Changing Contextual Information
Title | Dynamic Assortment Optimization with Changing Contextual Information |
Authors | Xi Chen, Yining Wang, Yuan Zhou |
Abstract | In this paper, we study the dynamic assortment optimization problem under a finite selling season of length $T$. At each time period, the seller offers an arriving customer an assortment of substitutable products under a cardinality constraint, and the customer makes the purchase among offered products according to a discrete choice model. Most existing work associates each product with a real-valued fixed mean utility and assumes a multinomial logit choice (MNL) model. In many practical applications, feature/contextual information of products is readily available. In this paper, we incorporate the feature information by assuming a linear relationship between the mean utility and the feature. In addition, we allow the feature information of products to change over time so that the underlying choice model can also be non-stationary. To solve the dynamic assortment optimization under this changing contextual MNL model, we need to simultaneously learn the underlying unknown coefficient and make the decision on the assortment. To this end, we develop an upper confidence bound (UCB) based policy and establish the regret bound on the order of $\widetilde O(d\sqrt{T})$, where $d$ is the dimension of the feature and $\widetilde O$ suppresses logarithmic dependence. We further establish the lower bound $\Omega(d\sqrt{T}/K)$ where $K$ is the cardinality constraint of an offered assortment, which is usually small. When $K$ is a constant, our policy is optimal up to logarithmic factors. In the exploitation phase of the UCB algorithm, we need to solve a combinatorial optimization for assortment optimization based on the learned information. We further develop an approximation algorithm and an efficient greedy heuristic. The effectiveness of the proposed policy is further demonstrated by our numerical studies. |
Tasks | Combinatorial Optimization |
Published | 2018-10-31 |
URL | http://arxiv.org/abs/1810.13069v2 |
http://arxiv.org/pdf/1810.13069v2.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-assortment-optimization-with-changing |
Repo | |
Framework | |
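The contextual MNL model described in the abstract above can be illustrated with a minimal sketch. It assumes the common convention that the no-purchase option has utility 0; the function names, the coefficient vector `theta`, and the example features and revenues are hypothetical and are not taken from the paper.

```python
import math

def mnl_choice_probabilities(features, theta):
    """Purchase probability of each offered product, plus the no-purchase probability.

    Each product's mean utility is assumed linear in its feature vector: u_i = x_i . theta.
    """
    utilities = [sum(x * t for x, t in zip(f, theta)) for f in features]
    weights = [math.exp(u) for u in utilities]
    denom = 1.0 + sum(weights)  # the 1.0 is the no-purchase option (utility 0, an assumption)
    return [w / denom for w in weights], 1.0 / denom

def expected_revenue(features, revenues, theta):
    """Expected revenue of offering the assortment described by `features`."""
    probs, _ = mnl_choice_probabilities(features, theta)
    return sum(r * p for r, p in zip(revenues, probs))

# Hypothetical example: three products with two features each.
theta = [0.5, -0.2]
assortment = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
revenues = [1.0, 0.8, 0.6]
probs, p_none = mnl_choice_probabilities(assortment, theta)
```

A UCB-style policy would replace `theta` with an optimistic estimate before searching over assortments of size at most $K$; that search is the combinatorial step the abstract refers to and is not shown here.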
Stable Geodesic Update on Hyperbolic Space and its Application to Poincare Embeddings
Title | Stable Geodesic Update on Hyperbolic Space and its Application to Poincare Embeddings |
Authors | Yosuke Enokida, Atsushi Suzuki, Kenji Yamanishi |
Abstract | A hyperbolic space has been shown to be more capable of modeling complex networks than a Euclidean space. This paper proposes an explicit update rule along geodesics in a hyperbolic space. The convergence of our algorithm is theoretically guaranteed, and the convergence rate is better than the conventional Euclidean gradient descent algorithm. Moreover, our algorithm avoids the “bias” problem of existing methods using the Riemannian gradient. Experimental results demonstrate the good performance of our algorithm in the Poincaré embeddings of knowledge base data. |
Tasks | |
Published | 2018-05-26 |
URL | http://arxiv.org/abs/1805.10487v1 |
http://arxiv.org/pdf/1805.10487v1.pdf | |
PWC | https://paperswithcode.com/paper/stable-geodesic-update-on-hyperbolic-space |
Repo | |
Framework | |
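To make the idea of an update along geodesics concrete, here is a minimal sketch using the standard exponential map of the Poincaré ball with curvature -1. It is not necessarily the exact update rule of the paper; all function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mobius_add(x, y):
    """Mobius addition in the Poincare ball (curvature -1)."""
    xy, xx, yy = dot(x, y), dot(x, x), dot(y, y)
    denom = 1.0 + 2.0 * xy + xx * yy
    a = (1.0 + 2.0 * xy + yy) / denom
    b = (1.0 - xx) / denom
    return [a * xi + b * yi for xi, yi in zip(x, y)]

def exp_map(x, v):
    """Follow the geodesic that starts at x with initial tangent vector v."""
    norm_v = math.sqrt(dot(v, v))
    if norm_v == 0.0:
        return list(x)
    lam = 2.0 / (1.0 - dot(x, x))  # conformal factor at x
    scale = math.tanh(lam * norm_v / 2.0) / norm_v
    return mobius_add(x, [scale * vi for vi in v])

def poincare_distance(x, y):
    """Hyperbolic distance between two points of the ball."""
    diff = [a - b for a, b in zip(x, y)]
    arg = 1.0 + 2.0 * dot(diff, diff) / ((1.0 - dot(x, x)) * (1.0 - dot(y, y)))
    return math.acosh(arg)

def riemannian_step(x, euclid_grad, lr):
    """One descent step: rescale the Euclidean gradient, then move along the geodesic."""
    lam = 2.0 / (1.0 - dot(x, x))
    rgrad = [g / (lam * lam) for g in euclid_grad]  # Riemannian gradient
    return exp_map(x, [-lr * g for g in rgrad])
```

Because the step follows the exponential map, the new point lies inside the unit ball in exact arithmetic, and its hyperbolic distance from `x` equals the Riemannian norm of the step. For very large steps, floating-point rounding can still place the result on the boundary.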
Causal Explanation Analysis on Social Media
Title | Causal Explanation Analysis on Social Media |
Authors | Youngseo Son, Nipun Bayas, H. Andrew Schwartz |
Abstract | Understanding causal explanations - reasons given for happenings in one’s life - has been found to be an important psychological factor linked to physical and mental health. Causal explanations are often studied through manual identification of phrases over limited samples of personal writing. Automatic identification of causal explanations in social media, while challenging in relying on contextual and sequential cues, offers a larger-scale alternative to expensive manual ratings and opens the door for new applications (e.g. studying prevailing beliefs about causes, such as climate change). Here, we explore automating causal explanation analysis, building on discourse parsing, and presenting two novel subtasks: causality detection (determining whether a causal explanation exists at all) and causal explanation identification (identifying the specific phrase that is the explanation). We achieve strong accuracies for both tasks but find different approaches best: an SVM for causality prediction (F1 = 0.791) and a hierarchy of Bidirectional LSTMs for causal explanation identification (F1 = 0.853). Finally, we explore applications of our complete pipeline (F1 = 0.868), showing demographic differences in mentions of causal explanation and that the association between a word and sentiment can change when it is used within a causal explanation. |
Tasks | |
Published | 2018-09-04 |
URL | http://arxiv.org/abs/1809.01202v2 |
http://arxiv.org/pdf/1809.01202v2.pdf | |
PWC | https://paperswithcode.com/paper/causal-explanation-analysis-on-social-media |
Repo | |
Framework | |
Joint Facade Registration and Segmentation for Urban Localization
Title | Joint Facade Registration and Segmentation for Urban Localization |
Authors | Antoine Fond, Marie-Odile Berger, Gilles Simon |
Abstract | This paper presents an efficient approach for jointly solving facade registration and semantic segmentation. Progress in facade detection and recognition enables good initialization for the registration of a reference facade to a newly acquired target image. We propose here to rely on semantic segmentation to improve the accuracy of that initial registration. Simultaneously we aim to improve the quality of the semantic segmentation through the registration. These two problems are jointly solved in an Expectation-Maximization framework. In particular, we introduce a Bayesian model that uses prior semantic segmentation as well as the geometric structure of the facade reference modeled by $L_p$ Gaussian Mixtures. We show the advantages of our method in terms of robustness to clutter and change of illumination on urban images from various databases. |
Tasks | Semantic Segmentation |
Published | 2018-11-25 |
URL | http://arxiv.org/abs/1811.10048v2 |
http://arxiv.org/pdf/1811.10048v2.pdf | |
PWC | https://paperswithcode.com/paper/joint-facade-registration-and-segmentation |
Repo | |
Framework | |
Stochastic Variance-Reduced Policy Gradient
Title | Stochastic Variance-Reduced Policy Gradient |
Authors | Matteo Papini, Damiano Binaghi, Giuseppe Canonaco, Matteo Pirotta, Marcello Restelli |
Abstract | In this paper, we propose a novel reinforcement-learning algorithm consisting in a stochastic variance-reduced version of policy gradient for solving Markov Decision Processes (MDPs). Stochastic variance-reduced gradient (SVRG) methods have proven to be very successful in supervised learning. However, their adaptation to policy gradient is not straightforward and needs to account for I) a non-concave objective function; II) approximations in the full gradient computation; and III) a non-stationary sampling process. The result is SVRPG, a stochastic variance-reduced policy gradient algorithm that leverages on importance weights to preserve the unbiasedness of the gradient estimate. Under standard assumptions on the MDP, we provide convergence guarantees for SVRPG with a convergence rate that is linear under increasing batch sizes. Finally, we suggest practical variants of SVRPG, and we empirically evaluate them on continuous MDPs. |
Tasks | |
Published | 2018-06-14 |
URL | http://arxiv.org/abs/1806.05618v1 |
http://arxiv.org/pdf/1806.05618v1.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-variance-reduced-policy-gradient |
Repo | |
Framework | |
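The abstract above builds on SVRG from supervised learning. The sketch below shows plain SVRG on a hypothetical one-parameter least-squares problem, not SVRPG itself: it omits the importance weights and the policy-gradient setting. All names and hyperparameters are illustrative assumptions.

```python
import random

def grad_i(w, x, y):
    """Gradient of the per-sample loss 0.5 * (w * x - y) ** 2 with respect to w."""
    return (w * x - y) * x

def full_gradient(w, xs, ys):
    """Average gradient over the whole dataset."""
    return sum(grad_i(w, x, y) for x, y in zip(xs, ys)) / len(xs)

def svrg_estimate(w, w_snap, mu, x, y):
    """Variance-reduced estimate: sample gradient, corrected by the snapshot gradient."""
    return grad_i(w, x, y) - grad_i(w_snap, x, y) + mu

def svrg(xs, ys, w0, lr, epochs, inner_steps, seed):
    rng = random.Random(seed)
    w = w0
    for _ in range(epochs):
        w_snap = w                          # snapshot of the current iterate
        mu = full_gradient(w_snap, xs, ys)  # full gradient at the snapshot
        for _ in range(inner_steps):
            i = rng.randrange(len(xs))
            w -= lr * svrg_estimate(w, w_snap, mu, xs[i], ys[i])
    return w
```

Averaged over all samples, `svrg_estimate` equals the full gradient at `w`, so the estimate is unbiased under uniform sampling. SVRPG keeps this structure but, per the abstract, adds importance weights because the sampling distribution changes with the policy.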
Universal features of mountain ridge networks on Earth
Title | Universal features of mountain ridge networks on Earth |
Authors | Rafał Rak, Jarosław Kwapień, Paweł Oświęcimka, Paweł Zięba, Stanisław Drożdż |
Abstract | Compared to the heavily studied surface drainage systems, the mountain ridge systems have been a subject of less attention even on the empirical level, despite the fact that their structure is richer. To reduce this deficiency, we analyze different mountain ranges by means of a network approach and grasp some essential features of the ridge branching structure. We also employ a fractal analysis as it is especially suitable for describing properties of rough objects and surfaces. As our approach differs from typical analyses that are carried out in geophysics, we believe that it can initiate a research direction that will help shed more light on the processes that are responsible for landscape formation and will contribute to network theory by indicating a need for the construction of new models of network growth, as no existing model can properly describe the ridge formation. We also believe that certain features of our study can offer help in cartographic generalization. Specifically, we study the structure of the ridge networks based on the empirical elevation data collected by SRTM. We consider mountain ranges from different geological periods and geographical locations. For each mountain range, we construct a simple topographic network representation (the ridge junctions are nodes) and a ridge representation (the ridges are nodes and the junctions are edges) and calculate the parameters characterizing their topology. We observe that the topographic networks inherit the fractal structure of the mountain ranges but do not show any other complex features. In contrast, the ridge networks, while lacking the proper fractality, reveal the power-law degree distributions with the exponent $1.6\le \beta \le 1.7$. By taking into account the fact that the analyzed mountains differ in many properties, these values seem to be universal for the earthly mountainous terrain. |
Tasks | |
Published | 2018-04-10 |
URL | https://arxiv.org/abs/1804.03457v3 |
https://arxiv.org/pdf/1804.03457v3.pdf | |
PWC | https://paperswithcode.com/paper/universal-features-of-mountain-ridge-patterns |
Repo | |
Framework | |
Facial Aging and Rejuvenation by Conditional Multi-Adversarial Autoencoder with Ordinal Regression
Title | Facial Aging and Rejuvenation by Conditional Multi-Adversarial Autoencoder with Ordinal Regression |
Authors | Haiping Zhu, Qi Zhou, Junping Zhang, James Z. Wang |
Abstract | Facial aging and facial rejuvenation analyze a given face photograph to predict a future look or estimate a past look of the person. To achieve this, it is critical to preserve human identity and the corresponding aging progression and regression with high accuracy. However, existing methods cannot simultaneously handle these two objectives well. We propose a novel generative adversarial network based approach, named the Conditional Multi-Adversarial AutoEncoder with Ordinal Regression (CMAAE-OR). It utilizes an age estimation technique to control the aging accuracy and takes a high-level feature representation to preserve personalized identity. Specifically, the face is first mapped to a latent vector through a convolutional encoder. The latent vector is then projected onto the face manifold conditional on the age through a deconvolutional generator. The latent vector preserves personalized face features and the age controls facial aging and rejuvenation. A discriminator and an ordinal regression are imposed on the encoder and the generator in tandem, making the generated face images more photorealistic while simultaneously exhibiting desirable aging effects. Besides, a high-level feature representation is utilized to preserve personalized identity of the generated face. Experiments on two benchmark datasets demonstrate appealing performance of the proposed method over the state-of-the-art. |
Tasks | Age Estimation |
Published | 2018-04-08 |
URL | http://arxiv.org/abs/1804.02740v1 |
http://arxiv.org/pdf/1804.02740v1.pdf | |
PWC | https://paperswithcode.com/paper/facial-aging-and-rejuvenation-by-conditional |
Repo | |
Framework | |
Learning Combinations of Activation Functions
Title | Learning Combinations of Activation Functions |
Authors | Franco Manessi, Alessandro Rozza |
Abstract | In the last decade, an active area of research has been devoted to designing novel activation functions that help deep neural networks converge and obtain better performance. The training procedure of these architectures usually involves optimization of the weights of their layers only, while non-linearities are generally pre-specified and their (possible) parameters are usually considered as hyper-parameters to be tuned manually. In this paper, we introduce two approaches to automatically learn different combinations of base activation functions (such as the identity function, ReLU, and tanh) during the training phase. We present a thorough comparison of our novel approaches with well-known architectures (such as LeNet-5, AlexNet, and ResNet-56) on three standard datasets (Fashion-MNIST, CIFAR-10, and ILSVRC-2012), showing substantial improvements in the overall performance, such as an increase in the top-1 accuracy for AlexNet on ILSVRC-2012 of 3.01 percentage points. |
Tasks | |
Published | 2018-01-29 |
URL | http://arxiv.org/abs/1801.09403v3 |
http://arxiv.org/pdf/1801.09403v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-combinations-of-activation-functions |
Repo | |
Framework | |
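One plausible way to combine base activation functions with learnable weights is a convex combination, sketched below. The abstract does not specify the two approaches, so the softmax parameterization, the function names, and the choice of base set are assumptions made for illustration only.

```python
import math

def relu(x):
    return max(0.0, x)

def identity(x):
    return x

# Base activations named in the abstract: identity, ReLU, tanh.
BASE = [identity, relu, math.tanh]

def softmax(logits):
    """Map unconstrained parameters to non-negative weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def combined_activation(x, logits):
    """Convex combination of the base activations; `logits` would be trained with the network."""
    weights = softmax(logits)
    return sum(w * f(x) for w, f in zip(weights, BASE))
```

With equal logits the result is the plain average of the three base activations. Pushing one logit far above the others recovers that single activation, so a network can fall back to an ordinary ReLU if training favors it.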
Pyramid Attention Network for Semantic Segmentation
Title | Pyramid Attention Network for Semantic Segmentation |
Authors | Hanchao Li, Pengfei Xiong, Jie An, Lingxue Wang |
Abstract | A Pyramid Attention Network (PAN) is proposed to exploit the impact of global contextual information in semantic segmentation. Different from most existing works, we combine an attention mechanism and a spatial pyramid to extract precise dense features for pixel labeling instead of complicated dilated convolution and artificially designed decoder networks. Specifically, we introduce a Feature Pyramid Attention module that applies a spatial pyramid attention structure to the high-level output and combines it with global pooling to learn a better feature representation, and a Global Attention Upsample module on each decoder layer to provide global context as guidance for low-level features to select category localization details. The proposed approach achieves state-of-the-art performance on the PASCAL VOC 2012 and Cityscapes benchmarks, with a new record mIoU accuracy of 84.0% on PASCAL VOC 2012, while training without the COCO dataset. |
Tasks | Semantic Segmentation |
Published | 2018-05-25 |
URL | http://arxiv.org/abs/1805.10180v3 |
http://arxiv.org/pdf/1805.10180v3.pdf | |
PWC | https://paperswithcode.com/paper/pyramid-attention-network-for-semantic |
Repo | |
Framework | |
An Additive Approximation to Multiplicative Noise
Title | An Additive Approximation to Multiplicative Noise |
Authors | Ruanui Nicholson, Jari P. Kaipio |
Abstract | Multiplicative noise models are often used instead of additive noise models in cases in which the noise variance depends on the state. Furthermore, when Poisson distributions with relatively small counts are approximated with normal distributions, multiplicative noise approximations are straightforward to implement. There are a number of limitations in existing approaches to marginalize over multiplicative errors, such as positivity of the multiplicative noise term. The focus in this paper is on large-dimensional (inverse) problems for which sampling-type approaches have too high computational complexity. In this paper, we propose an alternative approach to carry out approximative marginalization over the multiplicative error by embedding the statistics in an additive error term. The approach is essentially a Bayesian one in that the statistics of the additive error are induced by the statistics of the other unknowns. As an example, we consider a deconvolution problem on random fields with different statistics of the multiplicative noise. Furthermore, the approach allows for correlated multiplicative noise. We show that the proposed approach provides feasible error estimates in the sense that the posterior models support the actual image. |
Tasks | |
Published | 2018-05-07 |
URL | http://arxiv.org/abs/1805.02344v1 |
http://arxiv.org/pdf/1805.02344v1.pdf | |
PWC | https://paperswithcode.com/paper/an-additive-approximation-to-multiplicative |
Repo | |
Framework | |
Light Weight Color Image Warping with Inter-Channel Information
Title | Light Weight Color Image Warping with Inter-Channel Information |
Authors | Chuangye Zhang, Yan Niu, Tieru Wu, Ximing Li |
Abstract | Image warping is a necessary step in many multimedia applications such as texture mapping, image-based rendering, panorama stitching, image resizing, and optical flow computation. Traditionally, color image warping interpolation is performed in each color channel independently. In this paper, we show that the warping quality can be significantly enhanced by exploiting the cross-channel correlation. We design a warping scheme that integrates intra-channel interpolation with cross-channel variation at very low computational cost, which is required for interactive multimedia applications on mobile devices. The effectiveness and efficiency of our method are validated by extensive experiments. |
Tasks | Optical Flow Estimation |
Published | 2018-12-19 |
URL | http://arxiv.org/abs/1812.07763v1 |
http://arxiv.org/pdf/1812.07763v1.pdf | |
PWC | https://paperswithcode.com/paper/light-weight-color-image-warping-with-inter |
Repo | |
Framework | |
“Ge Shu Zhi Zhi”: Towards Deep Understanding about Worlds
Title | “Ge Shu Zhi Zhi”: Towards Deep Understanding about Worlds |
Authors | Baogang Hu, Weiming Dong |
Abstract | “Ge Shu Zhi Zhi” is a novel saying in Chinese, stated as “To investigate things from the underlying principle(s) and to acquire knowledge in the form of mathematical representations”. The saying is adopted and modified based on the ideas from the Eastern and Western philosophers. This position paper discusses the saying in the background of artificial intelligence (AI). Some related subjects, such as the ultimate goals of AI and two levels of knowledge representations, are discussed from the perspective of machine learning. A case study on objective evaluations over multiple attributes, a typical problem in the field of social computing, is given to support the saying for wide applications. A methodology of meta rules is proposed for examining the objectiveness of the evaluations. The possible problems of the saying are also presented. |
Tasks | |
Published | 2018-12-19 |
URL | https://arxiv.org/abs/1901.01834v3 |
https://arxiv.org/pdf/1901.01834v3.pdf | |
PWC | https://paperswithcode.com/paper/ge-shu-zhi-zhi-towards-deep-understanding |
Repo | |
Framework | |
On Learning Sparsely Used Dictionaries from Incomplete Samples
Title | On Learning Sparsely Used Dictionaries from Incomplete Samples |
Authors | Thanh V. Nguyen, Akshay Soni, Chinmay Hegde |
Abstract | Most existing algorithms for dictionary learning assume that all entries of the (high-dimensional) input data are fully observed. However, in several practical applications (such as hyper-spectral imaging or blood glucose monitoring), only an incomplete fraction of the data entries may be available. For incomplete settings, no provably correct and polynomial-time algorithm has been reported in the dictionary learning literature. In this paper, we provide provable approaches for learning - from incomplete samples - a family of dictionaries whose atoms have sufficiently “spread-out” mass. First, we propose a descent-style iterative algorithm that linearly converges to the true dictionary when provided a sufficiently coarse initial estimate. Second, we propose an initialization algorithm that utilizes a small number of extra fully observed samples to produce such a coarse initial estimate. Finally, we theoretically analyze their performance and provide asymptotic statistical and computational guarantees. |
Tasks | Dictionary Learning |
Published | 2018-04-24 |
URL | http://arxiv.org/abs/1804.09217v1 |
http://arxiv.org/pdf/1804.09217v1.pdf | |
PWC | https://paperswithcode.com/paper/on-learning-sparsely-used-dictionaries-from |
Repo | |
Framework | |
Multiple Combined Constraints for Image Stitching
Title | Multiple Combined Constraints for Image Stitching |
Authors | Kai Chen, Jingmin Tu, Binbin Xiang, Li Li, Jian Yao |
Abstract | Several approaches to image stitching use different constraints to estimate the motion model between image pairs. These constraints can be roughly divided into two categories: geometric constraints and photometric constraints. In this paper, geometric and photometric constraints are combined to improve the alignment quality, based on the observation that these two kinds of constraints are complementary. On the one hand, geometric constraints (e.g., point and line correspondences) are usually spatially biased and are insufficient in some extreme scenes, while photometric constraints are always evenly and densely distributed. On the other hand, photometric constraints are sensitive to displacements and are not suitable for images with large parallaxes, while geometric constraints are usually imposed by feature matching and are more robust to parallaxes. The proposed method therefore combines them in an efficient mesh-based image warping framework. It achieves better alignment quality than methods that use only geometric constraints, and can handle larger parallax than photometric-constraint-based methods. Experimental results on various images illustrate that the proposed method outperforms representative state-of-the-art image stitching methods reported in the literature. |
Tasks | Image Stitching |
Published | 2018-09-18 |
URL | http://arxiv.org/abs/1809.06706v1 |
http://arxiv.org/pdf/1809.06706v1.pdf | |
PWC | https://paperswithcode.com/paper/multiple-combined-constraints-for-image |
Repo | |
Framework | |