October 17, 2019

3047 words 15 mins read

Paper Group ANR 917

Double Quantization for Communication-Efficient Distributed Optimization. Geodesic Convex Optimization: Differentiation on Manifolds, Geodesics, and Convexity. Effects of Image Degradations to CNN-based Image Classification. A Survey on Natural Language Processing for Fake News Detection. Learning Multiple Defaults for Machine Learning Algorithms. …

Double Quantization for Communication-Efficient Distributed Optimization

Title Double Quantization for Communication-Efficient Distributed Optimization
Authors Yue Yu, Jiaxiang Wu, Longbo Huang
Abstract Modern distributed training of machine learning models suffers from high communication overhead for synchronizing stochastic gradients and model parameters. In this paper, to reduce the communication complexity, we propose double quantization, a general scheme for quantizing both model parameters and gradients. Three communication-efficient algorithms are proposed under this general scheme. Specifically, (i) we propose a low-precision algorithm, AsyLPG, with asynchronous parallelism; (ii) we explore integrating gradient sparsification with double quantization and develop Sparse-AsyLPG; (iii) we show that double quantization can also be accelerated by the momentum technique and design an accelerated AsyLPG. We establish rigorous performance guarantees for the algorithms, and conduct experiments on a multi-server test-bed to demonstrate that our algorithms can effectively save transmitted bits without performance degradation.
Tasks Distributed Optimization, Quantization
Published 2018-05-25
URL https://arxiv.org/abs/1805.10111v4
PDF https://arxiv.org/pdf/1805.10111v4.pdf
PWC https://paperswithcode.com/paper/double-quantization-for-communication
Repo
Framework
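
The core building block behind schemes like this is an unbiased low-precision quantizer applied to whatever gets transmitted, here both gradients (worker to server) and parameters (server to worker). The sketch below is a generic stochastic quantizer, not the exact AsyLPG quantizer or bit allocation from the paper; the function name and level count are illustrative assumptions.

```python
import numpy as np

def stochastic_quantize(v, num_levels=16):
    """Unbiased stochastic quantization of a vector onto num_levels evenly
    spaced magnitudes in [0, max|v|]; signs are kept exactly.  A generic
    low-precision building block, not the paper's exact quantizer."""
    scale = np.max(np.abs(v))
    if scale == 0.0:
        return v.copy()
    normalized = np.abs(v) / scale * (num_levels - 1)      # in [0, num_levels - 1]
    lower = np.floor(normalized)
    prob_up = normalized - lower                           # round up w.p. the fractional part
    quantized = lower + (np.random.rand(*v.shape) < prob_up)
    return np.sign(v) * quantized / (num_levels - 1) * scale

np.random.seed(0)
grad, params = np.random.randn(1000), np.random.randn(1000)
# "double quantization": compress both what workers send (gradients)
# and what the server broadcasts (model parameters)
q_grad, q_params = stochastic_quantize(grad), stochastic_quantize(params)
print(abs(np.mean(q_grad - grad)))   # near 0: the quantizer is unbiased
```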

Geodesic Convex Optimization: Differentiation on Manifolds, Geodesics, and Convexity

Title Geodesic Convex Optimization: Differentiation on Manifolds, Geodesics, and Convexity
Authors Nisheeth K. Vishnoi
Abstract Convex optimization is a vibrant and successful area due to the existence of a variety of efficient algorithms that leverage the rich structure provided by convexity. Convexity of a smooth set or a function in a Euclidean space is defined by how it interacts with the standard differential structure in this space: the Hessian of a convex function has to be positive semi-definite everywhere. However, in recent years there has been a growing demand to understand non-convexity and to develop computational methods to optimize non-convex functions. Intriguingly, there is a type of non-convexity that disappears once one introduces a suitable differentiable structure and redefines convexity with respect to the “straight lines”, or geodesics, of this structure. Such convexity is referred to as geodesic convexity. Interest in studying it arises due to recent reformulations of some non-convex problems as geodesically convex optimization problems over geodesically convex sets. Geodesics on manifolds have been extensively studied in various branches of mathematics and physics. However, unlike convex optimization, understanding geodesics and geodesic convexity from a computational point of view largely remains a mystery. The goal of this exposition is to introduce the first part of geodesic convex optimization, namely geodesic convexity, in a self-contained manner. We first present a variety of notions from differential and Riemannian geometry, such as differentiation on manifolds and geodesics, and then introduce geodesic convexity. We conclude by showing that certain non-convex optimization problems, such as computing the Brascamp-Lieb constant and the operator scaling problem, have geodesically convex formulations.
Tasks
Published 2018-06-17
URL http://arxiv.org/abs/1806.06373v1
PDF http://arxiv.org/pdf/1806.06373v1.pdf
PWC https://paperswithcode.com/paper/geodesic-convex-optimization-differentiation
Repo
Framework
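
A one-dimensional example may help make geodesic convexity tangible. The sketch below is my own toy, not from the exposition: on the manifold (0, inf) with metric g(x) = 1/x^2, geodesics are exponential curves, and f(x) = (log x)^2 is not convex in the Euclidean sense yet is geodesically convex, so Riemannian gradient descent converges to its minimizer x = 1.

```python
import numpy as np

# Manifold: M = (0, inf) with metric g(x) = 1/x^2 (pullback of the Euclidean
# metric under y = log x).  Geodesics: x(t) = x * exp(t * v / x).
# f(x) = (log x)^2 is NOT Euclidean-convex (f''(x) < 0 for x > e) but it IS
# geodesically convex: along geodesics it is the parabola t -> (log x + t*v/x)^2.

fprime = lambda x: 2.0 * np.log(x) / x

x = 50.0
print("Euclidean f'' at x=50:", (2 - 2 * np.log(50.0)) / 50.0**2)   # negative

eta = 0.25
for _ in range(20):
    riem_grad = x**2 * fprime(x)          # Riemannian gradient: g(x)^{-1} * f'(x)
    x = x * np.exp(-eta * riem_grad / x)  # exponential map Exp_x(-eta * grad)
print("iterate:", x, "(minimizer is x = 1)")
```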

Effects of Image Degradations to CNN-based Image Classification

Title Effects of Image Degradations to CNN-based Image Classification
Authors Yanting Pei, Yaping Huang, Qi Zou, Hao Zang, Xingyuan Zhang, Song Wang
Abstract Just like many other topics in computer vision, image classification has achieved significant progress recently by using deep-learning neural networks, especially Convolutional Neural Networks (CNN). Most existing works focus on classifying very clear natural images, as evidenced by the widely used image databases such as Caltech-256, PASCAL VOC and ImageNet. However, in many real applications, the acquired images may contain certain degradations that lead to various kinds of blurring, noise, and distortions. One important and interesting problem is the effect of such degradations on the performance of CNN-based image classification. More specifically, we ask whether image-classification performance drops with each kind of degradation, whether this drop can be avoided by including degraded images in training, and whether existing computer vision algorithms that attempt to remove such degradations can help improve image-classification performance. In this paper, we empirically study this problem for four kinds of degraded images: hazy images, underwater images, motion-blurred images and fish-eye images. For this study, we synthesize a large number of such degraded images by applying respective physical models to clear natural images, and we collect a new hazy image dataset from the Internet. We expect this work to draw more interest from the community toward studying the classification of degraded images.
Tasks Image Classification
Published 2018-10-12
URL http://arxiv.org/abs/1810.05552v1
PDF http://arxiv.org/pdf/1810.05552v1.pdf
PWC https://paperswithcode.com/paper/effects-of-image-degradations-to-cnn-based
Repo
Framework
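
To make the experimental protocol concrete, the sketch below shows one of the degradation models in its simplest form: a linear motion-blur kernel applied to a clean image. The evaluation steps in the comments (classifier, test sets) are placeholders for the paper's protocol of comparing a CNN's accuracy on clean versus degraded copies; they are assumptions, not the authors' code.

```python
import numpy as np
from scipy.ndimage import convolve

def motion_blur(img, length=9, horizontal=True):
    """Simple linear motion-blur degradation applied per channel.
    (The paper also studies haze, underwater and fish-eye models.)"""
    kernel = np.zeros((length, length), dtype=np.float32)
    if horizontal:
        kernel[length // 2, :] = 1.0 / length
    else:
        kernel[:, length // 2] = 1.0 / length
    return np.stack([convolve(img[..., c], kernel, mode="nearest")
                     for c in range(img.shape[-1])], axis=-1)

clean = np.random.rand(224, 224, 3).astype(np.float32)   # stand-in for a real image
blurred = motion_blur(clean)
# hypothetical evaluation mirroring the study's protocol:
# acc_clean   = evaluate(cnn, clean_test_images)
# acc_blurred = evaluate(cnn, [motion_blur(x) for x in clean_test_images])
```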

A Survey on Natural Language Processing for Fake News Detection

Title A Survey on Natural Language Processing for Fake News Detection
Authors Ray Oshikawa, Jing Qian, William Yang Wang
Abstract Fake news detection is a critical yet challenging problem in Natural Language Processing (NLP). The rapid rise of social networking platforms has not only yielded a vast increase in information accessibility but has also accelerated the spread of fake news. Thus, the effect of fake news has been growing, sometimes extending to the offline world and threatening public safety. Given the massive amount of Web content, automatic fake news detection is a practical NLP problem useful to all online content providers, reducing the human time and effort needed to detect and prevent the spread of fake news. In this paper, we describe the challenges involved in fake news detection as well as related tasks. We systematically review and compare the task formulations, datasets and NLP solutions that have been developed for this task, and also discuss their potential and limitations. Based on our insights, we outline promising research directions, including more fine-grained, detailed, fair, and practical detection models. We also highlight the differences between fake news detection and other related tasks, and the importance of NLP solutions for fake news detection.
Tasks Fake News Detection
Published 2018-11-02
URL https://arxiv.org/abs/1811.00770v2
PDF https://arxiv.org/pdf/1811.00770v2.pdf
PWC https://paperswithcode.com/paper/a-survey-on-natural-language-processing-for
Repo
Framework

Learning Multiple Defaults for Machine Learning Algorithms

Title Learning Multiple Defaults for Machine Learning Algorithms
Authors Florian Pfisterer, Jan N. van Rijn, Philipp Probst, Andreas Müller, Bernd Bischl
Abstract The performance of modern machine learning methods highly depends on their hyperparameter configurations. One simple way of selecting a configuration is to use default settings, often proposed along with the publication and implementation of a new algorithm. Those default values are usually chosen in an ad-hoc manner to work well enough on a wide variety of datasets. To address this problem, different automatic hyperparameter configuration algorithms have been proposed, which select an optimal configuration per dataset. This principled approach usually improves performance, but adds algorithmic complexity and computational costs to the training procedure. As an alternative, we propose learning a set of complementary default values from a large database of prior empirical results. Selecting an appropriate configuration on a new dataset then requires only a simple, efficient and embarrassingly parallel search over this set. We demonstrate the effectiveness and efficiency of the proposed approach in comparison to random search and Bayesian Optimization.
Tasks
Published 2018-11-23
URL http://arxiv.org/abs/1811.09409v1
PDF http://arxiv.org/pdf/1811.09409v1.pdf
PWC https://paperswithcode.com/paper/learning-multiple-defaults-for-machine
Repo
Framework
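
The sketch below illustrates one natural way to realize the idea: given a matrix of prior results (datasets x candidate configurations), greedily pick configurations so that the best of the chosen ones is, on average, as good as possible. Function names and the exact objective are illustrative assumptions; the paper's search procedure may differ.

```python
import numpy as np

def greedy_defaults(perf, k=5):
    """Greedily select k complementary default configurations from a
    database of prior results.  perf[d, c] = normalized performance of
    configuration c on dataset d (higher is better)."""
    n_datasets, n_configs = perf.shape
    chosen, best_so_far = [], np.full(n_datasets, -np.inf)
    for _ in range(k):
        # score = average per-dataset best if configuration c were added
        scores = [np.mean(np.maximum(best_so_far, perf[:, c]))
                  if c not in chosen else -np.inf
                  for c in range(n_configs)]
        c_star = int(np.argmax(scores))
        chosen.append(c_star)
        best_so_far = np.maximum(best_so_far, perf[:, c_star])
    return chosen

perf = np.random.rand(30, 200)          # 30 datasets x 200 candidate configurations
defaults = greedy_defaults(perf, k=5)   # on a new dataset, just try these 5 (in parallel)
print(defaults)
```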

Cross-domain Recommendation via Deep Domain Adaptation

Title Cross-domain Recommendation via Deep Domain Adaptation
Authors Heishiro Kanagawa, Hayato Kobayashi, Nobuyuki Shimizu, Yukihiro Tagami, Taiji Suzuki
Abstract The behavior of users of certain services can be a clue for inferring their preferences and may be used to make recommendations for other services they have never used. However, the cross-domain relationships between items and user consumption patterns are not simple, especially when there are few or no common users and items across domains. To address this problem, we propose a content-based cross-domain recommendation method for cold-start users that does not require user and item overlap. We formulate recommendation as extreme multi-class classification, where labels (items) corresponding to the users are predicted. With this formulation, the problem is reduced to a domain adaptation setting, in which a classifier trained in the source domain is adapted to the target domain. For this, we construct a neural network that combines an architecture for domain adaptation, the Domain Separation Network, with a denoising autoencoder for item representation. We assess the performance of our approach in experiments on a pair of data sets collected from movie and news services of Yahoo! JAPAN and show that our approach outperforms several baseline methods, including a cross-domain collaborative filtering method.
Tasks Denoising, Domain Adaptation
Published 2018-03-08
URL http://arxiv.org/abs/1803.03018v1
PDF http://arxiv.org/pdf/1803.03018v1.pdf
PWC https://paperswithcode.com/paper/cross-domain-recommendation-via-deep-domain
Repo
Framework
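
As a small illustration of one component, the PyTorch sketch below shows a denoising autoencoder over item content features, whose bottleneck would serve as the item representation. The full model additionally couples this with a Domain Separation Network and an extreme multi-class (user-to-item) classifier, which are omitted here; dimensions and names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    """Corrupt the item feature vector, reconstruct it, and reuse the
    bottleneck as the item representation (sketch only)."""
    def __init__(self, dim_in, dim_hidden=128, noise=0.3):
        super().__init__()
        self.noise = noise
        self.encoder = nn.Sequential(nn.Linear(dim_in, dim_hidden), nn.ReLU())
        self.decoder = nn.Linear(dim_hidden, dim_in)

    def forward(self, x):
        corrupted = x * (torch.rand_like(x) > self.noise).float()  # dropout-style corruption
        z = self.encoder(corrupted)
        return self.decoder(z), z

dae = DenoisingAutoencoder(dim_in=300)
item_features = torch.randn(64, 300)          # e.g. content features of 64 items
recon, item_repr = dae(item_features)
loss = nn.functional.mse_loss(recon, item_features)
loss.backward()
```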

Vanlearning: A Machine Learning SaaS Application for People Without Programming Backgrounds

Title Vanlearning: A Machine Learning SaaS Application for People Without Programming Backgrounds
Authors Chaochen Wu
Abstract Although there are many machine learning tools for analyzing data, most of them require users to have some programming background. Here we introduce a SaaS application that allows users to analyze their data without any coding and even without any knowledge of machine learning. Users can upload, train, predict and download their data with simple mouse clicks. Our system uses a data pre-processor and validator to relieve the computational load on our server. The simple architecture of Vanlearning helps developers easily maintain and extend it.
Tasks
Published 2018-04-03
URL http://arxiv.org/abs/1804.01382v1
PDF http://arxiv.org/pdf/1804.01382v1.pdf
PWC https://paperswithcode.com/paper/vanlearning-a-machine-learning-saas
Repo
Framework

Cube Padding for Weakly-Supervised Saliency Prediction in 360° Videos

Title Cube Padding for Weakly-Supervised Saliency Prediction in 360° Videos
Authors Hsien-Tzu Cheng, Chun-Hung Chao, Jin-Dong Dong, Hao-Kai Wen, Tyng-Luh Liu, Min Sun
Abstract Automatic saliency prediction in 360° videos is critical for viewpoint guidance applications (e.g., Facebook 360 Guide). We propose a spatial-temporal network which is (1) trained in a weakly-supervised manner and (2) tailor-made for the 360° viewing sphere. Note that most existing methods are less scalable since they rely on annotated saliency maps for training. Most importantly, they convert the 360° sphere to 2D images (e.g., a single equirectangular image or multiple separate Normal Field-of-View (NFoV) images), which introduces distortion and image boundaries. In contrast, we propose a simple and effective Cube Padding (CP) technique as follows. First, we render the 360° view on the six faces of a cube using perspective projection, which introduces very little distortion. Then, we concatenate all six faces while utilizing the connectivity between faces on the cube for image padding (i.e., Cube Padding) in convolution, pooling, and convolutional LSTM layers. In this way, CP introduces no image boundary while being applicable to almost all Convolutional Neural Network (CNN) structures. To evaluate our method, we propose Wild-360, a new 360° video saliency dataset containing challenging videos with saliency heatmap annotations. In experiments, our method outperforms baseline methods in both speed and quality.
Tasks Saliency Prediction
Published 2018-06-04
URL http://arxiv.org/abs/1806.01320v1
PDF http://arxiv.org/pdf/1806.01320v1.pdf
PWC https://paperswithcode.com/paper/cube-padding-for-weakly-supervised-saliency
Repo
Framework
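
The essence of Cube Padding is to replace zero padding with pixels borrowed from adjacent cube faces. The sketch below shows only the easy part of that idea, horizontal padding around the ring of the four side faces, where no rotation is needed; padding that involves the top and bottom faces requires per-face rotations and is omitted. This is a simplified illustration under those assumptions, not the paper's implementation.

```python
import torch

def ring_cube_pad(faces, pad=1):
    """faces: [4, C, H, W], the four side faces of a cube map ordered so
    that face i's right edge adjoins face (i+1)'s left edge (mod 4).
    Each face borrows its left/right border columns from its neighbours
    instead of using zero padding."""
    left_neigh = torch.roll(faces, shifts=1, dims=0)    # face to the left in the ring
    right_neigh = torch.roll(faces, shifts=-1, dims=0)  # face to the right in the ring
    left_border = left_neigh[..., :, -pad:]             # rightmost columns of the left neighbour
    right_border = right_neigh[..., :, :pad]            # leftmost columns of the right neighbour
    return torch.cat([left_border, faces, right_border], dim=-1)   # [4, C, H, W + 2*pad]

faces = torch.randn(4, 3, 64, 64)
padded = ring_cube_pad(faces)     # follow with a conv using padding=(1, 0)
print(padded.shape)               # torch.Size([4, 3, 64, 66])
```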

Incorporating Relevant Knowledge in Context Modeling and Response Generation

Title Incorporating Relevant Knowledge in Context Modeling and Response Generation
Authors Yanran Li, Wenjie Li, Ziqiang Cao, Chengyao Chen
Abstract To sustain engaging conversation, it is critical for chatbots to make good use of relevant knowledge. Equipped with a knowledge base, chatbots are able to extract conversation-related attributes and entities to facilitate context modeling and response generation. In this work, we distinguish the uses of attributes and entities and incorporate them into the encoder-decoder architecture in different manners. Based on the augmented architecture, our chatbot, named Mike, is able to generate responses by referring to proper entities from the collected knowledge. To validate the proposed approach, we build a movie conversation corpus on which the proposed approach significantly outperforms four other knowledge-grounded models.
Tasks Chatbot
Published 2018-11-09
URL http://arxiv.org/abs/1811.03729v1
PDF http://arxiv.org/pdf/1811.03729v1.pdf
PWC https://paperswithcode.com/paper/incorporating-relevant-knowledge-in-context
Repo
Framework

How Important Is a Neuron?

Title How Important Is a Neuron?
Authors Kedar Dhamdhere, Mukund Sundararajan, Qiqi Yan
Abstract The problem of attributing a deep network’s prediction to its input/base features is well-studied. We introduce the notion of conductance to extend the notion of attribution to understanding the importance of hidden units. Informally, the conductance of a hidden unit of a deep network is the flow of attribution via this hidden unit. We use conductance to understand the importance of a hidden unit to the prediction for a specific input, or over a set of inputs. We evaluate the effectiveness of conductance in multiple ways, including theoretical properties, ablation studies, and a feature selection task. The empirical evaluations are done using the Inception network over ImageNet data, and a sentiment analysis network over reviews. In both cases, we demonstrate the effectiveness of conductance in identifying interesting insights about the internal workings of these networks.
Tasks Feature Selection, Sentiment Analysis
Published 2018-05-30
URL http://arxiv.org/abs/1805.12233v1
PDF http://arxiv.org/pdf/1805.12233v1.pdf
PWC https://paperswithcode.com/paper/how-important-is-a-neuron
Repo
Framework
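
Since conductance has a clean definition, a tiny worked example is easy to write down: for a one-hidden-layer ReLU network, approximate the path integral with a Riemann sum and check the completeness property (the conductances of all hidden units sum to the change in the output along the path). The network and names below are my own toy setup, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, w2 = rng.normal(size=(8, 4)), rng.normal(size=8)   # tiny 1-hidden-layer ReLU net
relu = lambda z: np.maximum(z, 0.0)
F = lambda x: w2 @ relu(W1 @ x)

def conductance(x, baseline, steps=200):
    """cond_j ~= sum over path points of (dF/dh_j) * (dh_j/dalpha) / steps,
    along the straight path from baseline to x (the Integrated Gradients path)."""
    diff = x - baseline
    cond = np.zeros(W1.shape[0])
    for alpha in (np.arange(steps) + 0.5) / steps:
        pre_act = W1 @ (baseline + alpha * diff)
        dF_dh = w2                                  # output is linear in the hidden layer
        dh_dalpha = (pre_act > 0) * (W1 @ diff)     # directional derivative of each hidden unit
        cond += dF_dh * dh_dalpha / steps
    return cond

x, baseline = rng.normal(size=4), np.zeros(4)
cond = conductance(x, baseline)
print(cond.sum(), F(x) - F(baseline))   # completeness: the two should (approximately) match
```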

Learning From Less Data: Diversified Subset Selection and Active Learning in Image Classification Tasks

Title Learning From Less Data: Diversified Subset Selection and Active Learning in Image Classification Tasks
Authors Vishal Kaushal, Anurag Sahoo, Khoshrav Doctor, Narasimha Raju, Suyash Shetty, Pankaj Singh, Rishabh Iyer, Ganesh Ramakrishnan
Abstract State-of-the-art supervised computer vision techniques are in general data hungry and pose the twin challenges of inadequate computing resources and the high cost of human labeling. Training-data subset selection and active learning techniques have been proposed as possible solutions to these challenges, respectively. A special class of subset selection functions naturally models notions of diversity, coverage and representation; such functions can be used to eliminate redundancy and thus lend themselves well to training-data subset selection. They can also improve the efficiency of active learning, further reducing human labeling effort by selecting a subset of the examples obtained using conventional uncertainty-sampling techniques. In this work we empirically demonstrate the effectiveness of two diversity models, namely the Facility-Location and Disparity-Min models, for training-data subset selection and reducing labeling effort. We do this for a variety of computer vision tasks including Gender Recognition, Scene Recognition and Object Recognition. Our results show that subset selection done in the right way can add 2-3% in accuracy over existing baselines, particularly when training data is scarce. This allows complex machine learning models (like Convolutional Neural Networks) to be trained with much less data while incurring minimal performance loss.
Tasks Active Learning, Image Classification, Object Recognition, Scene Recognition
Published 2018-05-28
URL http://arxiv.org/abs/1805.11191v1
PDF http://arxiv.org/pdf/1805.11191v1.pdf
PWC https://paperswithcode.com/paper/learning-from-less-data-diversified-subset
Repo
Framework
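
One of the two diversity models is easy to show in full: the Facility-Location function f(S) = sum_i max_{j in S} sim(i, j) is monotone submodular, so plain greedy selection enjoys the usual (1 - 1/e) approximation guarantee. The sketch below runs greedy maximization over cosine similarities of (assumed) CNN features; feature extraction and the Disparity-Min variant are left out.

```python
import numpy as np

def facility_location_greedy(sim, budget):
    """Greedy maximization of f(S) = sum_i max_{j in S} sim[i, j],
    a submodular measure of how well subset S represents the full set."""
    n = sim.shape[0]
    selected, coverage = [], np.zeros(n)
    for _ in range(budget):
        # marginal gain of adding each candidate j
        gains = np.maximum(coverage[:, None], sim).sum(axis=0) - coverage.sum()
        gains[selected] = -np.inf
        j = int(np.argmax(gains))
        selected.append(j)
        coverage = np.maximum(coverage, sim[:, j])
    return selected

feats = np.random.rand(500, 32)                        # stand-in for CNN features of 500 images
feats /= np.linalg.norm(feats, axis=1, keepdims=True)
sim = feats @ feats.T                                  # cosine similarities
subset = facility_location_greedy(sim, budget=50)      # label / train on only these 50
```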

Estimation from Non-Linear Observations via Convex Programming with Application to Bilinear Regression

Title Estimation from Non-Linear Observations via Convex Programming with Application to Bilinear Regression
Authors Sohail Bahmani
Abstract We propose a computationally efficient estimator, formulated as a convex program, for a broad class of non-linear regression problems that involve difference of convex (DC) non-linearities. The proposed method can be viewed as a significant extension of the “anchored regression” method formulated and analyzed in [10] for regression with convex non-linearities. Our main assumption, in addition to other mild statistical and computational assumptions, is availability of a certain approximation oracle for the average of the gradients of the observation functions at a ground truth. Under this assumption and using a PAC-Bayesian analysis we show that the proposed estimator produces an accurate estimate with high probability. As a concrete example, we study the proposed framework in the bilinear regression problem with Gaussian factors and quantify a sufficient sample complexity for exact recovery. Furthermore, we describe a computationally tractable scheme that provably produces the required approximation oracle in the considered bilinear regression problem.
Tasks
Published 2018-06-19
URL http://arxiv.org/abs/1806.07307v2
PDF http://arxiv.org/pdf/1806.07307v2.pdf
PWC https://paperswithcode.com/paper/estimation-from-non-linear-observations-via
Repo
Framework
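
For orientation, the sketch below solves the plain anchored-regression convex program that this paper generalizes: with convex observation functions f_i(x) = (a_i^T x)^2, maximize <anchor, x> subject to f_i(x) <= y_i. The anchor is simulated here as a noisy version of the ground truth; the difference-of-convex extension and the approximation oracle from the paper are not shown.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, m = 20, 120
x_true = rng.normal(size=n)
A = rng.normal(size=(m, n))
y = (A @ x_true) ** 2                       # noiseless quadratic observations

anchor = x_true + 0.5 * rng.normal(size=n)  # imperfect but correlated anchor direction
x = cp.Variable(n)
problem = cp.Problem(cp.Maximize(anchor @ x), [cp.square(A @ x) <= y])
problem.solve()
print(np.linalg.norm(x.value - x_true))     # typically tiny: x_true is recovered
```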

Machine Learning Promoting Extreme Simplification of Spectroscopy Equipment

Title Machine Learning Promoting Extreme Simplification of Spectroscopy Equipment
Authors Jianchao Lee, Qiannan Duan, Sifan Bi, Ruen Luo, Yachao Lian, Hanqiang Liu, Ruixing Tian, Jiayuan Chen, Guodong Ma, Jinhong Gao, Zhaoyi Xu
Abstract Spectroscopic measurement is one of the main pathways for exploring and understanding nature. Today, rapidly advancing artificial intelligence seems poised to remould how it is practiced. The algorithms contained in large neural networks are capable of substituting for many of the expensive and complex components of spectroscopic instruments. In this work, we present a machine learning strategy for the measurement of absorbance curves, and we provide initial verification that an exceedingly simplified piece of equipment is sufficient to meet the needs of this strategy. Further, given its simplicity, the setup is expected to spread into many scientific areas in versatile forms.
Tasks
Published 2018-08-06
URL https://arxiv.org/abs/1808.03679v2
PDF https://arxiv.org/pdf/1808.03679v2.pdf
PWC https://paperswithcode.com/paper/machine-learning-promoting-extreme
Repo
Framework

Deep Attentional Structured Representation Learning for Visual Recognition

Title Deep Attentional Structured Representation Learning for Visual Recognition
Authors Krishna Kanth Nakka, Mathieu Salzmann
Abstract Structured representations, such as Bags of Words, VLAD and Fisher Vectors, have proven highly effective to tackle complex visual recognition tasks. As such, they have recently been incorporated into deep architectures. However, while effective, the resulting deep structured representation learning strategies typically aggregate local features from the entire image, ignoring the fact that, in complex recognition tasks, some regions provide much more discriminative information than others. In this paper, we introduce an attentional structured representation learning framework that incorporates an image-specific attention mechanism within the aggregation process. Our framework learns to predict jointly the image class label and an attention map in an end-to-end fashion and without any other supervision than the target label. As evidenced by our experiments, this consistently outperforms attention-less structured representation learning and yields state-of-the-art results on standard scene recognition and fine-grained categorization benchmarks.
Tasks Representation Learning, Scene Recognition
Published 2018-05-14
URL http://arxiv.org/abs/1805.05389v1
PDF http://arxiv.org/pdf/1805.05389v1.pdf
PWC https://paperswithcode.com/paper/deep-attentional-structured-representation
Repo
Framework
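
The central mechanism is easy to sketch: predict an image-specific attention map from the local convolutional features and use it to weight those features before aggregation. The PyTorch snippet below uses a plain weighted average as the aggregator for brevity; the paper instead plugs the attention into structured aggregators (BoW/VLAD/Fisher-Vector style), trained end-to-end from the class label alone. Layer shapes and names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AttentionalPooling(nn.Module):
    """Attention-weighted aggregation of local CNN features (sketch)."""
    def __init__(self, channels, num_classes):
        super().__init__()
        self.attn = nn.Conv2d(channels, 1, kernel_size=1)   # 1x1 conv -> attention logits
        self.classifier = nn.Linear(channels, num_classes)

    def forward(self, feat):                                 # feat: [B, C, H, W]
        b, c, h, w = feat.shape
        a = torch.softmax(self.attn(feat).view(b, -1), dim=1).view(b, 1, h, w)
        pooled = (feat * a).flatten(2).sum(dim=2)            # weighted aggregation -> [B, C]
        return self.classifier(pooled), a                    # logits + attention map

model = AttentionalPooling(channels=512, num_classes=67)     # e.g. 67 scene categories
logits, attention_map = model(torch.randn(8, 512, 14, 14))
```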

An Accelerated Approach to Safely and Efficiently Test Pre-Production Autonomous Vehicles on Public Streets

Title An Accelerated Approach to Safely and Efficiently Test Pre-Production Autonomous Vehicles on Public Streets
Authors Mansur Arief, Peter Glynn, Ding Zhao
Abstract Various automobile and mobility companies, for instance Ford, Uber and Waymo, are currently testing their pre-production autonomous vehicle (AV) fleets on public roads. However, due to the rarity of safety-critical cases and the effectively unlimited number of possible traffic scenarios, these on-road testing efforts have been acknowledged as tedious, costly, and risky. In this study, we propose an Accelerated Deployment framework to safely and efficiently estimate AV performance on public streets. We show that by appropriately addressing the gradual accuracy improvement and adaptively selecting meaningful and safe environments under which the AV is deployed, the proposed framework yields highly accurate estimates with much faster evaluation times and, more importantly, lower deployment risk. Our findings provide an answer to the currently heated and active discussions on how to properly test AV performance on public roads so as to achieve a safe, efficient, and statistically reliable testing framework for AV technologies.
Tasks Autonomous Vehicles
Published 2018-05-05
URL http://arxiv.org/abs/1805.02114v2
PDF http://arxiv.org/pdf/1805.02114v2.pdf
PWC https://paperswithcode.com/paper/an-accelerated-approach-to-safely-and
Repo
Framework
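
The statistical idea behind accelerated on-road evaluation can be illustrated with a toy rare-event estimate: sample harsher scenarios more often than they occur naturally and re-weight with the likelihood ratio so the failure-rate estimate stays unbiased. The numbers and the exponential "headway gap" model below are invented for illustration; the paper's Accelerated Deployment procedure additionally adapts the scenario selection online and controls deployment risk, which this sketch does not model.

```python
import numpy as np

rng = np.random.default_rng(0)
fail = lambda gap: gap < 0.2            # toy safety metric: "failure" if the headway gap < 0.2

N = 100_000
# naive on-road testing: gaps drawn from the nominal traffic model, Exponential(mean=20)
naive = fail(rng.exponential(20.0, N)).mean()

# accelerated testing: sample a harsher distribution, Exponential(mean=0.5),
# and re-weight each sample by the likelihood ratio p_nominal / p_test
gaps = rng.exponential(0.5, N)
weights = (np.exp(-gaps / 20.0) / 20.0) / (np.exp(-gaps / 0.5) / 0.5)
accelerated = (fail(gaps) * weights).mean()

print(naive, accelerated, 1 - np.exp(-0.2 / 20.0))   # both estimate P(failure) ~ 0.00995
```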