October 20, 2019

Paper Group ANR 91

The Influence of Down-Sampling Strategies on SVD Word Embedding Stability. Scale-invariant Feature Extraction of Neural Network and Renormalization Group Flow. An analysis of training and generalization errors in shallow and deep networks. Deep Reinforcement Learning to Acquire Navigation Skills for Wheel-Legged Robots in Complex Environments. Inte …

The Influence of Down-Sampling Strategies on SVD Word Embedding Stability


Title	The Influence of Down-Sampling Strategies on SVD Word Embedding Stability
Authors	Johannes Hellrich, Bernd Kampe, Udo Hahn
Abstract	The stability of word embedding algorithms, i.e., the consistency of the word representations they reveal when trained repeatedly on the same data set, has recently raised concerns. We here compare word embedding algorithms on three corpora of different sizes, and evaluate both their stability and accuracy. We find strong evidence that down-sampling strategies (used as part of their training procedures) are particularly influential for the stability of SVD_PPMI-type embeddings. This finding seems to explain diverging reports on their stability and leads us to a simple modification which provides superior stability as well as accuracy on par with skip-gram embeddings.
Tasks	Word Embeddings
Published	2018-08-21
URL	http://arxiv.org/abs/1808.06810v2
PDF	http://arxiv.org/pdf/1808.06810v2.pdf
PWC	https://paperswithcode.com/paper/downsampling-strategies-are-crucial-for-word
Repo
Framework
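
The PPMI half of an SVD-over-PPMI embedding can be illustrated with a small sketch. This is not the authors' code: it computes positive pointwise mutual information from raw co-occurrence counts only, and it leaves out both the down-sampling strategies the paper studies and the SVD step. The function name `ppmi` and the toy counts are assumptions made for illustration.

```python
import math
from collections import Counter


def ppmi(pair_counts):
    """Map each (word, context) pair to its positive PMI score.

    pair_counts: dict of (word, context) -> co-occurrence count.
    """
    total = sum(pair_counts.values())
    word_totals = Counter()
    ctx_totals = Counter()
    for (word, ctx), n in pair_counts.items():
        word_totals[word] += n
        ctx_totals[ctx] += n

    scores = {}
    for (word, ctx), n in pair_counts.items():
        # PMI = log( P(w, c) / (P(w) * P(c)) ), written with raw counts.
        pmi = math.log(n * total / (word_totals[word] * ctx_totals[ctx]))
        # PPMI clips negative associations to zero.
        scores[(word, ctx)] = max(0.0, pmi)
    return scores


# Toy counts (hypothetical): "the" co-occurs with everything equally.
counts = {
    ("cat", "purrs"): 4,
    ("cat", "the"): 4,
    ("dog", "barks"): 4,
    ("dog", "the"): 4,
}
scores = ppmi(counts)
```

In this toy example the uninformative context "the" gets a score of zero, while "purrs" is positively associated with "cat". A full pipeline would factorize the resulting sparse matrix with a truncated SVD.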

Scale-invariant Feature Extraction of Neural Network and Renormalization Group Flow


Title	Scale-invariant Feature Extraction of Neural Network and Renormalization Group Flow
Authors	Satoshi Iso, Shotaro Shiba, Sumito Yokoo
Abstract	A theoretical understanding of how a deep neural network (DNN) extracts features from input images is still lacking, but it is widely believed that the extraction is performed hierarchically through a process of coarse-graining. This is reminiscent of the basic concept of the renormalization group (RG) in statistical physics. In order to explore possible relations between DNN and RG, we use the restricted Boltzmann machine (RBM) applied to the Ising model and construct a flow of model parameters (in particular, temperature) generated by the RBM. We show that the unsupervised RBM trained by spin configurations at various temperatures from $T=0$ to $T=6$ generates a flow along which the temperature approaches the critical value $T_c=2.27$. This behavior is opposite to the typical RG flow of the Ising model. By analyzing various properties of the weight matrices of the trained RBM, we discuss why it flows towards $T_c$ and how the RBM learns to extract features of spin configurations.
Tasks
Published	2018-01-22
URL	http://arxiv.org/abs/1801.07172v1
PDF	http://arxiv.org/pdf/1801.07172v1.pdf
PWC	https://paperswithcode.com/paper/scale-invariant-feature-extraction-of-neural
Repo
Framework

An analysis of training and generalization errors in shallow and deep networks


Title	An analysis of training and generalization errors in shallow and deep networks
Authors	Hrushikesh Mhaskar, Tomaso Poggio
Abstract	This paper is motivated by an open problem around deep networks, namely, the apparent absence of over-fitting despite large over-parametrization which allows perfect fitting of the training data. In this paper, we analyze this phenomenon in the case of regression problems when each unit evaluates a periodic activation function. We argue that the minimal expected value of the square loss is inappropriate to measure the generalization error in approximation of compositional functions in order to take full advantage of the compositional structure. Instead, we measure the generalization error in the sense of maximum loss, and sometimes, as a pointwise error. We give estimates on exactly how many parameters ensure both zero training error as well as a good generalization error. We prove that a solution of a regularization problem is guaranteed to yield a good training error as well as a good generalization error and estimate how much error to expect at which test data.
Tasks
Published	2018-02-17
URL	https://arxiv.org/abs/1802.06266v4
PDF	https://arxiv.org/pdf/1802.06266v4.pdf
PWC	https://paperswithcode.com/paper/an-analysis-of-training-and-generalization
Repo
Framework

Deep Reinforcement Learning to Acquire Navigation Skills for Wheel-Legged Robots in Complex Environments


Title	Deep Reinforcement Learning to Acquire Navigation Skills for Wheel-Legged Robots in Complex Environments
Authors	Xi Chen, Ali Ghadirzadeh, John Folkesson, Patric Jensfelt
Abstract	Mobile robot navigation in complex and dynamic environments is a challenging but important problem. Reinforcement learning approaches fail to solve these tasks efficiently due to reward sparsities, temporal complexities and high-dimensionality of sensorimotor spaces which are inherent in such problems. We present a novel approach to train action policies to acquire navigation skills for wheel-legged robots using deep reinforcement learning. The policy maps height-map image observations to motor commands to navigate to a target position while avoiding obstacles. We propose to acquire the multifaceted navigation skill by learning and exploiting a number of manageable navigation behaviors. We also introduce a domain randomization technique to improve the versatility of the training samples. We demonstrate experimentally a significant improvement in terms of data-efficiency, success rate, robustness against irrelevant sensory data, and also the quality of the maneuver skills.
Tasks	Legged Robots, Robot Navigation
Published	2018-04-27
URL	http://arxiv.org/abs/1804.10500v1
PDF	http://arxiv.org/pdf/1804.10500v1.pdf
PWC	https://paperswithcode.com/paper/deep-reinforcement-learning-to-acquire
Repo
Framework

Interpreting Deep Learning: The Machine Learning Rorschach Test?


Title	Interpreting Deep Learning: The Machine Learning Rorschach Test?
Authors	Adam S. Charles
Abstract	Theoretical understanding of deep learning is one of the most important tasks facing the statistics and machine learning communities. While deep neural networks (DNNs) originated as engineering methods and models of biological networks in neuroscience and psychology, they have quickly become a centerpiece of the machine learning toolbox. Unfortunately, DNN adoption, powered by recent successes combined with the open-source nature of the machine learning community, has outpaced our theoretical understanding. We cannot reliably identify when and why DNNs will make mistakes. While in some applications like text translation these mistakes may be comical and provide fun fodder for research talks, a single error can be very costly in tasks like medical imaging. As we utilize DNNs in increasingly sensitive applications, a better understanding of their properties is thus imperative. Recent advances in DNN theory are numerous and include many different sources of intuition, such as learning theory, sparse signal analysis, physics, chemistry, and psychology. An interesting pattern begins to emerge in the breadth of possible interpretations. The seemingly limitless approaches are mostly constrained by the lens with which the mathematical operations are viewed. Ultimately, the interpretation of DNNs appears to mimic a type of Rorschach test, a psychological test wherein subjects interpret a series of seemingly ambiguous ink-blots. Validation for DNN theory requires a convergence of the literature. We must distinguish between universal results that are invariant to the analysis perspective and those that are specific to a particular network configuration. Simultaneously we must deal with the fact that many standard statistical tools for quantifying generalization or empirically assessing important network features are difficult to apply to DNNs.
Tasks
Published	2018-06-01
URL	http://arxiv.org/abs/1806.00148v1
PDF	http://arxiv.org/pdf/1806.00148v1.pdf
PWC	https://paperswithcode.com/paper/interpreting-deep-learning-the-machine
Repo
Framework

Fully automated primary particle size analysis of agglomerates on transmission electron microscopy images via artificial neural networks


Title	Fully automated primary particle size analysis of agglomerates on transmission electron microscopy images via artificial neural networks
Authors	Max Frei, Frank Einar Kruis
Abstract	There is a high demand for fully automated methods for the analysis of primary particle size distributions of agglomerates on transmission electron microscopy images. Therefore, a novel method, based on the utilization of artificial neural networks, was proposed, implemented and validated. The training of the artificial neural networks requires large quantities (up to several hundreds of thousands) of transmission electron microscopy images of agglomerates consisting of primary particles with known sizes. Since the manual evaluation of such large amounts of transmission electron microscopy images is not feasible, a synthesis of lifelike transmission electron microscopy images as training data was implemented. The proposed method can compete with state-of-the-art automated imaging particle size methods like the Hough transformation, ultimate erosion and watershed transformation and is in some cases even able to outperform these methods. It is however still outperformed by the manual analysis.
Tasks
Published	2018-06-08
URL	http://arxiv.org/abs/1806.04010v1
PDF	http://arxiv.org/pdf/1806.04010v1.pdf
PWC	https://paperswithcode.com/paper/fully-automated-primary-particle-size
Repo
Framework

A Mixture of Expert Approach for Low-Cost Customization of Deep Neural Networks


Title	A Mixture of Expert Approach for Low-Cost Customization of Deep Neural Networks
Authors	Boyu Zhang, Azadeh Davoodi, Yu-Hen Hu
Abstract	The ability to customize a trained Deep Neural Network (DNN) locally using user-specific data may greatly enhance user experiences, reduce development costs, and protect user privacy. In this work, we propose to incorporate a novel Mixture of Experts (MOE) approach to accomplish this goal. This architecture comprises a Global Expert (GE), a Local Expert (LE) and a Gating Network (GN). The GE is a trained DNN developed on a large training dataset representative of many potential users. After deployment on an embedded edge device, GE will be subject to customized, user-specific data (e.g., accent in speech) and its performance may suffer. This problem may be alleviated by training a local DNN (the local expert, LE) on a small customized training dataset to correct the errors made by GE. A gating network is then trained to determine whether an incoming sample should be handled by GE or LE. Since the customized dataset is in general very small, the cost of training LE and GN would be much lower than that of re-training GE. The training of LE and GN can thus be performed on the local device, properly protecting the privacy of the customized training data. In this work, we developed a prototype MOE architecture for a handwritten alphanumeric character recognition task. We use EMNIST as the generic dataset, LeNet5 as GE, and handwritings of 10 users as the customized dataset. We show that with the LE and GN, the classification accuracy is significantly enhanced over the customized dataset with almost no degradation of accuracy over the generic dataset. In terms of energy and network size, the overhead of LE and GN is around 2.5% compared to those of GE.
Tasks
Published	2018-10-31
URL	http://arxiv.org/abs/1811.00056v1
PDF	http://arxiv.org/pdf/1811.00056v1.pdf
PWC	https://paperswithcode.com/paper/a-mixture-of-expert-approach-for-low-cost
Repo
Framework
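
The GE/LE/GN routing described in the abstract can be sketched as a hard switch between two experts. This is only an illustration under stated assumptions: the function `route`, the 0.5 threshold, and the three toy callables are hypothetical stand-ins, not the trained networks from the paper.

```python
def route(sample, global_expert, local_expert, gate, threshold=0.5):
    """Return (prediction, expert_name) for one sample.

    gate(sample) is assumed to return a score in [0, 1]; scores above
    the threshold send the sample to the local expert (LE), all others
    go to the global expert (GE).
    """
    use_local = gate(sample) > threshold
    expert = local_expert if use_local else global_expert
    return expert(sample), ("LE" if use_local else "GE")


# Toy stand-ins (hypothetical): real experts and gate would be trained models.
def toy_global_expert(x):
    return round(x)


def toy_local_expert(x):
    # Pretend the local expert corrects a systematic error on user data.
    return round(x) + 1


def toy_gate(x):
    # Pretend large inputs look like user-specific data.
    return 0.9 if x > 10 else 0.1


generic_result = route(3.2, toy_global_expert, toy_local_expert, toy_gate)
custom_result = route(12.4, toy_global_expert, toy_local_expert, toy_gate)
```

Only the local expert and the gate would need training on the device in such a setup; the global expert stays frozen.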

Auto-Context R-CNN


Title	Auto-Context R-CNN
Authors	Bo Li, Tianfu Wu, Lun Zhang, Rufeng Chu
Abstract	Region-based convolutional neural networks (R-CNN), such as Fast R-CNN, Faster R-CNN, and Mask R-CNN, have largely dominated object detection. Operators defined on RoIs (Regions of Interest), such as RoIPooling (Fast R-CNN) and RoIAlign (Mask R-CNN), play an important role in R-CNNs. They all utilize only information inside RoIs for RoI prediction, even with their recent deformable extensions. Although surrounding context is well known for its importance in object detection, it has yet to be integrated in R-CNNs in a flexible and effective way. Inspired by the auto-context work and the multi-class object layout work, this paper presents a generic context-mining RoI operator (i.e., RoICtxMining) seamlessly integrated in R-CNNs, and the resulting object detection system, termed Auto-Context R-CNN, is trained end-to-end. The proposed RoICtxMining operator is a simple yet effective two-layer extension of the RoIPooling or RoIAlign operator. Centered at an object-RoI, it creates a $3\times 3$ layout to mine contextual information adaptively in the $8$ surrounding context regions on-the-fly. Within each of the $8$ context regions, a context-RoI is mined in terms of discriminative power and its RoIPooling / RoIAlign features are concatenated with the object-RoI for final prediction. The proposed Auto-Context R-CNN is robust to occlusion and small objects, and shows promising vulnerability for adversarial attacks without being adversarially-trained. In experiments, it is evaluated using RoIPooling as the backbone and shows competitive results on Pascal VOC, Microsoft COCO, and KITTI datasets (including $6.9\%$ mAP improvements over the R-FCN method on COCO test-dev dataset and the first place on both KITTI pedestrian and cyclist detection as of this submission).
Tasks	Object Detection
Published	2018-07-08
URL	http://arxiv.org/abs/1807.02842v1
PDF	http://arxiv.org/pdf/1807.02842v1.pdf
PWC	https://paperswithcode.com/paper/auto-context-r-cnn
Repo
Framework
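
The $3\times 3$ layout around an object-RoI can be sketched with plain box arithmetic. The sketch assumes each context cell has the same width and height as the RoI, which the abstract does not state; the function name `context_regions` is hypothetical, and clipping to image bounds and the actual context-RoI mining are omitted.

```python
def context_regions(roi):
    """Return the 8 boxes surrounding an RoI in a 3x3 grid.

    roi is (x1, y1, x2, y2). Cells are listed row by row, top-left
    first, and the centre cell (the RoI itself) is skipped.
    """
    x1, y1, x2, y2 = roi
    width, height = x2 - x1, y2 - y1
    regions = []
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue  # centre cell is the object-RoI
            regions.append(
                (x1 + dx * width, y1 + dy * height,
                 x2 + dx * width, y2 + dy * height)
            )
    return regions


cells = context_regions((10, 20, 30, 60))
```

Cells next to the image border can extend outside the image, so a real implementation would clip or discard them.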

Finite Horizon Throughput Maximization and Sensing Optimization in Wireless Powered Devices over Fading Channels


Title	Finite Horizon Throughput Maximization and Sensing Optimization in Wireless Powered Devices over Fading Channels
Authors	Mehdi Salehi Heydar Abad, Ozgur Ercetin
Abstract	Wireless power transfer (WPT) is a promising technology that provides the network a way to replenish the batteries of the remote devices by utilizing RF transmissions. We study a class of harvest-first-transmit-later type of WPT policy, where an access point (AP) first employs RF power transfer to recharge a wireless powered device (WPD) for a certain period subject to optimization, and then, the harvested energy is subsequently used by the WPD to transmit its data bits back to the AP over a finite horizon. A significant challenge regarding the studied WPT scenario is the time-varying nature of the wireless channel linking the WPD to the AP. We first investigate as a benchmark the offline case where the channel realizations are known non-causally prior to the start of the horizon. For the offline case, by finding the optimal WPT duration and power allocations in the data transmission period, we derive an upper bound on the throughput of the WPD. We then focus on the online counterpart of the problem where the channel realizations are known causally. We prove that the optimal WPT duration obeys a time-dependent threshold form depending on the energy state of the WPD. In the subsequent data transmission stage, the optimal transmit power allocation for the WPD is shown to be of a fractional structure where at each time slot a fraction of energy depending on the current channel and a measure of future channel state expectations is allocated for data transmission. We numerically show that the online policy performs almost identically to the upper bound. We then consider a data sensing application, where the WPD adjusts the sensing resolution to balance between the quality of the sensed data and the probability of successfully delivering it. We use Bayesian inference as a reinforcement learning method to provide a means for the WPD to learn to balance the sensing resolution.
Tasks	Bayesian Inference
Published	2018-03-17
URL	http://arxiv.org/abs/1804.01834v2
PDF	http://arxiv.org/pdf/1804.01834v2.pdf
PWC	https://paperswithcode.com/paper/finite-horizon-throughput-maximization-and
Repo
Framework

Stable Tensor Neural Networks for Rapid Deep Learning


Title	Stable Tensor Neural Networks for Rapid Deep Learning
Authors	Elizabeth Newman, Lior Horesh, Haim Avron, Misha Kilmer
Abstract	We propose a tensor neural network ($t$-NN) framework that offers an exciting new paradigm for designing neural networks with multidimensional (tensor) data. Our network architecture is based on the $t$-product (Kilmer and Martin, 2011), an algebraic formulation to multiply tensors via circulant convolution. In this $t$-product algebra, we interpret tensors as $t$-linear operators analogous to matrices as linear operators, and hence our framework inherits mimetic matrix properties. To exemplify the elegant, matrix-mimetic algebraic structure of our $t$-NNs, we expand on recent work (Haber and Ruthotto, 2017) which interprets deep neural networks as discretizations of non-linear differential equations and introduces stable neural networks which promote superior generalization. Motivated by this dynamic framework, we introduce a stable $t$-NN which facilitates more rapid learning because of its reduced, more powerful parameterization. Through our high-dimensional design, we create a more compact parameter space and extract multidimensional correlations otherwise latent in traditional algorithms. We further generalize our $t$-NN framework to a family of tensor-tensor products (Kernfeld, Kilmer, and Aeron, 2015) which still induce a matrix-mimetic algebraic structure. Through numerical experiments on the MNIST and CIFAR-10 datasets, we demonstrate the more powerful parameterizations and improved generalizability of stable $t$-NNs.
Tasks
Published	2018-11-15
URL	http://arxiv.org/abs/1811.06569v1
PDF	http://arxiv.org/pdf/1811.06569v1.pdf
PWC	https://paperswithcode.com/paper/stable-tensor-neural-networks-for-rapid-deep
Repo
Framework
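
The $t$-product multiplies tensors by circular convolution along the third dimension. The sketch below applies that definition directly to nested lists; it is a slow reference version for illustration, not the authors' implementation, and practical code would typically use an FFT along the third axis. The function name `t_product` and the list-of-lists layout `A[i][j][k]` are assumptions.

```python
def t_product(A, B):
    """t-product of A (m x p x n) with B (p x q x n), as nested lists.

    Each entry pair A[i][j] and B[j][l] is a length-n "tube"; tubes are
    multiplied by circular convolution and summed over j, mirroring an
    ordinary matrix product.
    """
    m, p, n = len(A), len(A[0]), len(A[0][0])
    q = len(B[0])
    assert len(B) == p and len(B[0][0]) == n, "inner dimensions must match"

    C = [[[0.0] * n for _ in range(q)] for _ in range(m)]
    for i in range(m):
        for l in range(q):
            for j in range(p):
                for k in range(n):
                    for s in range(n):
                        # Circular convolution: index (k - s) wraps modulo n.
                        C[i][l][k] += A[i][j][s] * B[j][l][(k - s) % n]
    return C


# A 1 x 1 x 3 example reduces to circular convolution of two tubes.
tube_product = t_product([[[1, 2, 3]]], [[[4, 5, 6]]])

# The identity tensor has ones on the diagonal of its first frontal slice.
identity = [[[1, 0], [0, 0]], [[0, 0], [1, 0]]]
B = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
same_as_B = t_product(identity, B)
```

Multiplying by the identity tensor returns `B` unchanged, which is one of the matrix-like properties the framework relies on.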

Fast and Scalable Expansion of Natural Language Understanding Functionality for Intelligent Agents


Title	Fast and Scalable Expansion of Natural Language Understanding Functionality for Intelligent Agents
Authors	Anuj Goyal, Angeliki Metallinou, Spyros Matsoukas
Abstract	Fast expansion of natural language functionality of intelligent virtual agents is critical for achieving engaging and informative interactions. However, developing accurate models for new natural language domains is a time- and data-intensive process. We propose efficient deep neural network architectures that maximally re-use available resources through transfer learning. Our methods are applied for expanding the understanding capabilities of a popular commercial agent and are evaluated on hundreds of new domains, designed by internal or external developers. We demonstrate that our proposed methods significantly increase accuracy in low-resource settings and enable rapid development of accurate models with less data.
Tasks	Transfer Learning
Published	2018-05-03
URL	http://arxiv.org/abs/1805.01542v1
PDF	http://arxiv.org/pdf/1805.01542v1.pdf
PWC	https://paperswithcode.com/paper/fast-and-scalable-expansion-of-natural
Repo
Framework

Person Re-identification with Deep Similarity-Guided Graph Neural Network


Title	Person Re-identification with Deep Similarity-Guided Graph Neural Network
Authors	Yantao Shen, Hongsheng Li, Shuai Yi, Dapeng Chen, Xiaogang Wang
Abstract	The person re-identification task requires robustly estimating visual similarities between person images. However, existing person re-identification models mostly estimate the similarities of different image pairs of probe and gallery images independently while ignoring the relationship information between different probe-gallery pairs. As a result, the similarity estimation of some hard samples might not be accurate. In this paper, we propose a novel deep learning framework, named Similarity-Guided Graph Neural Network (SGGNN), to overcome such limitations. Given a probe image and several gallery images, SGGNN creates a graph to represent the pairwise relationships between probe-gallery pairs (nodes) and utilizes such relationships to update the probe-gallery relation features in an end-to-end manner. Accurate similarity estimation can be achieved by using such updated probe-gallery relation features for prediction. The input features for nodes on the graph are the relation features of different probe-gallery image pairs. The probe-gallery relation feature updating is then performed by message passing in SGGNN, which takes other nodes’ information into account for similarity estimation. Different from conventional GNN approaches, SGGNN learns the edge weights with rich labels of gallery instance pairs directly, which provides more precise information for relation fusion. The effectiveness of our proposed method is validated on three public person re-identification datasets.
Tasks	Person Re-Identification
Published	2018-07-26
URL	http://arxiv.org/abs/1807.09975v1
PDF	http://arxiv.org/pdf/1807.09975v1.pdf
PWC	https://paperswithcode.com/paper/person-re-identification-with-deep-similarity
Repo
Framework

Weakly Supervised Dense Event Captioning in Videos


Title	Weakly Supervised Dense Event Captioning in Videos
Authors	Xuguang Duan, Wenbing Huang, Chuang Gan, Jingdong Wang, Wenwu Zhu, Junzhou Huang
Abstract	Dense event captioning aims to detect and describe all events of interest contained in a video. Despite the advanced development in this area, existing methods tackle this task by making use of dense temporal annotations, which is highly resource-consuming. This paper formulates a new problem: weakly supervised dense event captioning, which does not require temporal segment annotations for model training. Our solution is based on the one-to-one correspondence assumption: each caption describes one temporal segment, and each temporal segment has one caption, which holds in current benchmark datasets and most real-world cases. We decompose the problem into a pair of dual problems, event captioning and sentence localization, and present a cycle system to train our model. Extensive experimental results are provided to demonstrate the ability of our model on both dense event captioning and sentence localization in videos.
Tasks
Published	2018-12-10
URL	http://arxiv.org/abs/1812.03849v1
PDF	http://arxiv.org/pdf/1812.03849v1.pdf
PWC	https://paperswithcode.com/paper/weakly-supervised-dense-event-captioning-in
Repo
Framework

Ricean K-factor Estimation based on Channel Quality Indicator in OFDM Systems using Neural Network


Title	Ricean K-factor Estimation based on Channel Quality Indicator in OFDM Systems using Neural Network
Authors	Kun Wang
Abstract	The Ricean channel model is widely used in wireless communications to characterize channels with a line-of-sight path. The Ricean K factor, defined as the ratio of the power in the direct path to the power in the scattered paths, provides a good indication of the link quality. Most existing works estimate the K factor based on either the maximum-likelihood criterion or higher-order moments, and they target K-factor estimation at the receiver side. In this work, a novel approach is proposed. Cast as a classification problem, the estimation of the K factor by a neural network provides high accuracy. Moreover, the proposed K-factor estimation is done at the transmitter side for transmit processing, thus saving the limited feedback bandwidth.
Tasks
Published	2018-08-15
URL	http://arxiv.org/abs/1808.06537v1
PDF	http://arxiv.org/pdf/1808.06537v1.pdf
PWC	https://paperswithcode.com/paper/ricean-k-factor-estimation-based-on-channel
Repo
Framework
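
The K factor is conventionally the ratio of the power in the direct path to the power in the scattered paths, often quoted in dB. The sketch below computes that ratio and then maps it to a class index, in the spirit of treating estimation as classification. The bin edges in `K_EDGES_DB` and the function names are hypothetical; the abstract does not say how the classes are defined, and the neural network itself is not shown.

```python
import bisect
import math

# Hypothetical class boundaries in dB; the paper's actual classes are not given here.
K_EDGES_DB = [0.0, 5.0, 10.0]


def ricean_k(direct_power, scattered_power):
    """Linear K factor: direct-path power over total scattered power."""
    return direct_power / scattered_power


def to_db(ratio):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(ratio)


def k_class(k_db):
    """Index of the bin that k_db falls into (0 = below the first edge)."""
    return bisect.bisect_right(K_EDGES_DB, k_db)


k_linear = ricean_k(4.0, 1.0)
k_label = k_class(to_db(k_linear))
```

A classifier trained on channel quality indicators would predict a label like `k_label` instead of regressing the K factor directly.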

Corpus Statistics in Text Classification of Online Data


Title	Corpus Statistics in Text Classification of Online Data
Authors	Marina Sokolova, Victoria Bobicev
Abstract	The transformation of Machine Learning (ML) from a boutique science to a generally accepted technology has increased the importance of reproducibility and transportability of ML studies. In the current work, we investigate how corpus characteristics of textual data sets correspond to text classification results. We work with two data sets gathered from sub-forums of an online health-related forum. Our empirical results are obtained for a multi-class sentiment analysis application.
Tasks	Sentiment Analysis, Text Classification
Published	2018-03-16
URL	http://arxiv.org/abs/1803.06390v1
PDF	http://arxiv.org/pdf/1803.06390v1.pdf
PWC	https://paperswithcode.com/paper/corpus-statistics-in-text-classification-of
Repo
Framework