Paper Group ANR 417
Streaming Network Embedding through Local Actions
Title | Streaming Network Embedding through Local Actions |
Authors | Xi Liu, Ping-Chun Hsieh, Nick Duffield, Rui Chen, Muhe Xie, Xidao Wen |
Abstract | Recently, considerable research attention has been paid to network embedding, a popular approach for constructing feature vectors of vertices. Due to the curse of dimensionality and sparsity in graphical datasets, this approach has become indispensable for machine learning tasks over large networks. The majority of existing literature considers this technique under the assumption that the network is static. In many applications, however, nodes and edges accrue to a growing network as a stream. A small number of very recent results have addressed the problem of embedding for dynamic networks, but they either rely on knowledge of vertex attributes, suffer from high time complexity, or need to be re-trained without a closed-form expression. Adapting existing methods to the streaming environment therefore faces non-trivial technical challenges, which motivate new approaches to streaming network embedding. In this paper, we propose a new framework that generates latent features for new vertices with high efficiency and low complexity within a specified number of iteration rounds. We formulate a constrained optimization problem for the modification of the representation resulting from a stream arrival. We show this problem has no closed-form solution and instead develop an online approximation. Our solution follows three steps: (1) identify the vertices affected by new vertices, (2) generate latent features for the new vertices, and (3) update the latent features of the most affected vertices. The generated representations are provably feasible and, in expectation, close to optimal. Multi-class classification and clustering on five real-world networks demonstrate that our model can efficiently update vertex representations while achieving comparable or even better performance. |
Tasks | Network Embedding |
Published | 2018-11-14 |
URL | http://arxiv.org/abs/1811.05932v1 |
http://arxiv.org/pdf/1811.05932v1.pdf | |
PWC | https://paperswithcode.com/paper/streaming-network-embedding-through-local |
Repo | |
Framework | |
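The abstract sketches a three-step online update: find affected vertices, initialize the newcomer, refresh the most affected embeddings. A minimal sketch of that loop follows, assuming a neighbor-average initialization and a simple smoothing refresh; the function names, the one-hop affected-set heuristic, and the update rule are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np
import networkx as nx

def stream_update(G, Z, new_v, neighbors, dim=64, lr=0.1, rounds=5):
    """Hedged sketch of the three-step streaming update from the abstract.
    G: networkx graph (modified in place); Z: dict vertex -> embedding.
    """
    G.add_node(new_v)
    G.add_edges_from((new_v, u) for u in neighbors)

    # Step 1: identify affected vertices. Here, a simple one-hop proxy;
    # the paper derives this set from its constrained optimization problem.
    affected = set(neighbors)

    # Step 2: generate a latent feature for the new vertex (neighbor mean).
    Z[new_v] = (np.mean([Z[u] for u in neighbors], axis=0)
                if neighbors else 0.1 * np.random.randn(dim))

    # Step 3: update the most affected embeddings for a fixed number of
    # rounds, pulling each toward its current neighborhood mean.
    for _ in range(rounds):
        for v in affected:
            nbr_mean = np.mean([Z[u] for u in G.neighbors(v)], axis=0)
            Z[v] += lr * (nbr_mean - Z[v])
    return Z

# usage with a toy graph and a single stream arrival:
G = nx.path_graph(4)
Z = {v: np.random.randn(64) for v in G}
Z = stream_update(G, Z, new_v=4, neighbors=[2, 3])
```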
VLocNet++: Deep Multitask Learning for Semantic Visual Localization and Odometry
Title | VLocNet++: Deep Multitask Learning for Semantic Visual Localization and Odometry |
Authors | Noha Radwan, Abhinav Valada, Wolfram Burgard |
Abstract | Semantic understanding and localization are fundamental enablers of robot autonomy that have for the most part been tackled as disjoint problems. While deep learning has enabled recent breakthroughs across a wide spectrum of scene understanding tasks, its applicability to state estimation tasks has been limited due to the direct formulation that renders it incapable of encoding scene-specific constraints. In this work, we propose the VLocNet++ architecture that employs a multitask learning approach to exploit the inter-task relationship between learning semantics, regressing 6-DoF global pose, and odometry, for the mutual benefit of each of these tasks. Our network overcomes the aforementioned limitation by simultaneously embedding geometric and semantic knowledge of the world into the pose regression network. We propose a novel adaptive weighted fusion layer to aggregate motion-specific temporal information and to fuse semantic features into the localization stream based on region activations. Furthermore, we propose a self-supervised warping technique that uses the relative motion to warp intermediate network representations in the segmentation stream for learning consistent semantics. Finally, we introduce a first-of-a-kind urban outdoor localization dataset with pixel-level semantic labels and multiple loops for training deep networks. Extensive experiments on the challenging Microsoft 7-Scenes benchmark and our DeepLoc dataset demonstrate that our approach exceeds the state of the art, outperforming local feature-based methods while simultaneously performing multiple tasks and exhibiting substantial robustness in challenging scenarios. |
Tasks | Scene Understanding, Visual Localization |
Published | 2018-04-23 |
URL | http://arxiv.org/abs/1804.08366v6 |
http://arxiv.org/pdf/1804.08366v6.pdf | |
PWC | https://paperswithcode.com/paper/vlocnet-deep-multitask-learning-for-semantic |
Repo | |
Framework | |
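The adaptive weighted fusion layer is described above only at a high level; one plausible reading, sketched below under that assumption, is a learned per-location gate that blends semantic features into the localization stream. The gating weights `Wg`, the sigmoid gate, and the convex blend are our illustrative choices, not the published layer.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adaptive_weighted_fusion(f_loc, f_sem, Wg):
    """Plausible sketch: predict per-location gates from the concatenated
    streams, then blend semantic features into the localization stream.
    f_loc, f_sem: (C, H, W) feature maps; Wg: (C, 2C) gating weights.
    """
    stacked = np.concatenate([f_loc, f_sem], axis=0)      # (2C, H, W)
    gates = sigmoid(np.einsum("oc,chw->ohw", Wg, stacked))
    return gates * f_sem + (1.0 - gates) * f_loc          # region-wise blend
```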
Generalization and Expressivity for Deep Nets
Title | Generalization and Expressivity for Deep Nets |
Authors | Shao-Bo Lin |
Abstract | Along with the rapid development of deep learning in practice, theoretical explanations for its success have become urgent. Generalization and expressivity are two widely used measurements to quantify the theoretical behavior of deep learning. Expressivity focuses on finding functions that are expressible by deep nets but cannot be approximated by shallow nets with a similar number of neurons; this usually implies large capacity. Generalization aims at deriving fast learning rates for deep nets; this usually requires small capacity to reduce the variance. Unlike previous studies on deep learning, which pursue either expressivity or generalization, we take both factors into account to explore the theoretical advantages of deep nets. For this purpose, we construct a deep net with two hidden layers possessing excellent expressivity in terms of localized and sparse approximation. Then, using the well-known covering number to measure capacity, we find that deep nets possess excellent expressive power (measured by localized and sparse approximation) without enlarging the capacity of shallow nets. As a consequence, we derive near-optimal learning rates for implementing empirical risk minimization (ERM) on the constructed deep nets. These results theoretically exhibit the advantage of deep nets from a learning-theory viewpoint. |
Tasks | |
Published | 2018-03-10 |
URL | http://arxiv.org/abs/1803.03772v2 |
http://arxiv.org/pdf/1803.03772v2.pdf | |
PWC | https://paperswithcode.com/paper/generalization-and-expressivity-for-deep-nets |
Repo | |
Framework | |
Long-term Visual Localization using Semantically Segmented Images
Title | Long-term Visual Localization using Semantically Segmented Images |
Authors | Erik Stenborg, Carl Toft, Lars Hammarstrand |
Abstract | Robust cross-seasonal localization is one of the major challenges in long-term visual navigation of autonomous vehicles. In this paper, we exploit recent advances in semantic segmentation of images, in which each pixel is assigned a label related to the type of object it represents, to attack the problem of long-term visual localization. We show that semantically labeled 3-D point maps of the environment, together with semantically segmented images, can be efficiently used for vehicle localization without the need for detailed feature descriptors (SIFT, SURF, etc.). Thus, instead of depending on hand-crafted feature descriptors, we rely on the training of an image segmenter. The resulting map takes up much less storage space than a traditional descriptor-based map. A particle-filter-based semantic localization solution is compared to one based on SIFT features; even with large seasonal variations over the year, we perform on par with the larger and more descriptive SIFT features and are able to localize with an error below 1 m most of the time. |
Tasks | Autonomous Vehicles, Semantic Segmentation, Visual Localization, Visual Navigation |
Published | 2018-01-16 |
URL | http://arxiv.org/abs/1801.05269v2 |
http://arxiv.org/pdf/1801.05269v2.pdf | |
PWC | https://paperswithcode.com/paper/long-term-visual-localization-using |
Repo | |
Framework | |
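The measurement model in a particle filter of this kind can be sketched compactly: each particle's pose projects the semantically labeled 3-D map points into the image, and the particle is weighted by how well the projected labels agree with the segmented image. The count-based likelihood and all names below are illustrative assumptions, not the paper's exact model.

```python
import numpy as np

def semantic_weights(particles, map_points, map_labels, seg_image, K):
    """Sketch of a semantic particle-filter update. particles: (N, 4, 4)
    camera-from-world poses; map_points: (M, 3); map_labels: (M,);
    seg_image: (H, W) per-pixel class labels; K: 3x3 intrinsics.
    """
    h, w = seg_image.shape
    weights = np.zeros(len(particles))
    pts_h = np.hstack([map_points, np.ones((len(map_points), 1))])
    for i, T in enumerate(particles):
        cam = (T @ pts_h.T)[:3]                      # points in camera frame
        in_front = cam[2] > 0.1
        uv = (K @ cam)[:2] / np.maximum(cam[2], 1e-9)  # perspective projection
        u, v = np.round(uv).astype(int)
        valid = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        # agreement: projected map label equals the observed pixel label
        matches = seg_image[v[valid], u[valid]] == map_labels[valid]
        weights[i] = matches.sum() + 1e-9            # avoid all-zero weights
    return weights / weights.sum()
```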
Orthogonal Policy Gradient and Autonomous Driving Application
Title | Orthogonal Policy Gradient and Autonomous Driving Application |
Authors | Mincong Luo, Yin Tong, Jiachi Liu |
Abstract | One less-addressed issue in deep reinforcement learning is the lack of generalization to new states and new targets. For complex tasks, it is necessary to give the correct strategy and evaluate all possible actions for the current state. Fortunately, deep reinforcement learning has enabled enormous progress on both subproblems: giving the correct strategy and evaluating all actions based on the state. In this paper we present an approach called orthogonal policy gradient descent (OPGD) that lets the agent learn the policy gradient based on the current state and the action set, by which the agent can learn a policy network with generalization capability. We evaluate the proposed method in the 3D autonomous driving environment TORCS against a baseline model; detailed analyses of the experimental results and proofs are also given. |
Tasks | Autonomous Driving |
Published | 2018-11-15 |
URL | http://arxiv.org/abs/1811.06151v1 |
http://arxiv.org/pdf/1811.06151v1.pdf | |
PWC | https://paperswithcode.com/paper/orthogonal-policy-gradient-and-autonomous |
Repo | |
Framework | |
Robustness Meets Deep Learning: An End-to-End Hybrid Pipeline for Unsupervised Learning of Egomotion
Title | Robustness Meets Deep Learning: An End-to-End Hybrid Pipeline for Unsupervised Learning of Egomotion |
Authors | Alex Zihao Zhu, Wenxin Liu, Ziyun Wang, Vijay Kumar, Kostas Daniilidis |
Abstract | In this work, we propose a method that combines unsupervised deep learning predictions for optical flow and monocular disparity with a model-based optimization procedure for instantaneous camera pose. Given the flow and disparity predictions from the network, we apply a RANSAC outlier rejection scheme to find an inlier set of flows and disparities, which we use to solve for the relative camera pose in a least-squares fashion. We show that this pipeline is fully differentiable, allowing us to combine the pose with the network outputs as an additional unsupervised training loss to further refine the predicted flows and disparities. This method not only allows us to directly regress relative pose from the network outputs, but also automatically segments away pixels that do not fit the rigid-scene assumptions that many unsupervised structure-from-motion methods apply, such as those on independently moving objects. We evaluate our method on the KITTI dataset and demonstrate state-of-the-art results, even in the presence of challenging independently moving objects. |
Tasks | Optical Flow Estimation |
Published | 2018-12-20 |
URL | http://arxiv.org/abs/1812.08351v3 |
http://arxiv.org/pdf/1812.08351v3.pdf | |
PWC | https://paperswithcode.com/paper/robustness-meets-deep-learning-an-end-to-end |
Repo | |
Framework | |
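The RANSAC-plus-least-squares stage can be illustrated with a simplified 3-D formulation: triangulate points from disparity at both frames, match them with flow, then alternate random minimal fits with inlier counting. The paper solves pose from 2-D flow and disparity inside a differentiable pipeline; the Kabsch-based 3-D/3-D alignment below is a deliberate simplification for illustration.

```python
import numpy as np

def kabsch(P, Q):
    """Least-squares rigid transform (R, t) aligning point set P onto Q."""
    cp, cq = P.mean(0), Q.mean(0)
    U, _, Vt = np.linalg.svd((P - cp).T @ (Q - cq))
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T       # enforce det(R) = +1
    return R, cq - R @ cp

def ransac_pose(P, Q, iters=200, thresh=0.05):
    """P, Q: (N, 3) points from disparity at frames t and t+1, matched by
    optical flow. Sample minimal sets, fit a pose, keep the largest inlier
    set, then refit on all inliers in a least-squares fashion.
    """
    rng = np.random.default_rng(0)
    best = np.zeros(len(P), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(P), size=3, replace=False)
        R, t = kabsch(P[idx], Q[idx])
        inliers = np.linalg.norm(P @ R.T + t - Q, axis=1) < thresh
        if inliers.sum() > best.sum():
            best = inliers
    return kabsch(P[best], Q[best]), best          # pose + rigid-scene mask
```

The returned inlier mask plays the role of the abstract's automatic segmentation: pixels whose correspondences violate the rigid-scene fit are left out of the pose solve.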
Non-ergodic Convergence Analysis of Heavy-Ball Algorithms
Title | Non-ergodic Convergence Analysis of Heavy-Ball Algorithms |
Authors | Tao Sun, Penghang Yin, Dongsheng Li, Chun Huang, Lei Guan, Hao Jiang |
Abstract | In this paper, we revisit the convergence of the Heavy-ball method and present improved convergence complexity results in the convex setting. We provide the first non-ergodic O(1/k) rate result for the Heavy-ball algorithm with constant step size for coercive objective functions. For objective functions satisfying a relaxed strongly convex condition, linear convergence is established under weaker assumptions on the step size and inertial parameter than those made in the existing literature. We extend our results to a multi-block version of the algorithm with both cyclic and stochastic update rules. Our results also extend to decentralized optimization, where the ergodic analysis is not applicable. |
Tasks | |
Published | 2018-11-05 |
URL | http://arxiv.org/abs/1811.01777v2 |
http://arxiv.org/pdf/1811.01777v2.pdf | |
PWC | https://paperswithcode.com/paper/non-ergodic-convergence-analysis-of-heavy |
Repo | |
Framework | |
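For reference, the iteration under analysis is x_{k+1} = x_k - α∇f(x_k) + β(x_k - x_{k-1}), with constant step size α and inertial parameter β. A minimal sketch, with a toy quadratic added as a usage example:

```python
import numpy as np

def heavy_ball(grad, x0, alpha=0.01, beta=0.9, iters=1000):
    """Heavy-ball method with constant step size:
    x_{k+1} = x_k - alpha * grad(x_k) + beta * (x_k - x_{k-1}).
    The paper's non-ergodic O(1/k) rate concerns the last iterate itself,
    not an ergodic (running) average of the iterates.
    """
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(iters):
        x, x_prev = x - alpha * grad(x) + beta * (x - x_prev), x
    return x

# usage: minimize the coercive quadratic f(x) = 0.5 * ||A x - b||^2
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
x_star = heavy_ball(lambda x: A.T @ (A @ x - b), np.zeros(2))
```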
Locally Interpretable Models and Effects based on Supervised Partitioning (LIME-SUP)
Title | Locally Interpretable Models and Effects based on Supervised Partitioning (LIME-SUP) |
Authors | Linwei Hu, Jie Chen, Vijayan N. Nair, Agus Sudjianto |
Abstract | Supervised Machine Learning (SML) algorithms such as Gradient Boosting, Random Forest, and Neural Networks have become popular in recent years due to their increased predictive performance over traditional statistical methods. This is especially true with large data sets (millions or more observations and hundreds to thousands of predictors). However, the complexity of SML models makes them opaque and hard to interpret without additional tools. There has been a lot of interest recently in developing global and local diagnostics for interpreting and explaining SML models. In this paper, we propose locally interpretable models and effects based on supervised partitioning (trees), referred to as LIME-SUP. This is in contrast with the KLIME approach, which is based on clustering the predictor space. We describe LIME-SUP based on fitting trees to the fitted response (LIME-SUP-R) as well as to the derivatives of the fitted response (LIME-SUP-D). We compare the results with KLIME and describe the advantages of LIME-SUP using simulation and real data. |
Tasks | |
Published | 2018-06-02 |
URL | http://arxiv.org/abs/1806.00663v1 |
http://arxiv.org/pdf/1806.00663v1.pdf | |
PWC | https://paperswithcode.com/paper/locally-interpretable-models-and-effects |
Repo | |
Framework | |
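A minimal sketch of the LIME-SUP-R idea under stated assumptions: fit a shallow tree to the black-box model's fitted response (supervised partitioning), then fit a simple local model in each leaf. The synthetic data, tree depth, and linear leaf models are our illustrative choices.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(scale=0.1, size=2000)

# opaque SML model whose fitted response we want to interpret
blackbox = GradientBoostingRegressor().fit(X, y)
y_hat = blackbox.predict(X)

# supervised partitioning: a shallow tree fit to the *fitted* response,
# in contrast with KLIME's unsupervised clustering of the predictor space
tree = DecisionTreeRegressor(max_depth=3, min_samples_leaf=100).fit(X, y_hat)
leaves = tree.apply(X)

# one locally interpretable (here linear) model per partition
local_models = {leaf: LinearRegression().fit(X[leaves == leaf],
                                             y_hat[leaves == leaf])
                for leaf in np.unique(leaves)}
```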
Observing the Population Dynamics in GE by means of the Intrinsic Dimension
Title | Observing the Population Dynamics in GE by means of the Intrinsic Dimension |
Authors | Eric Medvet, Alberto Bartoli, Alessio Ansuini, Fabiano Tarlao |
Abstract | We explore the use of Intrinsic Dimension (ID) for gaining insights into how populations evolve in Evolutionary Algorithms. ID measures the minimum number of dimensions needed to accurately describe a dataset, and its estimators are being used more and more in Machine Learning to cope with large datasets. We postulate that ID can provide information about a population that is complementary to what (a simple measure of) diversity tells us. We experimented with the application of ID to populations evolved with a recent variant of Grammatical Evolution. The preliminary results suggest that diversity and ID constitute two different points of view on the population dynamics. |
Tasks | |
Published | 2018-12-06 |
URL | http://arxiv.org/abs/1812.02504v1 |
http://arxiv.org/pdf/1812.02504v1.pdf | |
PWC | https://paperswithcode.com/paper/observing-the-population-dynamics-in-ge-by |
Repo | |
Framework | |
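The abstract does not commit to a particular ID estimator; the TwoNN estimator (Facco et al., 2017) is a common choice for this kind of study and is sketched below under that assumption. It uses only the ratio of each point's two nearest-neighbor distances, and assumes no duplicate points.

```python
import numpy as np
from scipy.spatial import cKDTree

def twonn_id(X):
    """TwoNN intrinsic-dimension estimate for points X of shape (n, d).
    Under the TwoNN model, mu = r2/r1 follows F(mu) = 1 - mu**(-ID), so a
    through-the-origin fit of -log(1 - F) against log(mu) has slope ID.
    """
    dists, _ = cKDTree(X).query(X, k=3)   # columns: self, 1st, 2nd neighbor
    mu = np.sort(dists[:, 2] / dists[:, 1])
    n = len(mu)
    F = np.arange(1, n + 1) / n
    x, y = np.log(mu[:-1]), -np.log(1.0 - F[:-1])   # drop the F = 1 point
    return float(np.sum(x * y) / np.sum(x * x))     # slope through origin

# e.g., a GE population encoded as real vectors (hypothetical input):
# id_estimate = twonn_id(np.asarray(population_vectors))
```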
Neural networks with dynamical coefficients and adjustable connections on the basis of integrated backpropagation
Title | Neural networks with dynamical coefficients and adjustable connections on the basis of integrated backpropagation |
Authors | M. N. Nazarov |
Abstract | We consider artificial neurons that update their weight coefficients with an internal rule based on backpropagation, rather than using backpropagation as an external training procedure. To achieve this, we include the backpropagation error estimate as a separate entity in all the neuron models and perform its exchange along the synaptic connections. In addition, we add a special type of neuron with reference inputs, which serves as a base source of error estimates for the whole network. Finally, we introduce a training control signal for all the neurons, which can enable the correction of weights and the exchange of error estimates. For recurrent neural networks we also demonstrate how to integrate backpropagation through time into this formalism with the help of stack memory for the reference inputs and external data inputs of neurons. Also, for widely used neural networks, such as long short-term memory, radial basis function networks, multilayer perceptrons, and convolutional neural networks, we demonstrate their alternative description within the framework of our new formalism. |
Tasks | |
Published | 2018-05-19 |
URL | http://arxiv.org/abs/1805.07531v2 |
http://arxiv.org/pdf/1805.07531v2.pdf | |
PWC | https://paperswithcode.com/paper/neural-networks-with-dynamical-coefficients |
Repo | |
Framework | |
Efficient Spiking Neural Networks with Logarithmic Temporal Coding
Title | Efficient Spiking Neural Networks with Logarithmic Temporal Coding |
Authors | Ming Zhang, Nenggan Zheng, De Ma, Gang Pan, Zonghua Gu |
Abstract | A Spiking Neural Network (SNN) can be trained indirectly by first training an Artificial Neural Network (ANN) with the conventional backpropagation algorithm and then converting it into an SNN. The conventional rate-coding method for SNNs uses the number of spikes to encode the magnitude of an activation value, and may be computationally inefficient due to the large number of spikes. Temporal coding is typically more efficient, leveraging the timing of spikes to encode information. In this paper, we present Logarithmic Temporal Coding (LTC), where the number of spikes used to encode an activation value grows logarithmically with the activation value, and the accompanying Exponentiate-and-Fire (EF) spiking neuron model, which involves only efficient bit-shift and addition operations. Moreover, we improve the training process of the ANN to compensate for approximation errors due to LTC. Experimental results indicate that the resulting SNN achieves competitive performance at significantly lower computational cost than related work. |
Tasks | |
Published | 2018-11-10 |
URL | http://arxiv.org/abs/1811.04233v1 |
http://arxiv.org/pdf/1811.04233v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-spiking-neural-networks-with |
Repo | |
Framework | |
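The coding idea can be made concrete in a few lines: quantize an activation to a fixed number of bits and emit one spike per set bit, so the spike count is bounded by the log of the dynamic range rather than growing linearly with the value, as in rate coding. This is an illustrative sketch of the encoding only; the paper's LTC/EF details and its error compensation during ANN training are not reproduced here.

```python
def ltc_encode(a, num_bits=8):
    """Encode an activation a in [0, 1] as spike times: one spike per set
    bit of the quantized value, so at most num_bits spikes per activation.
    A spike at time t carries weight 2**t, matching an exponentiate-and-
    fire-style readout that doubles its accumulated value each step.
    """
    qmax = 2 ** num_bits - 1
    q = max(0, min(qmax, round(a * qmax)))
    return [t for t in range(num_bits) if (q >> t) & 1]

def ltc_decode(spike_times, num_bits=8):
    """Readout: shift-and-add over spike times, rescaled back to [0, 1]."""
    return sum(2 ** t for t in spike_times) / (2 ** num_bits - 1)

assert abs(ltc_decode(ltc_encode(0.7)) - 0.7) < 1 / 255  # quantization error
```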
Dialectical GAN for SAR Image Translation: From Sentinel-1 to TerraSAR-X
Title | Dialectical GAN for SAR Image Translation: From Sentinel-1 to TerraSAR-X |
Authors | Dongyang Ao, Corneliu Octavian Dumitru, Gottfried Schwarz, Mihai Datcu |
Abstract | Unlike optical images, Synthetic Aperture Radar (SAR) images are acquired in a part of the electromagnetic spectrum to which the human visual system is not accustomed. Thus, with more and more SAR applications, the demand for enhanced high-quality SAR images has increased considerably. However, high-quality SAR images entail high costs due to the limitations of current SAR devices and their image processing resources. To improve the quality of SAR images and to reduce the costs of their generation, we propose a Dialectical Generative Adversarial Network (Dialectical GAN) to generate high-quality SAR images. The method is based on the analysis of hierarchical SAR information and the “dialectical” structure of GAN frameworks. As a demonstration, we show a typical example in which a low-resolution SAR image (e.g., a Sentinel-1 image) with large ground coverage is translated into a high-resolution SAR image (e.g., a TerraSAR-X image). We compare three traditional algorithms and propose a new algorithm based on a network framework that combines conditional WGAN-GP (Wasserstein Generative Adversarial Network - Gradient Penalty) loss functions and spatial Gram matrices under the rule of dialectics. Experimental results show that the SAR image translation works very well when we compare the results of our proposed method with the selected traditional methods. |
Tasks | |
Published | 2018-07-20 |
URL | http://arxiv.org/abs/1807.07778v1 |
http://arxiv.org/pdf/1807.07778v1.pdf | |
PWC | https://paperswithcode.com/paper/dialectical-gan-for-sar-image-translation |
Repo | |
Framework | |
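The spatial Gram matrix term can be sketched directly; the normalization constant is one common convention and an assumption here, as is the loss expression in the closing comment.

```python
import numpy as np

def spatial_gram(features):
    """Spatial Gram matrix of a (C, H, W) feature map: channel-by-channel
    correlations pooled over all spatial positions, as in style-transfer
    losses. Returns a (C, C) matrix.
    """
    C, H, W = features.shape
    F = features.reshape(C, H * W)
    return F @ F.T / (C * H * W)

# a Gram-matrix loss term then compares generated and target SAR features:
# loss_style = np.sum((spatial_gram(f_gen) - spatial_gram(f_tgt)) ** 2)
```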
Self-supervisory Signals for Object Discovery and Detection
Title | Self-supervisory Signals for Object Discovery and Detection |
Authors | Etienne Pot, Alexander Toshev, Jana Kosecka |
Abstract | In robotic applications, we often face the challenge of discovering new objects while having very little or no labeled training data. In this paper we explore the use of self-supervision provided by a robot traversing an environment to learn representations of encountered objects. Knowledge of ego-motion and depth perception enables the agent to effectively associate multiple object proposals, which serve as training data for learning object representations from unlabeled images. We demonstrate the utility of this representation in two ways. First, we can automatically discover objects by performing clustering in the learned embedding space. Each resulting cluster contains examples of one instance seen from various viewpoints and scales. Second, given a small number of labeled images, we can efficiently learn detectors for these labels. In the few-shot regime, these detectors have a substantially higher mAP of 0.22, compared to 0.12 for off-the-shelf standard detectors trained on this limited data. Thus, the proposed self-supervision results in effective environment-specific object discovery and detection at little or no human labeling cost. |
Tasks | |
Published | 2018-06-08 |
URL | http://arxiv.org/abs/1806.03370v1 |
http://arxiv.org/pdf/1806.03370v1.pdf | |
PWC | https://paperswithcode.com/paper/self-supervisory-signals-for-object-discovery |
Repo | |
Framework | |
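The discovery step lends itself to a short sketch: cluster the learned proposal embeddings and read each cluster as one object instance. The random stand-in data, 128-d embedding size, and cluster count below are hypothetical placeholders.

```python
import numpy as np
from sklearn.cluster import KMeans

# stand-in for proposal embeddings learned via self-supervised association;
# in practice these would come from the robot's traversal of an environment
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(500, 128))

clusters = KMeans(n_clusters=20, n_init=10, random_state=0).fit_predict(embeddings)
for c in np.unique(clusters):
    # each cluster gathers views of one instance across viewpoints and scales
    print(f"discovered object {c}: {np.sum(clusters == c)} proposals")
```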
Quantizing Convolutional Neural Networks for Low-Power High-Throughput Inference Engines
Title | Quantizing Convolutional Neural Networks for Low-Power High-Throughput Inference Engines |
Authors | Sean O. Settle, Manasa Bollavaram, Paolo D’Alberto, Elliott Delaye, Oscar Fernandez, Nicholas Fraser, Aaron Ng, Ashish Sirasao, Michael Wu |
Abstract | Deep learning as a means of inferencing has proliferated thanks to its versatility and ability to approach or exceed human-level accuracy. These computational models have seemingly insatiable appetites for computational resources, not only while training, but also when deployed at scales ranging from data centers all the way down to embedded devices. As such, increasing consideration is being given to maximizing computational efficiency under limited hardware and energy resources, and, as a result, inferencing with reduced precision has emerged as a viable alternative to the IEEE 754 Standard for Floating-Point Arithmetic. We propose a quantization scheme that allows inferencing to be carried out using arithmetic that is fundamentally more efficient than even half-precision floating-point. Our quantization procedure is significant in that we determine our quantization scheme parameters by calibrating against the reference floating-point model using a single inference batch rather than (re)training, and we achieve end-to-end post-quantization accuracies comparable to the reference model. |
Tasks | Quantization |
Published | 2018-05-21 |
URL | http://arxiv.org/abs/1805.07941v1 |
http://arxiv.org/pdf/1805.07941v1.pdf | |
PWC | https://paperswithcode.com/paper/quantizing-convolutional-neural-networks-for |
Repo | |
Framework | |
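The calibrate-once idea can be sketched as follows, assuming a `model_fn` that exposes per-layer activations (a hypothetical interface, not the paper's toolchain): record the activation range on a single reference-model batch and derive symmetric integer scales from it, with no (re)training.

```python
import numpy as np

def calibrate_scales(model_fn, calib_batch, num_bits=8):
    """Single-batch calibration: run one inference batch through the
    floating-point reference model and set one symmetric scale per layer
    from the observed maximum absolute activation.
    """
    activations = model_fn(calib_batch)      # dict: layer name -> array
    qmax = 2 ** (num_bits - 1) - 1           # 127 for int8
    return {name: np.abs(a).max() / qmax for name, a in activations.items()}

def quantize(x, scale, num_bits=8):
    """Symmetric uniform quantization to signed integers."""
    qmax = 2 ** (num_bits - 1) - 1
    return np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int8)

def dequantize(q, scale):
    return q.astype(np.float32) * scale
```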
Approximating Real-Time Recurrent Learning with Random Kronecker Factors
Title | Approximating Real-Time Recurrent Learning with Random Kronecker Factors |
Authors | Asier Mujika, Florian Meier, Angelika Steger |
Abstract | Despite all the impressive advances of recurrent neural networks, sequential data is still in need of better modelling. Truncated backpropagation through time (TBPTT), the learning algorithm most widely used in practice, suffers from the truncation bias, which drastically limits its ability to learn long-term dependencies. The Real-Time Recurrent Learning algorithm (RTRL) addresses this issue, but its high computational requirements make it infeasible in practice. The Unbiased Online Recurrent Optimization algorithm (UORO) approximates RTRL with a smaller runtime and memory cost, but with the disadvantage of obtaining noisy gradients that also limit its practical applicability. In this paper we propose the Kronecker Factored RTRL (KF-RTRL) algorithm, which uses a Kronecker product decomposition to approximate the gradients for a large class of RNNs. We show that KF-RTRL is an unbiased and memory-efficient online learning algorithm. Our theoretical analysis shows that, under reasonable assumptions, the noise introduced by our algorithm is not only stable over time but also asymptotically much smaller than that of the UORO algorithm. We also confirm these theoretical results experimentally. Further, we show empirically that KF-RTRL captures long-term dependencies and almost matches the performance of TBPTT on real-world tasks, by training Recurrent Highway Networks on a synthetic string memorization task and on the Penn TreeBank task, respectively. These results indicate that RTRL-based approaches might be a promising future alternative to TBPTT. |
Tasks | |
Published | 2018-05-28 |
URL | http://arxiv.org/abs/1805.10842v2 |
http://arxiv.org/pdf/1805.10842v2.pdf | |
PWC | https://paperswithcode.com/paper/approximating-real-time-recurrent-learning |
Repo | |
Framework | |
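To see what KF-RTRL approximates, it helps to write down exact RTRL for a vanilla RNN: the influence matrix dh_t/dvec(W) has n x n^2 entries and costs O(n^4) per step to propagate, which is what makes exact RTRL infeasible. KF-RTRL instead maintains this matrix as a Kronecker product of a small factor and an n x n matrix. The sketch below shows the exact (expensive) baseline, not the KF-RTRL update itself.

```python
import numpy as np

def rtrl_grad(W, U, xs, loss_grad):
    """Exact RTRL for h_t = tanh(W h_{t-1} + U x_t). Maintains the full
    influence matrix J_t = dh_t/dvec(W) of shape (n, n*n) online, so no
    backward pass over the sequence is needed -- at O(n^4) cost per step.
    loss_grad(h) should return dL_t/dh_t as an (n,) vector.
    """
    n = W.shape[0]
    h = np.zeros(n)
    J = np.zeros((n, n * n))                  # column-major vec(W) layout
    grad_W = np.zeros((n, n))
    for x in xs:
        h_prev, h = h, np.tanh(W @ h + U @ x)
        D = np.diag(1.0 - h ** 2)             # tanh'(pre-activation)
        # J_t = D_t (W J_{t-1} + h_{t-1}^T kron I_n): the recurrence term
        # plus the immediate dependence of the pre-activation on vec(W)
        J = D @ (W @ J + np.kron(h_prev.reshape(1, -1), np.eye(n)))
        grad_W += (loss_grad(h) @ J).reshape(n, n, order="F")
    return grad_W
```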