October 18, 2019

Paper Group ANR 416

Pieces of Eight: 8-bit Neural Machine Translation. In situ TensorView: In situ Visualization of Convolutional Neural Networks. Semi-Metrification of the Dynamic Time Warping Distance. Representation Learning by Reconstructing Neighborhoods. Blended Coarse Gradient Descent for Full Quantization of Deep Neural Networks. Deep Stereo Matching with Expl …

Pieces of Eight: 8-bit Neural Machine Translation


Title	Pieces of Eight: 8-bit Neural Machine Translation
Authors	Jerry Quinn, Miguel Ballesteros
Abstract	Neural machine translation has achieved levels of fluency and adequacy that would have been surprising a short time ago. Output quality is extremely relevant for industry purposes; however, it is equally important to produce results in the shortest time possible, mainly for latency-sensitive applications and to control cloud hosting costs. In this paper we show the effectiveness of translating with 8-bit quantization for models that have been trained using 32-bit floating point values. Results show that 8-bit translation makes a non-negligible impact in terms of speed with no degradation in accuracy and adequacy.
Tasks	Machine Translation, Quantization
Published	2018-04-13
URL	http://arxiv.org/abs/1804.05038v1
PDF	http://arxiv.org/pdf/1804.05038v1.pdf
PWC	https://paperswithcode.com/paper/pieces-of-eight-8-bit-neural-machine
Repo
Framework
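To make the idea of running a 32-bit-trained model with 8-bit arithmetic concrete, here is a minimal sketch in plain Python. It assumes symmetric quantization with one scale per vector, which is a common scheme but not necessarily the one used in the paper; the names `quantize_int8` and `int8_dot` are hypothetical.

```python
def quantize_int8(values):
    """Map floats to signed 8-bit codes using one symmetric scale (assumed scheme)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, int(round(v / scale)))) for v in values]
    return codes, scale


def int8_dot(a_codes, a_scale, b_codes, b_scale):
    """Dot product accumulated in integers, then rescaled back to float."""
    acc = sum(x * y for x, y in zip(a_codes, b_codes))  # integer accumulation
    return acc * a_scale * b_scale


# Toy float32-style weights and activations (illustrative values only).
weights = [0.5, -1.0, 0.25]
activations = [1.0, 2.0, -4.0]

exact = sum(w * a for w, a in zip(weights, activations))
w_codes, w_scale = quantize_int8(weights)
a_codes, a_scale = quantize_int8(activations)
approx = int8_dot(w_codes, w_scale, a_codes, a_scale)

print(exact, approx)
```

The integer accumulation is where the speed benefit would come from on hardware with fast 8-bit multiply-accumulate; the final rescaling recovers an approximation of the floating-point result.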

In situ TensorView: In situ Visualization of Convolutional Neural Networks


Title	In situ TensorView: In situ Visualization of Convolutional Neural Networks
Authors	Xinyu Chen, Qiang Guan, Li-Ta Lo, Simon Su, James Ahrens, Trilce Estrada
Abstract	Convolutional Neural Networks (CNNs) are complex systems. They are trained so they can adapt their internal connections to recognize images, texts and more. It is both interesting and helpful to visualize the dynamics within such deep artificial neural networks so that people can understand how these artificial networks are learning and making predictions. In the field of scientific simulations, visualization tools like Paraview have long been utilized to provide insights and understandings. We present in situ TensorView to visualize the training and functioning of CNNs as if they are systems of scientific simulations. In situ TensorView is a loosely coupled in situ visualization open framework that provides multiple viewers to help users to visualize and understand their networks. It leverages the capability of co-processing from Paraview to provide real-time visualization during training and predicting phases. This avoids heavy I/O overhead for visualizing large dynamic systems. Only a small number of lines of code are injected into the TensorFlow framework. The visualization can provide guidance to adjust the architecture of networks, or compress the pre-trained networks. We showcase visualizing the training of LeNet-5 and VGG16 using in situ TensorView.
Tasks
Published	2018-06-16
URL	http://arxiv.org/abs/1806.07382v1
PDF	http://arxiv.org/pdf/1806.07382v1.pdf
PWC	https://paperswithcode.com/paper/in-situ-tensorview-in-situ-visualization-of
Repo
Framework

Semi-Metrification of the Dynamic Time Warping Distance


Title	Semi-Metrification of the Dynamic Time Warping Distance
Authors	Brijnesh J. Jain
Abstract	The dynamic time warping (dtw) distance fails to satisfy the triangle inequality and the identity of indiscernibles. As a consequence, the dtw-distance is not warping-invariant, which in turn results in peculiarities in data mining applications. This article converts the dtw-distance to a semi-metric and shows that its canonical extension is warping-invariant. Empirical results indicate that the nearest-neighbor classifier in the proposed semi-metric space performs comparably to the same classifier in the standard dtw-space. To overcome the undesirable peculiarities of dtw-spaces, this result suggests to further explore the semi-metric space for data mining applications.
Tasks
Published	2018-08-29
URL	http://arxiv.org/abs/1808.09964v2
PDF	http://arxiv.org/pdf/1808.09964v2.pdf
PWC	https://paperswithcode.com/paper/semi-metrification-of-the-dynamic-time
Repo
Framework
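The two metric failures named in the abstract can be checked on tiny inputs. The sketch below implements the textbook dynamic-programming DTW with absolute difference as the local cost (an assumption; the paper may use a different local cost) and does not implement the proposed semi-metric. The function name `dtw` is hypothetical.

```python
def dtw(x, y):
    """Classic DTW distance between two numeric sequences (absolute-difference cost)."""
    inf = float("inf")
    n, m = len(x), len(y)
    # table[i][j] = cost of the best warping path aligning x[:i] with y[:j]
    table = [[inf] * (m + 1) for _ in range(n + 1)]
    table[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            table[i][j] = cost + min(table[i - 1][j], table[i][j - 1], table[i - 1][j - 1])
    return table[n][m]


# Identity of indiscernibles fails: two different series at distance zero.
print(dtw([0, 1], [0, 1, 1]))

# Triangle inequality fails: the direct distance exceeds the detour through b.
a, b, c = [0], [1, 2], [2, 2, 2]
print(dtw(a, c), dtw(a, b) + dtw(b, c))
```

In the second case the direct distance from `a` to `c` is larger than the sum of the two legs through `b`, which is exactly what a metric forbids.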

Representation Learning by Reconstructing Neighborhoods


Title	Representation Learning by Reconstructing Neighborhoods
Authors	Chin-Chia Michael Yeh, Yan Zhu, Evangelos E. Papalexakis, Abdullah Mueen, Eamonn Keogh
Abstract	Since its introduction, unsupervised representation learning has attracted a lot of attention from the research community, as it is demonstrated to be highly effective and easy-to-apply in tasks such as dimension reduction, clustering, visualization, information retrieval, and semi-supervised learning. In this work, we propose a novel unsupervised representation learning framework called neighbor-encoder, in which domain knowledge can be easily incorporated into the learning process without modifying the general encoder-decoder architecture of the classic autoencoder. In contrast to the autoencoder, which reconstructs the input data itself, neighbor-encoder reconstructs the input data’s neighbors. As the proposed representation learning problem is essentially a neighbor reconstruction problem, domain knowledge can be easily incorporated in the form of an appropriate definition of similarity between objects. Based on that observation, our framework can leverage any off-the-shelf similarity search algorithms or side information to find the neighbor of an input object. Applications of other algorithms (e.g., association rule mining) in our framework are also possible, given that the appropriate definition of neighbor can vary in different contexts. We have demonstrated the effectiveness of our framework in many diverse domains, including images, text, and time series, and for various data mining tasks including classification, clustering, and visualization. Experimental results show that neighbor-encoder not only outperforms autoencoder in most of the scenarios we consider, but also achieves the state-of-the-art performance on text document clustering.
Tasks	Dimensionality Reduction, Information Retrieval, Representation Learning, Time Series, Unsupervised Representation Learning
Published	2018-11-05
URL	http://arxiv.org/abs/1811.01557v2
PDF	http://arxiv.org/pdf/1811.01557v2.pdf
PWC	https://paperswithcode.com/paper/representation-learning-by-reconstructing
Repo
Framework
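The difference between an autoencoder and a neighbor-encoder lies in the training targets: input paired with itself versus input paired with its neighbor. The sketch below only builds those (input, target) pairs, using brute-force Euclidean nearest neighbors as the similarity search; that choice and the name `nearest_neighbor_pairs` are assumptions for illustration, and the encoder-decoder network itself is omitted.

```python
import math


def nearest_neighbor_pairs(points):
    """Pair each point with its nearest other point (brute-force Euclidean search)."""
    pairs = []
    for i, p in enumerate(points):
        best = min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        pairs.append((p, points[best]))
    return pairs


data = [[0, 0], [0, 1], [5, 5], [5, 6]]

autoencoder_pairs = [(p, p) for p in data]        # target is the input itself
neighbor_pairs = nearest_neighbor_pairs(data)     # target is the input's neighbor

for source, target in neighbor_pairs:
    print(source, "->", target)
```

Swapping in a domain-specific similarity function for `math.dist` is where domain knowledge would enter, without touching the network architecture.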

Blended Coarse Gradient Descent for Full Quantization of Deep Neural Networks


Title	Blended Coarse Gradient Descent for Full Quantization of Deep Neural Networks
Authors	Penghang Yin, Shuai Zhang, Jiancheng Lyu, Stanley Osher, Yingyong Qi, Jack Xin
Abstract	Quantized deep neural networks (QDNNs) are attractive due to their much lower memory storage and faster inference speed than their regular full precision counterparts. To maintain the same performance level especially at low bit-widths, QDNNs must be retrained. Their training involves piecewise constant activation functions and discrete weights, hence mathematical challenges arise. We introduce the notion of coarse gradient and propose the blended coarse gradient descent (BCGD) algorithm, for training fully quantized neural networks. Coarse gradient is generally not a gradient of any function but an artificial ascent direction. The weight update of BCGD goes by coarse gradient correction of a weighted average of the full precision weights and their quantization (the so-called blending), which yields sufficient descent in the objective value and thus accelerates the training. Our experiments demonstrate that this simple blending technique is very effective for quantization at extremely low bit-width such as binarization. In full quantization of ResNet-18 for ImageNet classification task, BCGD gives 64.36% top-1 accuracy with binary weights across all layers and 4-bit adaptive activation. If the weights in the first and last layers are kept in full precision, this number increases to 65.46%. As theoretical justification, we show convergence analysis of coarse gradient descent for a two-linear-layer neural network model with Gaussian input data, and prove that the expected coarse gradient correlates positively with the underlying true gradient.
Tasks	Quantization
Published	2018-08-15
URL	http://arxiv.org/abs/1808.05240v4
PDF	http://arxiv.org/pdf/1808.05240v4.pdf
PWC	https://paperswithcode.com/paper/blended-coarse-gradient-descent-for-full
Repo
Framework
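The blended update described in the abstract can be written as w ← (1 − ρ)·w + ρ·Q(w) − η·g(Q(w)), where Q is the quantizer and g is the (coarse) gradient evaluated at the quantized weights. The sketch below is a toy illustration under stated assumptions: a scaled-sign binarizer stands in for Q, and the exact gradient of a small quadratic stands in for the coarse gradient. The names `binarize`, `bcgd_step`, `rho`, and `lr` are hypothetical.

```python
def binarize(w):
    """Scaled sign quantizer: every weight becomes +s or -s, with s = mean |w| (assumed)."""
    scale = sum(abs(x) for x in w) / len(w)
    return [scale if x >= 0 else -scale for x in w]


def bcgd_step(w, grad_at, rho, lr):
    """One blended step: average float weights with their quantization, then correct."""
    wq = binarize(w)
    g = grad_at(wq)  # gradient is evaluated at the quantized weights
    return [(1 - rho) * wi + rho * qi - lr * gi for wi, qi, gi in zip(w, wq, g)]


# Toy objective 0.5 * ||w - target||^2, whose gradient is simply w - target.
target = [1.0, -1.0]


def toy_grad(w):
    return [wi - ti for wi, ti in zip(w, target)]


w = [0.5, -1.5]
for _ in range(20):
    w = bcgd_step(w, toy_grad, rho=0.1, lr=0.5)

print(w, binarize(w))
```

Setting `rho=0` recovers an update that never pulls the float weights toward their quantized values; the blending term is what the abstract credits with sufficient descent.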

Deep Stereo Matching with Explicit Cost Aggregation Sub-Architecture


Title	Deep Stereo Matching with Explicit Cost Aggregation Sub-Architecture
Authors	Lidong Yu, Yucheng Wang, Yuwei Wu, Yunde Jia
Abstract	Deep neural networks have shown excellent performance for stereo matching. Many efforts focus on the feature extraction and similarity measurement of the matching cost computation step while less attention is paid to cost aggregation, which is crucial for stereo matching. In this paper, we present a learning-based cost aggregation method for stereo matching by a novel sub-architecture in the end-to-end trainable pipeline. We reformulate the cost aggregation as a learning process of the generation and selection of cost aggregation proposals which indicate the possible cost aggregation results. The cost aggregation sub-architecture is realized by a two-stream network: one for the generation of cost aggregation proposals, the other for the selection of the proposals. The criterion for the selection is determined by the low-level structure information obtained from a light convolutional network. The two-stream network offers a global view guidance for the cost aggregation to rectify the mismatching value stemming from the limited view of the matching cost computation. The comprehensive experiments on challenging datasets such as KITTI and Scene Flow show that our method outperforms the state-of-the-art methods.
Tasks	Stereo Matching, Stereo Matching Hand
Published	2018-01-12
URL	http://arxiv.org/abs/1801.04065v1
PDF	http://arxiv.org/pdf/1801.04065v1.pdf
PWC	https://paperswithcode.com/paper/deep-stereo-matching-with-explicit-cost
Repo
Framework

Structure Learning of Deep Neural Networks with Q-Learning


Title	Structure Learning of Deep Neural Networks with Q-Learning
Authors	Guoqiang Zhong, Wencong Jiao, Wei Gao
Abstract	Recently, with convolutional neural networks gaining significant achievements in many challenging machine learning fields, hand-crafted neural networks no longer satisfy our requirements because designing a network is costly, and automatically generating architectures has attracted increasing attention. Some research on auto-generated networks has achieved promising results. However, these methods mainly aim at picking a series of single layers such as convolution or pooling layers one by one. There are many elegant and creative designs in the carefully hand-crafted neural networks, such as the Inception block in GoogLeNet, the residual block in residual networks and the dense block in dense convolutional networks. Based on reinforcement learning and taking advantage of the strengths of these networks, we propose a novel automatic process to design a multi-block neural network, whose architecture contains multiple types of the blocks mentioned above, with the purpose of learning the structure of deep neural networks and exploring whether different blocks can be composed together to form a well-behaved neural network. The optimal network is created by a Q-learning agent that is trained to sequentially pick different types of blocks. To verify the validity of our proposed method, we use the auto-generated multi-block neural network to conduct experiments on the image classification benchmark datasets MNIST, SVHN and CIFAR-10 with restricted computational resources. The results demonstrate that our method is very effective, achieving comparable or better performance than hand-crafted networks and advanced auto-generated neural networks.
Tasks	Image Classification, Q-Learning
Published	2018-10-31
URL	http://arxiv.org/abs/1810.13155v1
PDF	http://arxiv.org/pdf/1810.13155v1.pdf
PWC	https://paperswithcode.com/paper/structure-learning-of-deep-neural-networks
Repo
Framework
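The agent described above can be pictured as tabular Q-learning in which the state is the position in the network and the action is the block type to place next. The sketch below shows only the standard Q-learning update and an epsilon-greedy choice; the state encoding, the reward (a made-up validation accuracy), and the names `BLOCKS`, `q_update`, and `choose_block` are assumptions, not the paper's actual setup.

```python
import random

BLOCKS = ["inception", "residual", "dense"]  # block types named in the abstract


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=1.0, terminal=False):
    """Standard tabular update: Q <- Q + alpha * (r + gamma * max_a' Q(s', a') - Q)."""
    best_next = 0.0 if terminal else max(q.get((next_state, a), 0.0) for a in BLOCKS)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)


def choose_block(q, state, epsilon, rng):
    """Epsilon-greedy choice of the next block type."""
    if rng.random() < epsilon:
        return rng.choice(BLOCKS)
    return max(BLOCKS, key=lambda a: q.get((state, a), 0.0))


q_table = {}
rng = random.Random(0)

# One hypothetical episode step: placing a residual block at depth 0 ends the
# episode and the trained network scores 0.8 validation accuracy (made-up number).
q_update(q_table, state=0, action="residual", reward=0.8, next_state=1,
         alpha=0.5, terminal=True)

print(choose_block(q_table, state=0, epsilon=0.0, rng=rng))
```

In a real search, each episode would build a full network, train it, and feed the resulting accuracy back as the reward, which is what makes the procedure expensive.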

Genetic algorithm for optimal distribution in cities


Title	Genetic algorithm for optimal distribution in cities
Authors	Esteban Quintero, Mateo Sanchez, Nicolas Roldan, Mauricio Toro
Abstract	The problem addressed in this project is the routing of electric vehicles, which consists of finding the best routes for this type of vehicle so that they reach their destination without running out of power while keeping transportation costs as low as possible. This problem will matter most to the shipping sector in the near future, when obsolete energy sources are replaced with renewable ones: each vehicle carries a number of packages that must be delivered to specific points in the city but, being electric, has limited battery life, so planning ideal routes is vital for proper operation. Nowadays, applications of this problem can be seen in the cleaning sector, specifically with garbage collection trucks, which aim to cover the entire city in the most efficient way without letting excessive garbage accumulate.
Tasks
Published	2018-11-13
URL	http://arxiv.org/abs/1811.05297v1
PDF	http://arxiv.org/pdf/1811.05297v1.pdf
PWC	https://paperswithcode.com/paper/genetic-algorithm-for-optimal-distribution-in
Repo
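Framework

TRLG: Fragile blind quad watermarking for image tamper detection and recovery by providing compact digests with quality optimized using LWT and GA


Title	TRLG: Fragile blind quad watermarking for image tamper detection and recovery by providing compact digests with quality optimized using LWT and GA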
Authors	Behrouz Bolourian Haghighi, Amir Hossein Taherinia, Amir Hossein Mohajerzadeh
Abstract	In this paper, an efficient fragile blind quad watermarking scheme for image tamper detection and recovery based on lifting wavelet transform and genetic algorithm is proposed. TRLG generates four compact, high-quality digests based on lifting wavelet transform and halftoning technique by distinguishing the types of image blocks. In other words, for each 2×2 non-overlapping block, four chances for recovering destroyed blocks are considered. A special parameter estimation technique based on genetic algorithm is performed to improve and optimize the quality of digests and watermarked image. Furthermore, CCS map is used to determine the mapping block for embedding information, encrypting and confusing the embedded information. In order to improve the recovery rate, Mirror-aside and Partner-block are proposed. The experiments conducted to evaluate the performance of TRLG demonstrate its superiority in terms of the quality of the watermarked and recovered images, tamper localization and security compared with state-of-the-art methods. The results indicate that the PSNR and SSIM of the watermarked image are about 46 dB and approximately one, respectively. Also, the mean PSNR and SSIM of several recovered images that had been about 90% destroyed reach 24 dB and 0.86, respectively.
Tasks
Published	2018-03-07
URL	http://arxiv.org/abs/1803.02623v1
PDF	http://arxiv.org/pdf/1803.02623v1.pdf
PWC	https://paperswithcode.com/paper/trlg-fragile-blind-quad-watermarking-for
Repo
Framework

Communication-Efficient Projection-Free Algorithm for Distributed Optimization


Title	Communication-Efficient Projection-Free Algorithm for Distributed Optimization
Authors	Yan Li, Chao Qu, Huan Xu
Abstract	Distributed optimization has gained a surge of interest in recent years. In this paper we propose a distributed projection-free algorithm named Distributed Conditional Gradient Sliding (DCGS). Compared to the state-of-the-art distributed Frank-Wolfe algorithm, our algorithm attains the same communication complexity under much more realistic assumptions. In contrast to the consensus based algorithm, DCGS is based on the primal-dual algorithm, yielding a modular analysis that can be exploited to improve linear oracle complexity whenever centralized Frank-Wolfe can be improved. We demonstrate this advantage and show that the linear oracle complexity can be reduced to almost the same order of magnitude as the communication complexity, when the feasible set is polyhedral. Finally, we present experimental results on Lasso and matrix completion, demonstrating significant performance improvement compared to the existing distributed Frank-Wolfe algorithm.
Tasks	Distributed Optimization, Matrix Completion
Published	2018-05-20
URL	http://arxiv.org/abs/1805.07841v1
PDF	http://arxiv.org/pdf/1805.07841v1.pdf
PWC	https://paperswithcode.com/paper/communication-efficient-projection-free
Repo
Framework
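"Projection-free" refers to methods in the Frank-Wolfe family, which replace projection onto the feasible set with a linear minimization oracle. The sketch below is the classic centralized Frank-Wolfe method on an l1 ball (the constraint set of Lasso in its constrained form), not the distributed DCGS algorithm from the paper. The names `frank_wolfe_l1`, `objective`, and `gradient`, and the toy problem, are assumptions.

```python
def frank_wolfe_l1(grad, dim, radius, iters):
    """Classic Frank-Wolfe over the l1 ball {x : ||x||_1 <= radius}."""
    x = [0.0] * dim
    for k in range(iters):
        g = grad(x)
        # Linear oracle: the minimizer of <g, s> over the l1 ball is a signed vertex.
        i = max(range(dim), key=lambda j: abs(g[j]))
        s = [0.0] * dim
        s[i] = -radius if g[i] > 0 else radius
        gamma = 2.0 / (k + 2)  # standard open-loop step size
        # Convex combination keeps the iterate feasible without any projection.
        x = [(1 - gamma) * xj + gamma * sj for xj, sj in zip(x, s)]
    return x


# Toy problem: minimize 0.5 * ||x - b||^2 subject to ||x||_1 <= 1.
b = [0.3, -0.2]


def objective(x):
    return 0.5 * sum((xi - bi) ** 2 for xi, bi in zip(x, b))


def gradient(x):
    return [xi - bi for xi, bi in zip(x, b)]


solution = frank_wolfe_l1(gradient, dim=2, radius=1.0, iters=500)
print(solution, objective(solution))
```

Each iteration calls the linear oracle once, which is why "linear oracle complexity" is the cost measure the abstract compares against communication complexity.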

Towards a more efficient use of process and product traceability data for continuous improvement of industrial performances


Title	Towards a more efficient use of process and product traceability data for continuous improvement of industrial performances
Authors	Thierno Diallo, Sébastien Henry, Yacine Ouzrout
Abstract	Nowadays all industrial sectors are increasingly faced with the explosion in the amount of data. Therefore, it raises the question of the efficient use of this large amount of data. In this research work, we are concerned with process and product traceability data. In some sectors (e.g. pharmaceutical and agro-food), the collection and storage of these data are required. Beyond this constraint (regulatory and/or contractual), we are interested in the use of these data for continuous improvements of industrial performances. Two research axes were identified: product recall and responsiveness towards production hazards. For the first axis, a procedure for product recall exploiting traceability data will be proposed. The development of detection and prognosis functions combining process and product data is envisaged for the second axis.
Tasks
Published	2018-10-31
URL	http://arxiv.org/abs/1810.13141v1
PDF	http://arxiv.org/pdf/1810.13141v1.pdf
PWC	https://paperswithcode.com/paper/towards-a-more-efficient-use-of-process-and
Repo
Framework

Improving on Q & A Recurrent Neural Networks Using Noun-Tagging


Title	Improving on Q & A Recurrent Neural Networks Using Noun-Tagging
Authors	Erik Partridge, Jack Sklar, Omar El-lakany
Abstract	Often, more time is spent on finding a model that works well, rather than tuning the model and working directly with the dataset. Our research began as an attempt to improve upon a simple Recurrent Neural Network for answering “simple” first-order questions (QA-RNN), developed by Ferhan Ture and Oliver Jojic, from Comcast Labs, using the SimpleQuestions dataset. Their baseline models, a bidirectional, 2-layer LSTM RNN and a GRU RNN, have accuracies of 0.94 and 0.90, for entity detection and relation prediction, respectively. We fine-tuned these models by doing substantial hyper-parameter tuning, getting resulting accuracies of 0.70 and 0.80, for entity detection and relation prediction, respectively. An accuracy of 0.984 was obtained on entity detection using a 1-layer LSTM, where preprocessing was done by removing all words not part of a noun chunk from the question. 100% of the dataset was available for relation prediction, but only 20% of the dataset was available for entity detection, which we believe to be much of the reason for our initial difficulties in replicating their result, despite the fact we were able to improve on their entity detection results.
Tasks
Published	2018-07-12
URL	http://arxiv.org/abs/1807.04778v1
PDF	http://arxiv.org/pdf/1807.04778v1.pdf
PWC	https://paperswithcode.com/paper/improving-on-q-a-recurrent-neural-networks
Repo
Framework

Top-Down Feedback for Crowd Counting Convolutional Neural Network


Title	Top-Down Feedback for Crowd Counting Convolutional Neural Network
Authors	Deepak Babu Sam, R. Venkatesh Babu
Abstract	Counting people in dense crowds is a demanding task even for humans. This is primarily due to the large variability in appearance of people. Often people are only seen as a bunch of blobs. Occlusions, pose variations and background clutter further compound the difficulty. In this scenario, identifying a person requires larger spatial context and semantics of the scene. But the current state-of-the-art CNN regressors for crowd counting are feedforward and use only limited spatial context to detect people. They look for local crowd patterns to regress the crowd density map, resulting in false predictions. Hence, we propose top-down feedback to correct the initial prediction of the CNN. Our architecture consists of a bottom-up CNN along with a separate top-down CNN to generate feedback. The bottom-up network, which regresses the crowd density map, has two columns of CNN with different receptive fields. Features from various layers of the bottom-up CNN are fed to the top-down network. The feedback, thus generated, is applied on the lower layers of the bottom-up network in the form of multiplicative gating. This masking weighs activations of the bottom-up network at spatial as well as feature levels to correct the density prediction. We evaluate the performance of our model on all major crowd datasets and show the effectiveness of top-down feedback.
Tasks	Crowd Counting
Published	2018-07-24
URL	http://arxiv.org/abs/1807.08881v2
PDF	http://arxiv.org/pdf/1807.08881v2.pdf
PWC	https://paperswithcode.com/paper/top-down-feedback-for-crowd-counting
Repo
Framework

Deep Learning Inference on Embedded Devices: Fixed-Point vs Posit


Title	Deep Learning Inference on Embedded Devices: Fixed-Point vs Posit
Authors	Seyed H. F. Langroudi, Tej Pandit, Dhireesha Kudithipudi
Abstract	Performing the inference step of deep learning in resource constrained environments, such as embedded devices, is challenging. Success requires optimization at both software and hardware levels. Low precision arithmetic and specifically low precision fixed-point number systems have become the standard for performing deep learning inference. However, representing non-uniform data and distributed parameters (e.g. weights) by using uniformly distributed fixed-point values is still a major drawback when using this number system. Recently, the posit number system was proposed, which represents numbers in a non-uniform manner. Therefore, in this paper we are motivated to explore using the posit number system to represent the weights of Deep Convolutional Neural Networks. However, we do not apply any quantization techniques and hence the network weights do not require re-training. The results of this exploration show that using the posit number system outperformed the fixed point number system in terms of accuracy and memory utilization.
Tasks	Quantization
Published	2018-05-22
URL	http://arxiv.org/abs/1805.08624v1
PDF	http://arxiv.org/pdf/1805.08624v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-inference-on-embedded-devices
Repo
Framework
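The drawback attributed to fixed-point numbers, uniformly spaced values representing non-uniformly distributed weights, can be seen in a few lines. The sketch below rounds floats to a signed fixed-point format with 8 total bits and 6 fractional bits; the format choice and the name `to_fixed_point` are assumptions, and the posit format is not implemented here.

```python
import math


def to_fixed_point(x, total_bits=8, frac_bits=6):
    """Round x to the nearest value of a signed fixed-point format, saturating at the ends."""
    scale = 1 << frac_bits                  # representable values are multiples of 1/scale
    lo = -(1 << (total_bits - 1))           # smallest integer code
    hi = (1 << (total_bits - 1)) - 1        # largest integer code
    code = math.floor(x * scale + 0.5)      # round half up
    code = max(lo, min(hi, code))           # saturate instead of overflowing
    return code / scale


# The step is 1/64 everywhere, so small weights lose far more relative precision.
for weight in [0.01, 0.3, 1.5, 5.0]:
    stored = to_fixed_point(weight)
    print(weight, stored, abs(stored - weight) / weight)
```

A weight of 0.01 is stored with more than 50% relative error while 1.5 is stored exactly, and 5.0 saturates; a number system with denser values near zero would spend its bits where typical weights concentrate.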

Physics-Informed Generative Adversarial Networks for Stochastic Differential Equations


Title	Physics-Informed Generative Adversarial Networks for Stochastic Differential Equations
Authors	Liu Yang, Dongkun Zhang, George Em Karniadakis
Abstract	We developed a new class of physics-informed generative adversarial networks (PI-GANs) to solve in a unified manner forward, inverse and mixed stochastic problems based on a limited number of scattered measurements. Unlike standard GANs relying only on data for training, here we encoded into the architecture of GANs the governing physical laws in the form of stochastic differential equations (SDEs) using automatic differentiation. In particular, we applied Wasserstein GANs with gradient penalty (WGAN-GP) for its enhanced stability compared to vanilla GANs. We first tested WGAN-GP in approximating Gaussian processes of different correlation lengths based on data realizations collected from simultaneous reads at sparsely placed sensors. We obtained good approximation of the generated stochastic processes to the target ones even for a mismatch between the input noise dimensionality and the effective dimensionality of the target stochastic processes. We also studied the overfitting issue for both the discriminator and generator, and we found that overfitting occurs also in the generator in addition to the discriminator as previously reported. Subsequently, we considered the solution of elliptic SDEs requiring approximations of three stochastic processes, namely the solution, the forcing, and the diffusion coefficient. We used three generators for the PI-GANs, two of them were feed forward deep neural networks (DNNs) while the other one was the neural network induced by the SDE. Depending on the data, we employed one or multiple feed forward DNNs as the discriminators in PI-GANs. Here, we have demonstrated the accuracy and effectiveness of PI-GANs in solving SDEs for up to 30 dimensions, but in principle, PI-GANs could tackle very high dimensional problems given more sensor data with low-polynomial growth in computational cost.
Tasks	Gaussian Processes
Published	2018-11-05
URL	http://arxiv.org/abs/1811.02033v1
PDF	http://arxiv.org/pdf/1811.02033v1.pdf
PWC	https://paperswithcode.com/paper/physics-informed-generative-adversarial
Repo
Framework