Paper Group ANR 172
Artificial Retina Using A Hybrid Neural Network With Spatial Transform Capability
Title | Artificial Retina Using A Hybrid Neural Network With Spatial Transform Capability |
Authors | Richard Wood, Alexander McGlashan, C. B. Moon, W. Y. Kim |
Abstract | This paper covers the design and programming of a hybrid (digital/analog) neural network to function as an artificial retina with the ability to perform a spatial discrete cosine transform. We describe the structure of the circuit, which uses an analog cell that is interlinked using a programmable digital array. The paper is broken into three main parts. First, we present the results of a Matlab simulation. Then we show the circuit simulation in Spice. This is followed by a demonstration of the practical device. The system's components are intentionally separated, with the specialty analog circuits kept apart from the readily available digital field programmable gate array (FPGA) components. Further development includes the use of rapidly manufacturable organic electronics for the analog components. The planned uses for this platform include crowd development of software that uses the underlying pulse-based processing. The development package will include simulators in the form of Matlab and Spice type software platforms. |
Tasks | |
Published | 2018-11-26 |
URL | http://arxiv.org/abs/1811.10126v1 |
http://arxiv.org/pdf/1811.10126v1.pdf | |
PWC | https://paperswithcode.com/paper/artificial-retina-using-a-hybrid-neural |
Repo | |
Framework | |
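The abstract above centers on a spatial discrete cosine transform. As a point of reference only, the following is a minimal software sketch of an orthonormal 2D DCT-II, computed separably over rows and then columns. It is not the authors' analog/FPGA circuit or their Matlab simulation; the function names `dct_1d` and `dct_2d` and the orthonormal scaling are assumptions chosen for illustration.

```python
import math


def dct_1d(signal):
    """Orthonormal DCT-II of a 1D sequence (illustrative helper, not from the paper)."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        total = sum(
            x * math.cos(math.pi * (i + 0.5) * k / n)
            for i, x in enumerate(signal)
        )
        # Orthonormal scaling: the DC term gets sqrt(1/n), the others sqrt(2/n).
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * total)
    return coeffs


def dct_2d(image):
    """Separable 2D DCT-II: transform every row, then every column."""
    row_pass = [dct_1d(row) for row in image]
    col_pass = [dct_1d(list(col)) for col in zip(*row_pass)]
    # col_pass is indexed [column][vertical frequency]; transpose back.
    return [list(row) for row in zip(*col_pass)]


# A constant 4x4 "image" has all of its energy in the DC coefficient.
flat_image = [[1.0] * 4 for _ in range(4)]
coeffs = dct_2d(flat_image)
```

For a constant input, only `coeffs[0][0]` is non-zero, which is the usual sanity check for a DCT implementation.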
A Hybrid Variational Autoencoder for Collaborative Filtering
Title | A Hybrid Variational Autoencoder for Collaborative Filtering |
Authors | Kilol Gupta, Mukund Yelahanka Raghuprasad, Pankhuri Kumar |
Abstract | In today’s day and age when almost every industry has an online presence with users interacting in online marketplaces, personalized recommendations have become quite important. Traditionally, the problem of collaborative filtering has been tackled using Matrix Factorization which is linear in nature. We extend the work of [11] on using variational autoencoders (VAEs) for collaborative filtering with implicit feedback by proposing a hybrid, multi-modal approach. Our approach combines movie embeddings (learned from a sibling VAE network) with user ratings from the Movielens 20M dataset and applies it to the task of movie recommendation. We empirically show how the VAE network is empowered by incorporating movie embeddings. We also visualize movie and user embeddings by clustering their latent representations obtained from a VAE. |
Tasks | |
Published | 2018-07-14 |
URL | http://arxiv.org/abs/1808.01006v2 |
http://arxiv.org/pdf/1808.01006v2.pdf | |
PWC | https://paperswithcode.com/paper/a-hybrid-variational-autoencoder-for |
Repo | |
Framework | |
Sample-Efficient Policy Learning based on Completely Behavior Cloning
Title | Sample-Efficient Policy Learning based on Completely Behavior Cloning |
Authors | Qiming Zou, Ling Wang, Ke Lu, Yu Li |
Abstract | Direct policy search is one of the most important algorithms in reinforcement learning. However, learning from scratch requires a large amount of experience data and is prone to poor local optima. In addition, a partially trained policy tends to take actions that are dangerous to the agent and the environment. To overcome these challenges, this paper proposes a policy initialization algorithm called Policy Learning based on Completely Behavior Cloning (PLCBC). PLCBC first transforms the Model Predictive Control (MPC) controller into a piecewise affine (PWA) function using multi-parametric programming, and uses a neural network to express this function. In this way, PLCBC can completely clone the MPC controller without any performance loss, and is totally training-free. The experiments show that this initialization strategy helps the agent learn in the high-reward state region and converge faster and better. |
Tasks | |
Published | 2018-11-09 |
URL | http://arxiv.org/abs/1811.03853v1 |
http://arxiv.org/pdf/1811.03853v1.pdf | |
PWC | https://paperswithcode.com/paper/sample-efficient-policy-learning-based-on |
Repo | |
Framework | |
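The abstract above relies on the fact that a piecewise affine (PWA) function can be written exactly as a ReLU network, with no training. The sketch below illustrates that idea on a toy case: a saturated linear feedback law, which is a three-piece PWA function. The controller, its gain, and its limits are hypothetical, and the multi-parametric programming step that PLCBC uses to obtain the PWA form from an MPC controller is not shown.

```python
def relu(z):
    """Rectified linear unit."""
    return max(0.0, z)


def pwa_controller(x, gain=-2.0, u_min=-1.0, u_max=1.0):
    """Toy PWA control law (hypothetical): linear feedback with saturation."""
    return min(max(gain * x, u_min), u_max)


def relu_network(x, gain=-2.0, u_min=-1.0, u_max=1.0):
    """Two-unit ReLU network whose weights are set by hand, not learned.

    Uses the identity clip(z, a, b) = a + relu(z - a) - relu(z - b),
    which holds whenever a <= b.
    """
    z = gain * x
    return u_min + relu(z - u_min) - relu(z - u_max)


# Compare both functions on a grid of states in [-2, 2].
states = [i / 10.0 for i in range(-20, 21)]
max_gap = max(abs(pwa_controller(x) - relu_network(x)) for x in states)
```

Because the weights are read off from the breakpoints of the PWA function, `max_gap` is zero up to floating-point rounding, which mirrors the "no performance loss, training-free" claim on this toy case.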
Flow-Grounded Spatial-Temporal Video Prediction from Still Images
Title | Flow-Grounded Spatial-Temporal Video Prediction from Still Images |
Authors | Yijun Li, Chen Fang, Jimei Yang, Zhaowen Wang, Xin Lu, Ming-Hsuan Yang |
Abstract | Existing video prediction methods mainly rely on observing multiple historical frames or focus on predicting the next one-frame. In this work, we study the problem of generating consecutive multiple future frames by observing one single still image only. We formulate the multi-frame prediction task as a multiple time step flow (multi-flow) prediction phase followed by a flow-to-frame synthesis phase. The multi-flow prediction is modeled in a variational probabilistic manner with spatial-temporal relationships learned through 3D convolutions. The flow-to-frame synthesis is modeled as a generative process in order to keep the predicted results lying closer to the manifold shape of real video sequence. Such a two-phase design prevents the model from directly looking at the high-dimensional pixel space of the frame sequence and is demonstrated to be more effective in predicting better and diverse results. Extensive experimental results on videos with different types of motion show that the proposed algorithm performs favorably against existing methods in terms of quality, diversity and human perceptual evaluation. |
Tasks | Video Prediction |
Published | 2018-07-25 |
URL | http://arxiv.org/abs/1807.09755v2 |
http://arxiv.org/pdf/1807.09755v2.pdf | |
PWC | https://paperswithcode.com/paper/flow-grounded-spatial-temporal-video |
Repo | |
Framework | |
Video Prediction with Appearance and Motion Conditions
Title | Video Prediction with Appearance and Motion Conditions |
Authors | Yunseok Jang, Gunhee Kim, Yale Song |
Abstract | Video prediction aims to generate realistic future frames by learning dynamic visual patterns. One fundamental challenge is to deal with future uncertainty: How should a model behave when there are multiple correct, equally probable futures? We propose an Appearance-Motion Conditional GAN to address this challenge. We provide appearance and motion information as conditions that specify what the future may look like, reducing the level of uncertainty. Our model consists of a generator, two discriminators taking charge of appearance and motion pathways, and a perceptual ranking module that encourages videos of similar conditions to look similar. To train our model, we develop a novel conditioning scheme that consists of different combinations of appearance and motion conditions. We evaluate our model using facial expression and human action datasets and report favorable results compared to existing methods. |
Tasks | Video Prediction |
Published | 2018-07-07 |
URL | http://arxiv.org/abs/1807.02635v1 |
http://arxiv.org/pdf/1807.02635v1.pdf | |
PWC | https://paperswithcode.com/paper/video-prediction-with-appearance-and-motion |
Repo | |
Framework | |
Robust 3D Human Motion Reconstruction Via Dynamic Template Construction
Title | Robust 3D Human Motion Reconstruction Via Dynamic Template Construction |
Authors | Zhong Li, Yu Ji, Wei Yang, Jinwei Ye, Jingyi Yu |
Abstract | In multi-view human body capture systems, the recovered 3D geometry or even the acquired imagery data can be heavily corrupted due to occlusions, noise, limited field of view, etc. Direct estimation of 3D pose, body shape or motion on these low-quality data has been traditionally challenging. In this paper, we present a graph-based non-rigid shape registration framework that can simultaneously recover 3D human body geometry and estimate pose/motion at high fidelity. Our approach first generates a global full-body template by registering all poses in the acquired motion sequence. We then construct a deformable graph by utilizing the rigid components in the global template. We directly warp the global template graph back to each motion frame in order to fill in missing geometry. Specifically, we combine local rigidity and temporal coherence constraints to maintain geometry and motion consistencies. Comprehensive experiments on various scenes show that our method is accurate and robust even in the presence of drastic motions. |
Tasks | |
Published | 2018-01-31 |
URL | http://arxiv.org/abs/1801.10434v1 |
http://arxiv.org/pdf/1801.10434v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-3d-human-motion-reconstruction-via |
Repo | |
Framework | |
Uncorrelated Feature Encoding for Faster Image Style Transfer
Title | Uncorrelated Feature Encoding for Faster Image Style Transfer |
Authors | Minseong Kim, Jongju Shin, Myung-Cheol Roh, Hyun-Chul Choi |
Abstract | Recent fast style transfer methods use a pre-trained convolutional neural network as a feature encoder and a perceptual loss network. Although the pre-trained network is used to generate responses of receptive fields effective for representing the style and content of an image, it is optimized for image classification rather than for image style transfer. Furthermore, because of its inter-channel correlation, it requires a time-consuming, correlation-aware feature alignment process for image style transfer. In this paper, we propose an end-to-end learning method which optimizes an encoder/decoder network for the purpose of style transfer and relieves the feature alignment process of having to consider inter-channel correlation. We use an uncorrelation loss, i.e., the total correlation coefficient between the responses of different encoder channels, together with style and content losses to train the style transfer network. This trains the encoder network to generate inter-channel uncorrelated features and optimizes it for the task of image style transfer, maintaining the quality of image style with only a lightweight, correlation-unaware feature alignment process. Moreover, our method drastically reduces redundant channels of the encoded feature, resulting in a more compact network structure and faster forward processing. Our method can also be applied to a cascade network scheme for multi-scale style transfer and allows user control of style strength through a content-style trade-off parameter. |
Tasks | Image Classification, Style Transfer |
Published | 2018-07-04 |
URL | http://arxiv.org/abs/1807.01493v1 |
http://arxiv.org/pdf/1807.01493v1.pdf | |
PWC | https://paperswithcode.com/paper/uncorrelated-feature-encoding-for-faster |
Repo | |
Framework | |
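The abstract above describes the uncorrelation loss as the total correlation coefficient between the responses of different encoder channels. One plausible reading, sketched below, sums the absolute Pearson correlation over all channel pairs. Whether the paper uses absolute values, squares, or a different normalization is not stated in the abstract, so the aggregation here is an assumption, as are the function names `pearson` and `uncorrelation_loss` and the representation of each channel as a flat list of responses.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length response vectors."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        # A constant channel has no defined correlation; treat it as 0 here.
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def uncorrelation_loss(channels):
    """Sum of |correlation| over all distinct channel pairs (assumed form)."""
    total = 0.0
    for i in range(len(channels)):
        for j in range(i + 1, len(channels)):
            total += abs(pearson(channels[i], channels[j]))
    return total


# Two redundant channels are penalized; two uncorrelated ones are not.
redundant = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]]
decorrelated = [[1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0]]
loss_redundant = uncorrelation_loss(redundant)
loss_decorrelated = uncorrelation_loss(decorrelated)
```

In a training setting this term would be added to the style and content losses; driving it toward zero is what would let the feature alignment step ignore inter-channel correlation.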
MVOR: A Multi-view RGB-D Operating Room Dataset for 2D and 3D Human Pose Estimation
Title | MVOR: A Multi-view RGB-D Operating Room Dataset for 2D and 3D Human Pose Estimation |
Authors | Vinkle Srivastav, Thibaut Issenhuth, Abdolrahim Kadkhodamohammadi, Michel de Mathelin, Afshin Gangi, Nicolas Padoy |
Abstract | Person detection and pose estimation is a key requirement to develop intelligent context-aware assistance systems. To foster the development of human pose estimation methods and their applications in the Operating Room (OR), we release the Multi-View Operating Room (MVOR) dataset, the first public dataset recorded during real clinical interventions. It consists of 732 synchronized multi-view frames recorded by three RGB-D cameras in a hybrid OR. It also includes the visual challenges present in such environments, such as occlusions and clutter. We provide camera calibration parameters, color and depth frames, human bounding boxes, and 2D/3D pose annotations. In this paper, we present the dataset, its annotations, as well as baseline results from several recent person detection and 2D/3D pose estimation methods. Since we need to blur some parts of the images to hide identity and nudity in the released dataset, we also present a comparative study of how the baselines have been impacted by the blurring. Results show a large margin for improvement and suggest that the MVOR dataset can be useful to compare the performance of the different methods. |
Tasks | 3D Human Pose Estimation, 3D Pose Estimation, Calibration, Human Detection, Pose Estimation |
Published | 2018-08-24 |
URL | https://arxiv.org/abs/1808.08180v2 |
https://arxiv.org/pdf/1808.08180v2.pdf | |
PWC | https://paperswithcode.com/paper/mvor-a-multi-view-rgb-d-operating-room |
Repo | |
Framework | |
Data-Driven Methods for Solving Algebra Word Problems
Title | Data-Driven Methods for Solving Algebra Word Problems |
Authors | Benjamin Robaidek, Rik Koncel-Kedziorski, Hannaneh Hajishirzi |
Abstract | We explore contemporary, data-driven techniques for solving math word problems over recent large-scale datasets. We show that well-tuned neural equation classifiers can outperform more sophisticated models such as sequence to sequence and self-attention across these datasets. Our error analysis indicates that, while fully data driven models show some promise, semantic and world knowledge is necessary for further advances. |
Tasks | |
Published | 2018-04-28 |
URL | http://arxiv.org/abs/1804.10718v1 |
http://arxiv.org/pdf/1804.10718v1.pdf | |
PWC | https://paperswithcode.com/paper/data-driven-methods-for-solving-algebra-word |
Repo | |
Framework | |
Neuromodulated Learning in Deep Neural Networks
Title | Neuromodulated Learning in Deep Neural Networks |
Authors | Dennis G Wilson, Sylvain Cussat-Blanc, Hervé Luga, Kyle Harrington |
Abstract | In the brain, learning signals change over time and synaptic location, and are applied based on the learning history at the synapse, in the complex process of neuromodulation. Learning in artificial neural networks, on the other hand, is shaped by hyper-parameters set before learning starts, which remain static throughout learning, and which are uniform for the entire network. In this work, we propose a method of deep artificial neuromodulation which applies the concepts of biological neuromodulation to stochastic gradient descent. Evolved neuromodulatory dynamics modify learning parameters at each layer in a deep neural network over the course of the network’s training. We show that the same neuromodulatory dynamics can be applied to different models and can scale to new problems not encountered during evolution. Finally, we examine the evolved neuromodulation, showing that evolution found dynamic, location-specific learning strategies. |
Tasks | |
Published | 2018-12-05 |
URL | http://arxiv.org/abs/1812.03365v1 |
http://arxiv.org/pdf/1812.03365v1.pdf | |
PWC | https://paperswithcode.com/paper/neuromodulated-learning-in-deep-neural |
Repo | |
Framework | |
Cost-Sensitive Learning for Predictive Maintenance
Title | Cost-Sensitive Learning for Predictive Maintenance |
Authors | Stephan Spiegel, Fabian Mueller, Dorothea Weismann, John Bird |
Abstract | In predictive maintenance, model performance is usually assessed by means of precision, recall, and F1-score. However, employing the model with the best performance, e.g. the highest F1-score, does not necessarily result in minimum maintenance cost, but can instead lead to additional expenses. Thus, we propose to perform model selection based on the economic costs associated with the particular maintenance application. We show that cost-sensitive learning for predictive maintenance can result in significant cost reduction and fault-tolerant policies, since it allows various business constraints and requirements to be incorporated. |
Tasks | Model Selection |
Published | 2018-09-28 |
URL | http://arxiv.org/abs/1809.10979v1 |
http://arxiv.org/pdf/1809.10979v1.pdf | |
PWC | https://paperswithcode.com/paper/cost-sensitive-learning-for-predictive |
Repo | |
Framework | |
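The abstract above argues that the model with the highest F1-score is not necessarily the cheapest one to operate. The sketch below makes that concrete with two hypothetical models. All confusion counts and cost figures are invented for illustration and do not come from the paper; the cost model (a fixed price per false alarm and per missed failure) is likewise an assumption.

```python
def f1_score(tp, fp, fn):
    """F1-score computed from confusion counts."""
    return 2 * tp / (2 * tp + fp + fn)


def maintenance_cost(fp, fn, cost_false_alarm, cost_missed_failure):
    """Total cost under a simple per-error cost model (assumed)."""
    return fp * cost_false_alarm + fn * cost_missed_failure


# Hypothetical confusion counts for two candidate models.
models = {
    "A": {"tp": 80, "fp": 10, "fn": 20},  # conservative: few false alarms
    "B": {"tp": 95, "fp": 60, "fn": 5},   # aggressive: few missed failures
}

# Hypothetical costs: a missed failure is far more expensive than a false alarm.
COST_FALSE_ALARM = 5.0
COST_MISSED_FAILURE = 100.0

best_by_f1 = max(models, key=lambda name: f1_score(**models[name]))
best_by_cost = min(
    models,
    key=lambda name: maintenance_cost(
        models[name]["fp"],
        models[name]["fn"],
        COST_FALSE_ALARM,
        COST_MISSED_FAILURE,
    ),
)
```

With these numbers, model A wins on F1-score while model B has the lower total cost, so the two selection criteria disagree.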
A study on speech enhancement using exponent-only floating point quantized neural network (EOFP-QNN)
Title | A study on speech enhancement using exponent-only floating point quantized neural network (EOFP-QNN) |
Authors | Yi-Te Hsu, Yu-Chen Lin, Szu-Wei Fu, Yu Tsao, Tei-Wei Kuo |
Abstract | Numerous studies have investigated the effectiveness of neural network quantization on pattern classification tasks. The present study, for the first time, investigated the performance of speech enhancement (a regression task in speech processing) using a novel exponent-only floating-point quantized neural network (EOFP-QNN). The proposed EOFP-QNN consists of two stages: mantissa-quantization and exponent-quantization. In the mantissa-quantization stage, EOFP-QNN learns how to quantize the mantissa bits of the model parameters while preserving the regression accuracy using the least mantissa precision. In the exponent-quantization stage, the exponent part of the parameters is further quantized without causing any additional performance degradation. We evaluated the proposed EOFP quantization technique on two types of neural networks, namely, bidirectional long short-term memory (BLSTM) and fully convolutional neural network (FCN), on a speech enhancement task. Experimental results showed that the model sizes can be significantly reduced (the model sizes of the quantized BLSTM and FCN models were only 18.75% and 21.89%, respectively, compared to those of the original models) while maintaining satisfactory speech-enhancement performance. |
Tasks | Quantization, Speech Enhancement |
Published | 2018-08-17 |
URL | http://arxiv.org/abs/1808.06474v4 |
http://arxiv.org/pdf/1808.06474v4.pdf | |
PWC | https://paperswithcode.com/paper/a-study-on-speech-enhancement-using-exponent |
Repo | |
Framework | |
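The abstract above describes quantizing the mantissa bits of model parameters down to the least precision that preserves accuracy. The sketch below shows only the basic bit-level operation such a scheme builds on: zeroing the low-order mantissa bits of an IEEE 754 float32 value. It uses plain truncation rather than the learned quantization the paper describes, it does not cover the exponent-quantization stage, and the function name `truncate_mantissa` is hypothetical.

```python
import struct

FLOAT32_MANTISSA_BITS = 23


def truncate_mantissa(value, kept_bits):
    """Keep the sign, the exponent, and the top `kept_bits` mantissa bits.

    With kept_bits=0 the result is a signed power of two, i.e. only the
    sign and exponent of the float32 representation survive.
    """
    if not 0 <= kept_bits <= FLOAT32_MANTISSA_BITS:
        raise ValueError("kept_bits must be between 0 and 23")
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    # Clear the low (23 - kept_bits) mantissa bits.
    mask = (0xFFFFFFFF << (FLOAT32_MANTISSA_BITS - kept_bits)) & 0xFFFFFFFF
    (truncated,) = struct.unpack("<f", struct.pack("<I", bits & mask))
    return truncated


weights = [0.75, -3.0, 1.0, 0.1]
exponent_only = [truncate_mantissa(w, 0) for w in weights]
```

For example, 0.75 is stored as 1.5 × 2⁻¹, so dropping the whole mantissa leaves 0.5. A parameter stored this way needs only its sign and exponent bits, which is where the storage saving would come from.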
Relative Importance Sampling For Off-Policy Actor-Critic in Deep Reinforcement Learning
Title | Relative Importance Sampling For Off-Policy Actor-Critic in Deep Reinforcement Learning |
Authors | Mahammad Humayoo, Xueqi Cheng |
Abstract | Off-policy learning is more unstable compared to on-policy learning in reinforcement learning (RL). One reason for the instability of off-policy learning is a discrepancy between the target ($\pi$) and behavior (b) policy distributions. The discrepancy between the $\pi$ and b distributions can be alleviated by employing a smooth variant of importance sampling (IS), such as relative importance sampling (RIS). RIS has a parameter $\beta\in[0, 1]$ which controls smoothness. To cope with instability, we present the first relative importance sampling-off-policy actor-critic (RIS-Off-PAC) model-free algorithms in RL. In our method, the network yields a target policy (the actor) and a value function (the critic) assessing the current policy ($\pi$) using samples drawn from the behavior policy. We use the action value generated from the behavior policy, rather than from the target policy, in the reward function to train our algorithm. We also use deep neural networks to train both actor and critic. We evaluated our algorithm on a number of OpenAI Gym benchmark problems and demonstrate better or comparable performance to several state-of-the-art RL baselines. |
Tasks | |
Published | 2018-10-30 |
URL | https://arxiv.org/abs/1810.12558v6 |
https://arxiv.org/pdf/1810.12558v6.pdf | |
PWC | https://paperswithcode.com/paper/relative-importance-sampling-for-off-policy |
Repo | |
Framework | |
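The abstract above contrasts ordinary importance sampling with a smoothed, relative variant controlled by $\beta$. The sketch below assumes the standard relative density-ratio form, $\pi / (\beta\pi + (1-\beta) b)$; the abstract does not spell out the exact definition used in RIS-Off-PAC, so treat this formula and the function names as assumptions.

```python
def importance_weight(pi_prob, b_prob):
    """Ordinary importance sampling ratio pi(a|s) / b(a|s)."""
    return pi_prob / b_prob


def relative_importance_weight(pi_prob, b_prob, beta):
    """Relative importance weight pi / (beta * pi + (1 - beta) * b).

    Assumed form, following the usual relative density-ratio definition.
    beta = 0 recovers the ordinary ratio; for beta > 0 the weight is
    bounded above by 1 / beta.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return pi_prob / (beta * pi_prob + (1.0 - beta) * b_prob)


# An action that the target policy favors but the behavior policy rarely takes.
pi_prob, b_prob = 0.9, 0.001
plain = importance_weight(pi_prob, b_prob)
smoothed = relative_importance_weight(pi_prob, b_prob, beta=0.5)
```

Here the ordinary ratio is about 900, while the relative weight with `beta=0.5` stays below 2. Large, rarely occurring weights of the first kind are a common source of variance in off-policy updates, which is the instability the smoothing is meant to reduce.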
Learning representations of molecules and materials with atomistic neural networks
Title | Learning representations of molecules and materials with atomistic neural networks |
Authors | Kristof T. Schütt, Alexandre Tkatchenko, Klaus-Robert Müller |
Abstract | Deep Learning has been shown to learn efficient representations for structured data such as image, text or audio. In this chapter, we present neural network architectures that are able to learn efficient representations of molecules and materials. In particular, the continuous-filter convolutional network SchNet accurately predicts chemical properties across compositional and configurational space on a variety of datasets. Beyond that, we analyze the obtained representations to find evidence that their spatial and chemical properties agree with chemical intuition. |
Tasks | |
Published | 2018-12-11 |
URL | http://arxiv.org/abs/1812.04690v1 |
http://arxiv.org/pdf/1812.04690v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-representations-of-molecules-and |
Repo | |
Framework | |
Expert Finding in Heterogeneous Bibliographic Networks with Locally-trained Embeddings
Title | Expert Finding in Heterogeneous Bibliographic Networks with Locally-trained Embeddings |
Authors | Huan Gui, Qi Zhu, Liyuan Liu, Aston Zhang, Jiawei Han |
Abstract | Expert finding is an important task in both industry and academia. It is challenging to rank candidates with appropriate expertise for various queries. In addition, different types of objects interact with one another, which naturally forms heterogeneous information networks. We study the task of expert finding in heterogeneous bibliographical networks based on two aspects: textual content analysis and authority ranking. Regarding the textual content analysis, we propose a new method for query expansion via locally-trained embedding learning with concept hierarchy as guidance, which is particularly tailored for specific queries with narrow semantic meanings. Compared with global embedding learning, locally-trained embedding learning projects the terms into a latent semantic space constrained on relevant topics, therefore it preserves more precise and subtle information for specific queries. Considering the candidate ranking, the heterogeneous information network structure, while being largely ignored in the previous studies of expert finding, provides additional information. Specifically, different types of interactions among objects play different roles. We propose a ranking algorithm to estimate the authority of objects in the network, treating each strongly-typed edge type individually. To demonstrate the effectiveness of the proposed framework, we apply the proposed method to a large-scale bibliographical dataset with over two million entries and one million researcher candidates. The experiment results show that the proposed framework outperforms existing methods for both general and specific queries. |
Tasks | |
Published | 2018-03-09 |
URL | http://arxiv.org/abs/1803.03370v1 |
http://arxiv.org/pdf/1803.03370v1.pdf | |
PWC | https://paperswithcode.com/paper/expert-finding-in-heterogeneous-bibliographic |
Repo | |
Framework | |