Paper Group AWR 168
Autoencoders, Kernels, and Multilayer Perceptrons for Electron Micrograph Restoration and Compression. Learning 3D Shape Completion under Weak Supervision. Denoising of 3-D Magnetic Resonance Images Using a Residual Encoder-Decoder Wasserstein Generative Adversarial Network. A Dataset and Benchmark for Large-scale Multi-modal Face Anti-spoofing. Co …
Autoencoders, Kernels, and Multilayer Perceptrons for Electron Micrograph Restoration and Compression
Title | Autoencoders, Kernels, and Multilayer Perceptrons for Electron Micrograph Restoration and Compression |
Authors | Jeffrey M. Ede |
Abstract | We present 14 autoencoders, 15 kernels and 14 multilayer perceptrons for electron micrograph restoration and compression. These have been trained for transmission electron microscopy (TEM), scanning transmission electron microscopy (STEM) and for both (TEM+STEM). TEM autoencoders have been trained for 1$\times$, 4$\times$, 16$\times$ and 64$\times$ compression, STEM autoencoders for 1$\times$, 4$\times$ and 16$\times$ compression and TEM+STEM autoencoders for 1$\times$, 2$\times$, 4$\times$, 8$\times$, 16$\times$, 32$\times$ and 64$\times$ compression. Kernels and multilayer perceptrons have been trained to approximate the denoising effect of the 4$\times$ compression autoencoders. Kernels for input sizes of 3, 5, 7, 11 and 15 have been fitted for TEM, STEM and TEM+STEM. TEM multilayer perceptrons have been trained with 1 hidden layer for input sizes of 3, 5 and 7 and with 2 hidden layers for input sizes of 5 and 7. STEM multilayer perceptrons have been trained with 1 hidden layer for input sizes of 3, 5 and 7. TEM+STEM multilayer perceptrons have been trained with 1 hidden layer for input sizes of 3, 5, 7 and 11 and with 2 hidden layers for input sizes of 3 and 7. Our code, example usage and pre-trained models are available at https://github.com/Jeffrey-Ede/Denoising-Kernels-MLPs-Autoencoders |
Tasks | Denoising |
Published | 2018-08-29 |
URL | http://arxiv.org/abs/1808.09916v1 |
http://arxiv.org/pdf/1808.09916v1.pdf | |
PWC | https://paperswithcode.com/paper/autoencoders-kernels-and-multilayer |
Repo | https://github.com/Jeffrey-Ede/Denoising-Kernels-MLPs-Autoencoders |
Framework | tf |
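As a rough illustration of how a small multilayer perceptron can approximate the denoising effect of an autoencoder on fixed-size patches, here is a minimal sketch; it is not the authors' released models, and the patch size, layer widths and training target below are assumptions.

```python
import torch
import torch.nn as nn

class PatchDenoisingMLP(nn.Module):
    """Maps a flattened k x k noisy patch to a denoised centre-pixel value.

    A minimal stand-in for the paper's patch-level MLPs; the real models,
    input sizes and training data are in the linked repository.
    """
    def __init__(self, k: int = 5, hidden: int = 64, num_hidden_layers: int = 1):
        super().__init__()
        layers, width = [], k * k
        for _ in range(num_hidden_layers):
            layers += [nn.Linear(width, hidden), nn.ReLU()]
            width = hidden
        layers.append(nn.Linear(width, 1))  # predict the denoised centre value
        self.net = nn.Sequential(*layers)

    def forward(self, patches):             # patches: (N, k*k)
        return self.net(patches)

# Training would regress against the 4x-compression autoencoder's output
# (the "teacher") on the same patches, e.g. with an L1 or L2 loss.
model = PatchDenoisingMLP(k=5, hidden=64, num_hidden_layers=1)
noisy = torch.rand(8, 25)                    # eight random 5x5 patches, flattened
print(model(noisy).shape)                    # torch.Size([8, 1])
```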
Learning 3D Shape Completion under Weak Supervision
Title | Learning 3D Shape Completion under Weak Supervision |
Authors | David Stutz, Andreas Geiger |
Abstract | We address the problem of 3D shape completion from sparse and noisy point clouds, a fundamental problem in computer vision and robotics. Recent approaches are either data-driven or learning-based: Data-driven approaches rely on a shape model whose parameters are optimized to fit the observations; Learning-based approaches, in contrast, avoid the expensive optimization step by learning to directly predict complete shapes from incomplete observations in a fully-supervised setting. However, full supervision is often not available in practice. In this work, we propose a weakly-supervised learning-based approach to 3D shape completion which neither requires slow optimization nor direct supervision. While we also learn a shape prior on synthetic data, we amortize, i.e., learn, maximum likelihood fitting using deep neural networks resulting in efficient shape completion without sacrificing accuracy. On synthetic benchmarks based on ShapeNet and ModelNet as well as on real robotics data from KITTI and Kinect, we demonstrate that the proposed amortized maximum likelihood approach is able to compete with recent fully supervised baselines and outperforms data-driven approaches, while requiring less supervision and being significantly faster. |
Tasks | |
Published | 2018-05-18 |
URL | http://arxiv.org/abs/1805.07290v2 |
http://arxiv.org/pdf/1805.07290v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-3d-shape-completion-under-weak |
Repo | https://github.com/davidstutz/aml-improved-shape-completion |
Framework | pytorch |
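A highly simplified sketch of the amortized maximum-likelihood idea described above: an encoder predicts the latent code of a pre-trained shape prior, and the loss uses only the incomplete observations plus a prior term. Network sizes, voxel resolution and loss weights below are assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn

latent_dim = 32

# Stand-in for a shape prior decoder pre-trained on synthetic data
# (e.g. the decoder of a VAE over ShapeNet voxel grids).
decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                        nn.Linear(256, 32 * 32 * 32), nn.Sigmoid())

# Encoder that amortizes maximum-likelihood fitting: it maps an incomplete
# observation grid directly to a latent code z (no complete shapes needed).
encoder = nn.Sequential(nn.Linear(32 * 32 * 32, 256), nn.ReLU(),
                        nn.Linear(256, latent_dim))

def weakly_supervised_loss(obs_occ, obs_mask, z):
    """Fit the decoded shape only where the sensor observed something
    (obs_mask == 1), plus a Gaussian prior on the latent code."""
    pred = decoder(z).view_as(obs_occ)
    data_term = ((pred - obs_occ) ** 2 * obs_mask).sum() / obs_mask.sum()
    prior_term = 1e-3 * (z ** 2).mean()
    return data_term + prior_term

obs = torch.rand(4, 32, 32, 32)                       # fake occupancy observations
mask = (torch.rand(4, 32, 32, 32) > 0.9).float()      # sparse observed cells
z = encoder(obs.view(4, -1))
loss = weakly_supervised_loss(obs, mask, z)
loss.backward()
```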
Denoising of 3-D Magnetic Resonance Images Using a Residual Encoder-Decoder Wasserstein Generative Adversarial Network
Title | Denoising of 3-D Magnetic Resonance Images Using a Residual Encoder-Decoder Wasserstein Generative Adversarial Network |
Authors | Maosong Ran, Jinrong Hu, Yang Chen, Hu Chen, Huaiqiang Sun, Jiliu Zhou, Yi Zhang |
Abstract | Structure-preserved denoising of 3D magnetic resonance imaging (MRI) images is a critical step in medical image analysis. Over the past few years, many algorithms with impressive performances have been proposed. In this paper, inspired by the idea of deep learning, we introduce an MRI denoising method based on the residual encoder-decoder Wasserstein generative adversarial network (RED-WGAN). Specifically, to explore the structure similarity between neighboring slices, a 3D configuration is utilized as the basic processing unit. Residual autoencoders combined with deconvolution operations are introduced into the generator network. Furthermore, to alleviate the oversmoothing shortcoming of the traditional mean squared error (MSE) loss function, the perceptual similarity, which is implemented by calculating the distances in the feature space extracted by a pretrained VGG-19 network, is incorporated with the MSE and adversarial losses to form the new loss function. Extensive experiments are implemented to assess the performance of the proposed method. The experimental results show that the proposed RED-WGAN achieves performance superior to several state-of-the-art methods in both simulated and real clinical data. In particular, our method demonstrates powerful abilities in both noise suppression and structure preservation. |
Tasks | Denoising |
Published | 2018-08-12 |
URL | https://arxiv.org/abs/1808.03941v2 |
https://arxiv.org/pdf/1808.03941v2.pdf | |
PWC | https://paperswithcode.com/paper/denoising-of-3-d-magnetic-resonance-images |
Repo | https://github.com/Deep-Imaging-Group/RED-WGAN |
Framework | pytorch |
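The composite loss described in the abstract (MSE + VGG-feature perceptual distance + Wasserstein adversarial term) can be sketched roughly as below. The loss weights, the VGG layer cut-off and the slice-wise handling of the 3-D volumes are assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg19

# Feature extractor for the perceptual term; in practice the ImageNet
# pre-trained weights would be loaded instead of weights=None.
vgg_features = vgg19(weights=None).features[:16].eval()
for p in vgg_features.parameters():
    p.requires_grad_(False)

mse = nn.MSELoss()

def generator_loss(denoised, clean, critic, w_mse=1.0, w_perc=0.1, w_adv=1e-3):
    """Illustrative RED-WGAN-style generator loss on 2-D MR slices.

    denoised, clean: (N, 1, H, W) slices; critic: the WGAN critic network.
    """
    # VGG expects 3 channels, so repeat the single MR channel.
    f_den = vgg_features(denoised.repeat(1, 3, 1, 1))
    f_cln = vgg_features(clean.repeat(1, 3, 1, 1))
    perceptual = mse(f_den, f_cln)
    adversarial = -critic(denoised).mean()          # Wasserstein generator term
    return w_mse * mse(denoised, clean) + w_perc * perceptual + w_adv * adversarial

# Tiny placeholder critic, just to make the sketch runnable.
critic = nn.Sequential(nn.Conv2d(1, 8, 3, 2, 1), nn.LeakyReLU(0.2),
                       nn.Flatten(), nn.LazyLinear(1))
loss = generator_loss(torch.rand(2, 1, 64, 64), torch.rand(2, 1, 64, 64), critic)
print(loss.item())
```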
A Dataset and Benchmark for Large-scale Multi-modal Face Anti-spoofing
Title | A Dataset and Benchmark for Large-scale Multi-modal Face Anti-spoofing |
Authors | Shifeng Zhang, Xiaobo Wang, Ajian Liu, Chenxu Zhao, Jun Wan, Sergio Escalera, Hailin Shi, Zezheng Wang, Stan Z. Li |
Abstract | Face anti-spoofing is essential to prevent face recognition systems from security breaches. Much of the recent progress has been driven by the availability of face anti-spoofing benchmark datasets. However, existing face anti-spoofing benchmarks have a limited number of subjects ($\le\negmedspace170$) and modalities ($\leq\negmedspace2$), which hinders the further development of the academic community. To facilitate face anti-spoofing research, we introduce a large-scale multi-modal dataset, namely CASIA-SURF, which is the largest publicly available dataset for face anti-spoofing in terms of both subjects and visual modalities. Specifically, it consists of $1,000$ subjects with $21,000$ videos and each sample has $3$ modalities (i.e., RGB, Depth and IR). We also provide a measurement set, evaluation protocol and training/validation/testing subsets, developing a new benchmark for face anti-spoofing. Moreover, we present a new multi-modal fusion method as a baseline, which performs feature re-weighting to select the more informative channel features while suppressing the less useful ones for each modality. Extensive experiments have been conducted on the proposed dataset to verify its significance and generalization capability. The dataset is available at https://sites.google.com/qq.com/chalearnfacespoofingattackdete |
Tasks | Face Anti-Spoofing, Face Recognition |
Published | 2018-12-02 |
URL | http://arxiv.org/abs/1812.00408v3 |
http://arxiv.org/pdf/1812.00408v3.pdf | |
PWC | https://paperswithcode.com/paper/casia-surf-a-dataset-and-benchmark-for-large |
Repo | https://github.com/SoftwareGift/FeatherNets_Face-Anti-spoofing-Attack-Detection-Challenge-CVPR2019 |
Framework | pytorch |
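A minimal sketch of the squeeze-and-excitation-style feature re-weighting fusion the baseline describes, with separate RGB/depth/IR branches feeding a shared gate. Channel counts and the exact placement of the re-weighting are assumptions.

```python
import torch
import torch.nn as nn

class ReweightingFusion(nn.Module):
    """Concatenate per-modality features and learn channel-wise weights
    that emphasise the more informative modality channels."""
    def __init__(self, channels_per_modality=64, num_modalities=3, reduction=16):
        super().__init__()
        c = channels_per_modality * num_modalities
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(c, c // reduction), nn.ReLU(),
            nn.Linear(c // reduction, c), nn.Sigmoid())

    def forward(self, rgb_feat, depth_feat, ir_feat):
        fused = torch.cat([rgb_feat, depth_feat, ir_feat], dim=1)   # (N, 3*C, H, W)
        weights = self.gate(fused).unsqueeze(-1).unsqueeze(-1)      # (N, 3*C, 1, 1)
        return fused * weights                                      # re-weighted features

fusion = ReweightingFusion()
out = fusion(torch.rand(2, 64, 28, 28), torch.rand(2, 64, 28, 28), torch.rand(2, 64, 28, 28))
print(out.shape)   # torch.Size([2, 192, 28, 28])
```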
Compact and Efficient Encodings for Planning in Factored State and Action Spaces with Learned Binarized Neural Network Transition Models
Title | Compact and Efficient Encodings for Planning in Factored State and Action Spaces with Learned Binarized Neural Network Transition Models |
Authors | Buser Say, Scott Sanner |
Abstract | In this paper, we leverage the efficiency of Binarized Neural Networks (BNNs) to learn complex state transition models of planning domains with discretized factored state and action spaces. In order to directly exploit this transition structure for planning, we present two novel compilations of the learned factored planning problem with BNNs based on reductions to Weighted Partial Maximum Boolean Satisfiability (FD-SAT-Plan+) as well as Binary Linear Programming (FD-BLP-Plan+). Theoretically, we show that our SAT-based Bi-Directional Neuron Activation Encoding is asymptotically the most compact encoding in the literature and maintains the generalized arc-consistency property through unit propagation – an important property that facilitates efficiency in SAT solvers. Experimentally, we validate the computational efficiency of our Bi-Directional Neuron Activation Encoding in comparison to an existing neuron activation encoding and demonstrate the effectiveness of learning complex transition models with BNNs. We test the runtime efficiency of both FD-SAT-Plan+ and FD-BLP-Plan+ on the learned factored planning problem showing that FD-SAT-Plan+ scales better with increasing BNN size and complexity. Finally, we present a finite-time incremental constraint generation algorithm based on generalized landmark constraints to improve the planning accuracy of our encodings through simulated or real-world interaction. |
Tasks | |
Published | 2018-11-26 |
URL | http://arxiv.org/abs/1811.10433v9 |
http://arxiv.org/pdf/1811.10433v9.pdf | |
PWC | https://paperswithcode.com/paper/compact-and-efficient-encodings-for-planning |
Repo | https://github.com/saybuser/FD-SAT-Plan |
Framework | none |
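To make the compilation idea concrete, the sketch below emits big-M binary-linear constraints that tie a binarized neuron's 0/1 activation variable to its binary inputs, in the spirit of the FD-BLP-style compilation. It is a simplified stand-in for illustration only, not the paper's Bi-Directional Neuron Activation Encoding.

```python
def bnn_neuron_constraints(weights, in_vars, out_var):
    """Big-M linear constraints linking a binarized neuron's 0/1 activation
    variable `out_var` to 0/1 input variables `in_vars`.

    The neuron fires (out_var = 1) iff sum_i w_i * (2*x_i - 1) >= 0, i.e. the
    +/-1-weighted sum of +/-1-encoded inputs is non-negative.
    weights: list of +1/-1 integers.  Returns the constraints as strings.
    """
    n = len(weights)
    big_m = n + 1
    # s = sum_i 2*w_i*x_i - sum_i w_i  (an integer in [-n, n])
    lin = " + ".join(f"{2 * w}*{x}" for w, x in zip(weights, in_vars))
    offset = -sum(weights)
    c1 = f"{lin} + {offset} >= -{big_m}*(1 - {out_var})"   # out_var = 1  =>  s >= 0
    c2 = f"{lin} + {offset} <= {big_m}*{out_var} - 1"      # out_var = 0  =>  s <= -1
    return [c1, c2]

print(bnn_neuron_constraints([+1, -1, +1], ["x1", "x2", "x3"], "y"))
```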
Theoretical Linear Convergence of Unfolded ISTA and its Practical Weights and Thresholds
Title | Theoretical Linear Convergence of Unfolded ISTA and its Practical Weights and Thresholds |
Authors | Xiaohan Chen, Jialin Liu, Zhangyang Wang, Wotao Yin |
Abstract | In recent years, unfolding iterative algorithms as neural networks has become an empirical success in solving sparse recovery problems. However, its theoretical understanding is still immature, which prevents us from fully utilizing the power of neural networks. In this work, we study unfolded ISTA (Iterative Shrinkage Thresholding Algorithm) for sparse signal recovery. We introduce a weight structure that is necessary for asymptotic convergence to the true sparse signal. With this structure, unfolded ISTA can attain a linear convergence, which is better than the sublinear convergence of ISTA/FISTA in general cases. Furthermore, we propose to incorporate thresholding in the network to perform support selection, which is easy to implement and able to boost the convergence rate both theoretically and empirically. Extensive simulations, including sparse vector recovery and a compressive sensing experiment on real image data, corroborate our theoretical results and demonstrate their practical usefulness. We have made our codes publicly available: https://github.com/xchen-tamu/linear-lista-cpss. |
Tasks | Compressive Sensing |
Published | 2018-08-29 |
URL | http://arxiv.org/abs/1808.10038v2 |
http://arxiv.org/pdf/1808.10038v2.pdf | |
PWC | https://paperswithcode.com/paper/theoretical-linear-convergence-of-unfolded |
Repo | https://github.com/TAMU-VITA/LISTA-CPSS |
Framework | tf |
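A small NumPy sketch of one unfolded ISTA step with the support-selection trick mentioned above (entries with the largest magnitudes bypass the shrinkage). The untrained ISTA-style weights, step size and threshold here are simplified assumptions, not the paper's learned parameters.

```python
import numpy as np

def soft_threshold(v, theta):
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def lista_ss_step(x, y, A, theta, step=0.1, keep_frac=0.1):
    """One unfolded ISTA step with support selection.

    x: current sparse estimate, y: measurements, A: sensing matrix.
    Entries in the top `keep_frac` fraction by magnitude are exempt from
    shrinkage, which is the 'support selection' idea in the paper.
    """
    r = x + step * A.T @ (y - A @ x)            # gradient step
    k = max(1, int(keep_frac * len(r)))
    keep = np.argsort(np.abs(r))[-k:]           # indices exempt from shrinkage
    out = soft_threshold(r, theta)
    out[keep] = r[keep]
    return out

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 200)) / np.sqrt(50)
x_true = np.zeros(200)
x_true[rng.choice(200, 10, replace=False)] = rng.standard_normal(10)
y = A @ x_true
x = np.zeros(200)
for _ in range(50):
    x = lista_ss_step(x, y, A, theta=0.01)
print(np.linalg.norm(x - x_true) / np.linalg.norm(x_true))   # relative recovery error
```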
Co-occurrence Feature Learning from Skeleton Data for Action Recognition and Detection with Hierarchical Aggregation
Title | Co-occurrence Feature Learning from Skeleton Data for Action Recognition and Detection with Hierarchical Aggregation |
Authors | Chao Li, Qiaoyong Zhong, Di Xie, Shiliang Pu |
Abstract | Skeleton-based human action recognition has recently drawn increasing attention thanks to the availability of large-scale skeleton datasets. The most crucial factors for this task lie in two aspects: the intra-frame representation for joint co-occurrences and the inter-frame representation for skeletons’ temporal evolutions. In this paper we propose an end-to-end convolutional co-occurrence feature learning framework. The co-occurrence features are learned with a hierarchical methodology, in which different levels of contextual information are aggregated gradually. Firstly, point-level information of each joint is encoded independently. Then it is assembled into semantic representations in both spatial and temporal domains. Specifically, we introduce a global spatial aggregation scheme, which is able to learn superior joint co-occurrence features over local aggregation. Besides, raw skeleton coordinates as well as their temporal differences are integrated with a two-stream paradigm. Experiments show that our approach consistently outperforms other state-of-the-art methods on action recognition and detection benchmarks such as NTU RGB+D, SBU Kinect Interaction and PKU-MMD. |
Tasks | RF-based Pose Estimation, Skeleton Based Action Recognition, Temporal Action Localization |
Published | 2018-04-17 |
URL | http://arxiv.org/abs/1804.06055v1 |
http://arxiv.org/pdf/1804.06055v1.pdf | |
PWC | https://paperswithcode.com/paper/co-occurrence-feature-learning-from-skeleton |
Repo | https://github.com/huguyuehuhu/HCN-pytorch |
Framework | pytorch |
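The global spatial aggregation idea, first encoding each joint independently and then moving the joint dimension into the channel axis so that subsequent convolutions mix information across all joints, can be sketched as below. Channel sizes and kernel shapes are illustrative assumptions; a second stream would take frame-to-frame coordinate differences as input.

```python
import torch
import torch.nn as nn

class CooccurrenceBlock(nn.Module):
    def __init__(self, in_channels=3, point_channels=32, out_channels=64, num_joints=25):
        super().__init__()
        # Point-level encoding: a 1x1 conv treats every joint independently.
        self.point = nn.Conv2d(in_channels, point_channels, kernel_size=1)
        # After the transpose, the joint axis becomes the channel axis, so this
        # convolution aggregates over all joints globally (co-occurrence).
        self.global_agg = nn.Conv2d(num_joints, out_channels, kernel_size=(3, 1), padding=(1, 0))

    def forward(self, x):                     # x: (N, 3, T, J) skeleton coordinates
        x = torch.relu(self.point(x))         # (N, C1, T, J)
        x = x.permute(0, 3, 2, 1)             # (N, J, T, C1) -- joints moved into channels
        return torch.relu(self.global_agg(x)) # (N, C2, T, C1)

block = CooccurrenceBlock()
print(block(torch.rand(4, 3, 64, 25)).shape)   # torch.Size([4, 64, 64, 32])
```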
ECG arrhythmia classification using a 2-D convolutional neural network
Title | ECG arrhythmia classification using a 2-D convolutional neural network |
Authors | Tae Joon Jun, Hoang Minh Nguyen, Daeyoun Kang, Dohyeun Kim, Daeyoung Kim, Young-Hak Kim |
Abstract | In this paper, we propose an effective electrocardiogram (ECG) arrhythmia classification method using a deep two-dimensional convolutional neural network (CNN), which has recently shown outstanding performance in the field of pattern recognition. Every ECG beat was transformed into a two-dimensional grayscale image as input data for the CNN classifier. Optimization of the proposed CNN classifier includes various deep learning techniques such as batch normalization, data augmentation, Xavier initialization, and dropout. In addition, we compared our proposed classifier with two well-known CNN models: AlexNet and VGGNet. ECG recordings from the MIT-BIH arrhythmia database were used for the evaluation of the classifier. As a result, our classifier achieved 99.05% average accuracy with 97.85% average sensitivity. To precisely validate our CNN classifier, 10-fold cross-validation was performed in the evaluation, which uses every ECG recording as test data. Our experimental results successfully validate that the proposed CNN classifier with the transformed ECG images can achieve excellent classification accuracy without any manual pre-processing of the ECG signals such as noise filtering, feature extraction, and feature reduction. |
Tasks | Arrhythmia Detection, Data Augmentation, Electrocardiography (ECG) |
Published | 2018-04-18 |
URL | http://arxiv.org/abs/1804.06812v1 |
http://arxiv.org/pdf/1804.06812v1.pdf | |
PWC | https://paperswithcode.com/paper/ecg-arrhythmia-classification-using-a-2-d |
Repo | https://github.com/lorenzobrusco/ECGNeuralNetwork |
Framework | tf |
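A sketch of the beat-to-image transformation described above, rendering one ECG beat as a small grayscale image that a 2-D CNN can consume. The 128x128 resolution and the plain line plot are assumptions about the preprocessing, not the paper's exact pipeline.

```python
import io
import numpy as np
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
from PIL import Image

def beat_to_image(beat, size=128):
    """Render a 1-D ECG beat (array of samples) as a size x size grayscale image in [0, 1]."""
    fig, ax = plt.subplots(figsize=(2, 2), dpi=size // 2)
    ax.plot(beat, color="black", linewidth=1)
    ax.axis("off")
    buf = io.BytesIO()
    fig.savefig(buf, format="png", bbox_inches="tight", pad_inches=0)
    plt.close(fig)
    buf.seek(0)
    img = Image.open(buf).convert("L").resize((size, size))
    return np.asarray(img, dtype=np.float32) / 255.0

# Synthetic stand-in for one segmented beat from the MIT-BIH recordings.
beat = np.sin(np.linspace(0, 2 * np.pi, 260)) + 0.05 * np.random.randn(260)
img = beat_to_image(beat)
print(img.shape, img.min(), img.max())       # (128, 128) values in [0, 1]
```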
Discourse Embellishment Using a Deep Encoder-Decoder Network
Title | Discourse Embellishment Using a Deep Encoder-Decoder Network |
Authors | Leonid Berov, Kai Standvoss |
Abstract | We suggest a new NLG task in the context of the discourse generation pipeline of computational storytelling systems. This task, textual embellishment, is defined by taking a text as input and generating a semantically equivalent output with increased lexical and syntactic complexity. Ideally, this would allow the authors of computational storytellers to implement just lightweight NLG systems and use a domain-independent embellishment module to translate its output into more literary text. We present promising first results on this task using LSTM Encoder-Decoder networks trained on the WikiLarge dataset. Furthermore, we introduce “Compiled Computer Tales”, a corpus of computationally generated stories, that can be used to test the capabilities of embellishment algorithms. |
Tasks | |
Published | 2018-10-18 |
URL | http://arxiv.org/abs/1810.08076v1 |
http://arxiv.org/pdf/1810.08076v1.pdf | |
PWC | https://paperswithcode.com/paper/discourse-embellishment-using-a-deep-encoder |
Repo | https://github.com/cartisan/CompiledComputerTales |
Framework | tf |
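A minimal LSTM encoder-decoder for the embellishment task as described (simple input sentences mapped to more complex targets, e.g. WikiLarge pairs used in reverse). Vocabulary size, hidden size and the teacher-forced training step below are assumptions for illustration.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, vocab_size=10000, emb=128, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.encoder = nn.LSTM(emb, hidden, batch_first=True)
        self.decoder = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, src_ids, tgt_ids):
        _, state = self.encoder(self.embed(src_ids))      # encode the plain sentence
        dec_out, _ = self.decoder(self.embed(tgt_ids), state)
        return self.out(dec_out)                          # logits over the vocabulary

model = Seq2Seq()
src = torch.randint(0, 10000, (2, 12))    # simple input sentences (token ids)
tgt = torch.randint(0, 10000, (2, 15))    # embellished targets
# In practice the decoder inputs would be the targets shifted right (teacher forcing).
logits = model(src, tgt)
loss = nn.CrossEntropyLoss()(logits.reshape(-1, 10000), tgt.reshape(-1))
loss.backward()
```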
Non-local Meets Global: An Integrated Paradigm for Hyperspectral Denoising
Title | Non-local Meets Global: An Integrated Paradigm for Hyperspectral Denoising |
Authors | Wei He, Quanming Yao, Chao Li, Naoto Yokoya, Qibin Zhao |
Abstract | Non-local low-rank tensor approximation has been developed as a state-of-the-art method for hyperspectral image (HSI) denoising. Unfortunately, as the number of spectral bands grows, the running time of these methods increases significantly while their denoising performance benefits little. In this paper, we claim that the HSI lies in a global spectral low-rank subspace, and that the spectral subspace of each full-band patch group should lie in this global low-rank subspace. This motivates us to propose a unified spatial-spectral paradigm for HSI denoising. As the new model is hard to optimize, we further propose an efficient algorithm, motivated by alternating minimization. It first learns a low-dimensional projection and the related reduced image from the noisy HSI. Then, non-local low-rank denoising and iterative regularization are developed to refine the reduced image and the projection, respectively. Finally, experiments on both synthetic and real datasets demonstrate its superiority over other state-of-the-art HSI denoising methods. |
Tasks | Denoising |
Published | 2018-12-11 |
URL | http://arxiv.org/abs/1812.04243v2 |
http://arxiv.org/pdf/1812.04243v2.pdf | |
PWC | https://paperswithcode.com/paper/non-local-meets-global-an-integrated-paradigm |
Repo | https://github.com/quanmingyao/NGMeet |
Framework | none |
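The global spectral low-rank subspace idea can be sketched with a plain SVD: project the noisy HSI onto a few spectral basis vectors, denoise the reduced image, and project back. The Gaussian filter below is only a placeholder for the paper's non-local low-rank denoiser and iterative regularization, and the subspace dimension is an assumption.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def spectral_subspace_denoise(hsi, k=6):
    """hsi: (H, W, B) noisy hyperspectral cube; k: spectral subspace dimension."""
    h, w, b = hsi.shape
    y = hsi.reshape(-1, b)                        # (H*W, B)
    # Orthogonal spectral basis from the top-k right singular vectors.
    _, _, vt = np.linalg.svd(y, full_matrices=False)
    v = vt[:k].T                                  # (B, k)
    reduced = (y @ v).reshape(h, w, k)            # reduced image of k "eigen-bands"
    # Placeholder spatial denoiser applied band-by-band to the reduced image.
    reduced = np.stack([gaussian_filter(reduced[..., i], sigma=1.0) for i in range(k)], axis=-1)
    return (reduced.reshape(-1, k) @ v.T).reshape(h, w, b)

noisy = np.random.rand(32, 32, 60).astype(np.float32)
print(spectral_subspace_denoise(noisy).shape)     # (32, 32, 60)
```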
Fine-grained Entity Typing through Increased Discourse Context and Adaptive Classification Thresholds
Title | Fine-grained Entity Typing through Increased Discourse Context and Adaptive Classification Thresholds |
Authors | Sheng Zhang, Kevin Duh, Benjamin Van Durme |
Abstract | Fine-grained entity typing is the task of assigning fine-grained semantic types to entity mentions. We propose a neural architecture which learns a distributional semantic representation that leverages a greater amount of semantic context – both document and sentence level information – than prior work. We find that additional context improves performance, with further improvements gained by utilizing adaptive classification thresholds. Experiments show that our approach without reliance on hand-crafted features achieves the state-of-the-art results on three benchmark datasets. |
Tasks | Entity Typing |
Published | 2018-04-21 |
URL | http://arxiv.org/abs/1804.08000v1 |
http://arxiv.org/pdf/1804.08000v1.pdf | |
PWC | https://paperswithcode.com/paper/fine-grained-entity-typing-through-increased |
Repo | https://github.com/sheng-z/figet |
Framework | pytorch |
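The adaptive classification thresholds can be illustrated simply: instead of one fixed cut-off for every type, each type's threshold is chosen on development data so that its F1 is maximised. The candidate grid and per-type F1 criterion below are assumptions for illustration.

```python
import numpy as np

def adaptive_thresholds(dev_scores, dev_labels, candidates=np.linspace(0.05, 0.95, 19)):
    """dev_scores, dev_labels: (num_mentions, num_types) arrays of predicted
    probabilities and 0/1 gold labels.  Returns one threshold per type."""
    thresholds = np.empty(dev_scores.shape[1])
    for t in range(dev_scores.shape[1]):
        best_f1, best_thr = -1.0, 0.5
        for thr in candidates:
            pred = dev_scores[:, t] >= thr
            tp = np.sum(pred & (dev_labels[:, t] == 1))
            prec = tp / max(pred.sum(), 1)
            rec = tp / max((dev_labels[:, t] == 1).sum(), 1)
            f1 = 2 * prec * rec / max(prec + rec, 1e-9)
            if f1 > best_f1:
                best_f1, best_thr = f1, thr
        thresholds[t] = best_thr
    return thresholds

scores = np.random.rand(100, 5)
labels = (np.random.rand(100, 5) > 0.7).astype(int)
print(adaptive_thresholds(scores, labels))
```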
Graph-Based Global Reasoning Networks
Title | Graph-Based Global Reasoning Networks |
Authors | Yunpeng Chen, Marcus Rohrbach, Zhicheng Yan, Shuicheng Yan, Jiashi Feng, Yannis Kalantidis |
Abstract | Globally modeling and reasoning over relations between regions can be beneficial for many computer vision tasks on both images and videos. Convolutional Neural Networks (CNNs) excel at modeling local relations by convolution operations, but they are typically inefficient at capturing global relations between distant regions and require stacking multiple convolution layers. In this work, we propose a new approach for reasoning globally in which a set of features are globally aggregated over the coordinate space and then projected to an interaction space where relational reasoning can be efficiently computed. After reasoning, relation-aware features are distributed back to the original coordinate space for down-stream tasks. We further present a highly efficient instantiation of the proposed approach and introduce the Global Reasoning unit (GloRe unit) that implements the coordinate-interaction space mapping by weighted global pooling and weighted broadcasting, and the relation reasoning via graph convolution on a small graph in interaction space. The proposed GloRe unit is lightweight, end-to-end trainable and can be easily plugged into existing CNNs for a wide range of tasks. Extensive experiments show our GloRe unit can consistently boost the performance of state-of-the-art backbone architectures, including ResNet, ResNeXt, SE-Net and DPN, for both 2D and 3D CNNs, on image classification, semantic segmentation and video action recognition task. |
Tasks | Image Classification, Relational Reasoning, Semantic Segmentation, Temporal Action Localization |
Published | 2018-11-30 |
URL | http://arxiv.org/abs/1811.12814v1 |
http://arxiv.org/pdf/1811.12814v1.pdf | |
PWC | https://paperswithcode.com/paper/graph-based-global-reasoning-networks |
Repo | https://github.com/facebookresearch/GloRe |
Framework | pytorch |
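A condensed sketch of the GloRe unit's coordinate-to-interaction-space mapping: weighted pooling projects the feature map onto a small set of graph nodes, a graph-convolution-style update reasons over those nodes, and weighted broadcasting maps the result back. The node count, reduced channel width and the single simplified graph step are assumptions, not the released implementation.

```python
import torch
import torch.nn as nn

class GlobalReasoningUnit(nn.Module):
    def __init__(self, channels=256, reduced=64, nodes=16):
        super().__init__()
        self.reduce = nn.Conv2d(channels, reduced, 1)     # channel reduction
        self.proj = nn.Conv2d(channels, nodes, 1)         # projection weights B
        self.node_conv = nn.Conv1d(nodes, nodes, 1)       # mixing over graph nodes
        self.state_conv = nn.Conv1d(reduced, reduced, 1)  # per-node state update
        self.expand = nn.Conv2d(reduced, channels, 1)     # back to original width

    def forward(self, x):                                 # x: (N, C, H, W)
        n, c, h, w = x.shape
        feats = self.reduce(x).view(n, -1, h * w)         # (N, C', HW)
        b = self.proj(x).view(n, -1, h * w)               # (N, K, HW)
        nodes = torch.bmm(b, feats.transpose(1, 2))       # weighted pooling -> (N, K, C')
        nodes = nodes + self.node_conv(nodes)             # simplified graph convolution
        nodes = torch.relu(self.state_conv(nodes.transpose(1, 2)).transpose(1, 2))
        out = torch.bmm(b.transpose(1, 2), nodes)         # weighted broadcasting -> (N, HW, C')
        out = out.transpose(1, 2).view(n, -1, h, w)
        return x + self.expand(out)                       # residual connection

unit = GlobalReasoningUnit()
print(unit(torch.rand(2, 256, 14, 14)).shape)             # torch.Size([2, 256, 14, 14])
```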
Probing hidden spin order with interpretable machine learning
Title | Probing hidden spin order with interpretable machine learning |
Authors | Jonas Greitemann, Ke Liu, Lode Pollet |
Abstract | The search for unconventional magnetic and nonmagnetic states is a major topic in the study of frustrated magnetism. Canonical examples of such states include various spin liquids and spin nematics. However, discerning their existence and characterizing them correctly is usually challenging. Here we introduce a machine-learning protocol that can identify general nematic orders and their order parameters from seemingly featureless spin configurations, thus providing comprehensive insight into the presence or absence of hidden orders. We demonstrate the capabilities of our method by extracting the analytical form of nematic order parameter tensors up to rank 6. This may prove useful in the search for novel spin states and for ruling out spurious spin liquid candidates. |
Tasks | Interpretable Machine Learning |
Published | 2018-04-23 |
URL | http://arxiv.org/abs/1804.08557v5 |
http://arxiv.org/pdf/1804.08557v5.pdf | |
PWC | https://paperswithcode.com/paper/probing-hidden-spin-order-with-interpretable |
Repo | https://github.com/jgreitemann/svm-order-params |
Framework | none |
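A toy sketch of the interpretable-ML protocol: map each spin configuration to rank-2 monomial features (lattice-averaged products of spin components), train a linear SVM to separate sample groups, and read the candidate order-parameter tensor off the learned coefficients. The synthetic data and the rank-2 restriction are assumptions; the paper's kernel handles ranks up to 6.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_sites, n_samples = 200, 400

def rank2_features(spins):
    """spins: (n_sites, 3) unit vectors -> lattice-averaged quadratic monomials S_a S_b."""
    quad = np.einsum("ia,ib->ab", spins, spins) / len(spins)   # 3x3 symmetric matrix
    return quad[np.triu_indices(3)]                            # 6 independent entries

X, y = [], []
for label in (0, 1):
    for _ in range(n_samples // 2):
        if label == 0:       # disordered: random unit vectors
            s = rng.standard_normal((n_sites, 3))
        else:                # "nematic-like": spins aligned along the +/- z axis
            s = np.zeros((n_sites, 3))
            s[:, 2] = rng.choice([-1.0, 1.0], n_sites)
            s += 0.1 * rng.standard_normal((n_sites, 3))
        s /= np.linalg.norm(s, axis=1, keepdims=True)
        X.append(rank2_features(s))
        y.append(label)

clf = LinearSVC(C=1.0, max_iter=10000).fit(np.array(X), np.array(y))
# Non-negligible coefficients indicate which quadratic monomials (here S_z^2
# versus S_x^2 and S_y^2) build the hidden order parameter.
print(np.round(clf.coef_, 3))
```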
STAIR Actions: A Video Dataset of Everyday Home Actions
Title | STAIR Actions: A Video Dataset of Everyday Home Actions |
Authors | Yuya Yoshikawa, Jiaqing Lin, Akikazu Takeuchi |
Abstract | A new large-scale video dataset for human action recognition, called STAIR Actions is introduced. STAIR Actions contains 100 categories of action labels representing fine-grained everyday home actions so that it can be applied to research in various home tasks such as nursing, caring, and security. In STAIR Actions, each video has a single action label. Moreover, for each action category, there are around 1,000 videos that were obtained from YouTube or produced by crowdsource workers. The duration of each video is mostly five to six seconds. The total number of videos is 102,462. We explain how we constructed STAIR Actions and show the characteristics of STAIR Actions compared to existing datasets for human action recognition. Experiments with three major models for action recognition show that STAIR Actions can train large models and achieve good performance. STAIR Actions can be downloaded from http://actions.stair.center |
Tasks | Temporal Action Localization |
Published | 2018-04-12 |
URL | http://arxiv.org/abs/1804.04326v3 |
http://arxiv.org/pdf/1804.04326v3.pdf | |
PWC | https://paperswithcode.com/paper/stair-actions-a-video-dataset-of-everyday |
Repo | https://github.com/STAIR-Lab-CIT/STAIR-actions |
Framework | pytorch |
Unpaired Sentiment-to-Sentiment Translation: A Cycled Reinforcement Learning Approach
Title | Unpaired Sentiment-to-Sentiment Translation: A Cycled Reinforcement Learning Approach |
Authors | Jingjing Xu, Xu Sun, Qi Zeng, Xuancheng Ren, Xiaodong Zhang, Houfeng Wang, Wenjie Li |
Abstract | The goal of sentiment-to-sentiment “translation” is to change the underlying sentiment of a sentence while keeping its content. The main challenge is the lack of parallel data. To solve this problem, we propose a cycled reinforcement learning method that enables training on unpaired data by collaboration between a neutralization module and an emotionalization module. We evaluate our approach on two review datasets, Yelp and Amazon. Experimental results show that our approach significantly outperforms the state-of-the-art systems. Especially, the proposed method substantially improves the content preservation performance. The BLEU score is improved from 1.64 to 22.46 and from 0.56 to 14.06 on the two datasets, respectively. |
Tasks | Text Style Transfer |
Published | 2018-05-14 |
URL | http://arxiv.org/abs/1805.05181v2 |
http://arxiv.org/pdf/1805.05181v2.pdf | |
PWC | https://paperswithcode.com/paper/unpaired-sentiment-to-sentiment-translation-a |
Repo | https://github.com/lancopku/unpaired-sentiment-translation |
Framework | tf |
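A rough sketch of the kind of reward signal such a cycled setup can use: the emotionalized output should carry the target sentiment according to a sentiment classifier and preserve the neutralized content, here measured by simple word overlap. Both scoring functions below are crude placeholders for the paper's learned modules, not its actual reward.

```python
def content_overlap(reference, candidate):
    """Crude content-preservation score: fraction of reference words kept."""
    ref, cand = set(reference.lower().split()), set(candidate.lower().split())
    return len(ref & cand) / max(len(ref), 1)

def sentiment_confidence(sentence, target):
    """Placeholder for a trained sentiment classifier's confidence that
    `sentence` carries sentiment `target` ('positive' or 'negative')."""
    positive_words = {"great", "delicious", "friendly", "love"}
    hits = sum(w in positive_words for w in sentence.lower().split())
    p_pos = min(1.0, 0.2 + 0.3 * hits)
    return p_pos if target == "positive" else 1.0 - p_pos

def cycle_reward(original, emotionalized, target, alpha=0.5):
    """Combined reward: target sentiment strength plus content preservation."""
    return alpha * sentiment_confidence(emotionalized, target) + \
           (1 - alpha) * content_overlap(original, emotionalized)

print(cycle_reward("the pizza was cold and the staff rude",
                   "the pizza was delicious and the staff friendly",
                   target="positive"))
```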