October 20, 2019

3045 words · 15 min read

Paper Group AWR 203

Sequentially Aggregated Convolutional Networks. Predicting Factuality of Reporting and Bias of News Media Sources. Acute and sub-acute stroke lesion segmentation from multimodal MRI. Context-Aware Dialog Re-Ranking for Task-Oriented Dialog Systems. Detailed Human Avatars from Monocular Video. Disentangling Factors of Variation with Cycle-Consistent …

Sequentially Aggregated Convolutional Networks

Title Sequentially Aggregated Convolutional Networks
Authors Yiwen Huang, Rihui Wu, Pinglai Ou, Ziyong Feng
Abstract Modern deep networks generally implement a certain form of shortcut connections to alleviate optimization difficulties. However, we observe that such network topology alters the nature of deep networks. In many ways, these networks behave similarly to aggregated wide networks. We thus exploit the aggregation nature of shortcut connections at a finer architectural level and place them within wide convolutional layers. We end up with a sequentially aggregated convolutional (SeqConv) layer that combines the benefits of both wide and deep representations by aggregating features of various depths in sequence. The proposed SeqConv serves as a drop-in replacement of regular wide convolutional layers and thus could be handily integrated into any backbone network. We apply SeqConv to widely adopted backbones including ResNet and ResNeXt, and conduct experiments for image classification on public benchmark datasets. Our ResNet-based network with the model size of ResNet-50 easily surpasses the performance of the 2.35$\times$ larger ResNet-152, while our ResNeXt-based model sets a new state-of-the-art accuracy on ImageNet classification for networks with similar model complexity. The code and pre-trained models of our work are publicly available at https://github.com/GroupOfAlchemists/SeqConv.
Tasks Image Classification
Published 2018-11-27
URL https://arxiv.org/abs/1811.10798v3
PDF https://arxiv.org/pdf/1811.10798v3.pdf
PWC https://paperswithcode.com/paper/a-fully-sequential-methodology-for
Repo https://github.com/GroupOfAlchemists/SeqConv
Framework pytorch
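The aggregation idea behind SeqConv can be illustrated with a toy, framework-free sketch (plain Python lists stand in for convolutional feature maps, and the `unit` functions stand in for the layer's convolutional sub-units; both are illustrative, not the paper's implementation): each sub-unit consumes the layer input together with the outputs of all earlier sub-units, and the layer emits the concatenation of every sub-unit's features.

```python
def seqconv_layer(x, units):
    """Toy sequential aggregation: unit i sees the layer input
    concatenated with the outputs of units 1..i-1; the layer output
    concatenates all units' features, mixing depths 1..k."""
    outputs = []
    for unit in units:
        inp = x + [v for out in outputs for v in out]
        outputs.append(unit(inp))
    return [v for out in outputs for v in out]

# toy "sub-units": each just sums its input into a single feature
units = [lambda inp: [sum(inp)] for _ in range(3)]
print(seqconv_layer([1.0, 2.0], units))  # [3.0, 6.0, 12.0]
```

Note how later features are computed from earlier ones, which is what gives a single wide layer access to representations of varying depth.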

Predicting Factuality of Reporting and Bias of News Media Sources

Title Predicting Factuality of Reporting and Bias of News Media Sources
Authors Ramy Baly, Georgi Karadzhov, Dimitar Alexandrov, James Glass, Preslav Nakov
Abstract We present a study on predicting the factuality of reporting and bias of news media. While previous work has focused on studying the veracity of claims or documents, here we are interested in characterizing entire news media. These are under-studied but arguably important research problems, both in their own right and as a prior for fact-checking systems. We experiment with a large list of news websites and with a rich set of features derived from (i) a sample of articles from the target news medium, (ii) its Wikipedia page, (iii) its Twitter account, (iv) the structure of its URL, and (v) information about the Web traffic it attracts. The experimental results show sizable performance gains over the baselines, and confirm the importance of each feature type.
Tasks
Published 2018-10-02
URL http://arxiv.org/abs/1810.01765v1
PDF http://arxiv.org/pdf/1810.01765v1.pdf
PWC https://paperswithcode.com/paper/predicting-factuality-of-reporting-and-bias
Repo https://github.com/ramybaly/News-Media-Reliability
Framework none

Acute and sub-acute stroke lesion segmentation from multimodal MRI

Title Acute and sub-acute stroke lesion segmentation from multimodal MRI
Authors Albert Clèrigues, Sergi Valverde, Jose Bernal, Jordi Freixenet, Arnau Oliver, Xavier Lladó
Abstract Acute stroke lesion segmentation tasks are of great clinical interest as they can help doctors make better informed treatment decisions. Magnetic resonance imaging (MRI) is time demanding but can provide images that are considered the gold standard for diagnosis. Automated stroke lesion segmentation can provide an estimate of the location and volume of the lesioned tissue, which can help clinicians better assess and evaluate the risks of each treatment. We propose a deep learning methodology for acute and sub-acute stroke lesion segmentation using multimodal MR imaging. The proposed method is evaluated using two public datasets from the 2015 Ischemic Stroke Lesion Segmentation challenge (ISLES 2015). These involve the tasks of sub-acute stroke lesion segmentation (SISS) and acute stroke penumbra estimation (SPES) from diffusion, perfusion and anatomical MRI modalities. The performance is compared against state-of-the-art methods with a blind online testing set evaluation on each of the challenges. At the time of submitting this manuscript, our approach is the first method in the online rankings for the SISS (DSC=0.59$\pm$0.31) and SPES sub-tasks (DSC=0.84$\pm$0.10). When compared with the rest of the submitted strategies, we achieve top-rank performance with a lower Hausdorff distance. Better segmentation results are obtained by leveraging the anatomy and pathophysiology of acute stroke lesions and using a combined approach to minimize the effects of class imbalance. The same training procedure is used for both tasks, showing that the proposed methodology can generalize well enough to deal with different unrelated tasks and imaging modalities without training hyper-parameter tuning. A public version of the proposed method has been released to the scientific community at https://github.com/NIC-VICOROB/stroke-mri-segmentation.
Tasks Acute Stroke Lesion Segmentation, Ischemic Stroke Lesion Segmentation, Lesion Segmentation, Outcome Prediction In Multimodal Mri
Published 2018-10-31
URL http://arxiv.org/abs/1810.13304v2
PDF http://arxiv.org/pdf/1810.13304v2.pdf
PWC https://paperswithcode.com/paper/sunet-a-deep-learning-architecture-for-acute
Repo https://github.com/NIC-VICOROB/SUNet-architecture
Framework tf

Context-Aware Dialog Re-Ranking for Task-Oriented Dialog Systems

Title Context-Aware Dialog Re-Ranking for Task-Oriented Dialog Systems
Authors Junki Ohmura, Maxine Eskenazi
Abstract Dialog response ranking is used to rank response candidates by considering their relation to the dialog history. Although researchers have addressed this concept for open-domain dialogs, little attention has been focused on task-oriented dialogs. Furthermore, no previous studies have analyzed whether response ranking can improve the performance of existing dialog systems in real human-computer dialogs with speech recognition errors. In this paper, we propose a context-aware dialog response re-ranking system. Our system reranks responses in two steps: (1) it calculates matching scores for each candidate response and the current dialog context; (2) it combines the matching scores and a probability distribution of the candidates from an existing dialog system for response re-ranking. By using neural word embedding-based models and handcrafted or logistic regression-based ensemble models, we have improved the performance of a recently proposed end-to-end task-oriented dialog system on real dialogs with speech recognition errors.
Tasks Speech Recognition
Published 2018-11-28
URL http://arxiv.org/abs/1811.11430v1
PDF http://arxiv.org/pdf/1811.11430v1.pdf
PWC https://paperswithcode.com/paper/context-aware-dialog-re-ranking-for-task
Repo https://github.com/jojonki/arxiv-clip
Framework none
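The two-step re-ranking described in the abstract can be sketched in a few lines of plain Python. The overlap-based matching score and the fixed interpolation weight `alpha` below are illustrative stand-ins for the paper's neural matching models and learned ensembles:

```python
def rerank(candidates, match_score, system_prob, alpha=0.5):
    """Step 1: score each candidate response against the dialog context.
    Step 2: blend that score with the base dialog system's probability
    and sort candidates by the combined score."""
    combined = {c: alpha * match_score(c) + (1 - alpha) * system_prob[c]
                for c in candidates}
    return sorted(candidates, key=lambda c: combined[c], reverse=True)

context = "what time does the restaurant open"

def match_score(resp):  # crude context match: word overlap with the context
    cw, rw = set(context.split()), set(resp.split())
    return len(cw & rw) / max(len(rw), 1)

cands = ["the restaurant opens at nine", "sorry i did not catch that"]
probs = {cands[0]: 0.4, cands[1]: 0.6}
print(rerank(cands, match_score, probs)[0])  # "the restaurant opens at nine"
```

The point of the second step is visible even in this toy: the base system's own (possibly ASR-corrupted) preference is overridden when the context match disagrees strongly enough.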

Detailed Human Avatars from Monocular Video

Title Detailed Human Avatars from Monocular Video
Authors Thiemo Alldieck, Marcus Magnor, Weipeng Xu, Christian Theobalt, Gerard Pons-Moll
Abstract We present a novel method for high detail-preserving human avatar creation from monocular video. A parameterized body model is refined and optimized to maximally resemble subjects from a video showing them from all sides. Our avatars feature a natural face, hairstyle, clothes with garment wrinkles, and high-resolution texture. Our paper contributes facial landmark and shading-based human body shape refinement, a semantic texture prior, and a novel texture stitching strategy, resulting in the most sophisticated-looking human avatars obtained from a single video to date. Numerous results show the robustness and versatility of our method. A user study illustrates its superiority over the state-of-the-art in terms of identity preservation, level of detail, realism, and overall user preference.
Tasks
Published 2018-08-03
URL http://arxiv.org/abs/1808.01338v1
PDF http://arxiv.org/pdf/1808.01338v1.pdf
PWC https://paperswithcode.com/paper/detailed-human-avatars-from-monocular-video
Repo https://github.com/thmoa/semantic_human_texture_stitching
Framework none

Disentangling Factors of Variation with Cycle-Consistent Variational Auto-Encoders

Title Disentangling Factors of Variation with Cycle-Consistent Variational Auto-Encoders
Authors Ananya Harsh Jha, Saket Anand, Maneesh Singh, V. S. R. Veeravasarapu
Abstract Generative models that learn disentangled representations for different factors of variation in an image can be very useful for targeted data augmentation. By sampling from the disentangled latent subspace of interest, we can efficiently generate new data necessary for a particular task. Learning disentangled representations is a challenging problem, especially when certain factors of variation are difficult to label. In this paper, we introduce a novel architecture that disentangles the latent space into two complementary subspaces by using only weak supervision in form of pairwise similarity labels. Inspired by the recent success of cycle-consistent adversarial architectures, we use cycle-consistency in a variational auto-encoder framework. Our non-adversarial approach is in contrast with the recent works that combine adversarial training with auto-encoders to disentangle representations. We show compelling results of disentangled latent subspaces on three datasets and compare with recent works that leverage adversarial training.
Tasks Data Augmentation
Published 2018-04-27
URL http://arxiv.org/abs/1804.10469v1
PDF http://arxiv.org/pdf/1804.10469v1.pdf
PWC https://paperswithcode.com/paper/disentangling-factors-of-variation-with-cycle
Repo https://github.com/ananyahjha93/challenges-in-disentangling
Framework pytorch

A feature agnostic approach for glaucoma detection in OCT volumes

Title A feature agnostic approach for glaucoma detection in OCT volumes
Authors Stefan Maetschke, Bhavna Antony, Hiroshi Ishikawa, Gadi Wollstein, Joel S. Schuman, Rahil Garnavi
Abstract Optical coherence tomography (OCT) based measurements of retinal layer thickness, such as the retinal nerve fibre layer (RNFL) and the ganglion cell with inner plexiform layer (GCIPL) are commonly used for the diagnosis and monitoring of glaucoma. Previously, machine learning techniques have utilized segmentation-based imaging features such as the peripapillary RNFL thickness and the cup-to-disc ratio. Here, we propose a deep learning technique that classifies eyes as healthy or glaucomatous directly from raw, unsegmented OCT volumes of the optic nerve head (ONH) using a 3D Convolutional Neural Network (CNN). We compared the accuracy of this technique with various feature-based machine learning algorithms and demonstrated the superiority of the proposed deep learning based method. Logistic regression was found to be the best performing classical machine learning technique with an AUC of 0.89. In direct comparison, the deep learning approach achieved a substantially higher AUC of 0.94 with the additional advantage of providing insight into which regions of an OCT volume are important for glaucoma detection. Computing Class Activation Maps (CAM), we found that the CNN identified neuroretinal rim and optic disc cupping as well as the lamina cribrosa (LC) and its surrounding areas as the regions significantly associated with the glaucoma classification. These regions anatomically correspond to the well established and commonly used clinical markers for glaucoma diagnosis such as increased cup volume, cup diameter, and neuroretinal rim thinning at the superior and inferior segments.
Tasks
Published 2018-07-12
URL https://arxiv.org/abs/1807.04855v4
PDF https://arxiv.org/pdf/1807.04855v4.pdf
PWC https://paperswithcode.com/paper/a-feature-agnostic-approach-for-glaucoma
Repo https://github.com/yuliytsank/glaucoma-project
Framework pytorch

Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension

Title Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension
Authors Yichong Xu, Xiaodong Liu, Yelong Shen, Jingjing Liu, Jianfeng Gao
Abstract We propose a multi-task learning framework to learn a joint Machine Reading Comprehension (MRC) model that can be applied to a wide range of MRC tasks in different domains. Inspired by recent ideas of data selection in machine translation, we develop a novel sample re-weighting scheme to assign sample-specific weights to the loss. Empirical study shows that our approach can be applied to many existing MRC models. Combined with contextual representations from pre-trained language models (such as ELMo), we achieve new state-of-the-art results on a set of MRC benchmark datasets. We release our code at https://github.com/xycforgithub/MultiTask-MRC.
Tasks Machine Reading Comprehension, Machine Translation, Multi-Task Learning, Question Answering, Reading Comprehension
Published 2018-09-18
URL http://arxiv.org/abs/1809.06963v3
PDF http://arxiv.org/pdf/1809.06963v3.pdf
PWC https://paperswithcode.com/paper/multi-task-learning-for-machine-reading
Repo https://github.com/kevinduh/san_mrc
Framework pytorch
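Abstracted away from any particular MRC model, the core of the re-weighting scheme is a per-sample weight applied to the loss before averaging. The weights themselves would come from the paper's data-selection-inspired scheme; the numbers below are purely illustrative:

```python
def reweighted_loss(losses, weights):
    """Weighted average of per-sample losses; samples judged more
    relevant to the target task receive larger weights."""
    assert len(losses) == len(weights) and sum(weights) > 0
    return sum(l * w for l, w in zip(losses, weights)) / sum(weights)

# e.g. down-weight a sample from a mismatched source domain
print(reweighted_loss([2.0, 4.0], [3.0, 1.0]))  # 2.5
```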

Word Embedding Perturbation for Sentence Classification

Title Word Embedding Perturbation for Sentence Classification
Authors Dongxu Zhang, Zhichao Yang
Abstract In this technical report, we aim to mitigate the overfitting problem in natural language processing models by applying data augmentation methods. Specifically, we apply several types of noise to perturb the input word embeddings, such as Gaussian noise, Bernoulli noise, and adversarial noise. We also apply several constraints to the different types of noise. By implementing these proposed data augmentation methods, the baseline models gain improvements on several sentence classification tasks.
Tasks Data Augmentation, Sentence Classification
Published 2018-04-22
URL http://arxiv.org/abs/1804.08166v1
PDF http://arxiv.org/pdf/1804.08166v1.pdf
PWC https://paperswithcode.com/paper/word-embedding-perturbation-for-sentence
Repo https://github.com/zhangdongxu/word-embedding-perturbation
Framework tf
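Of the noise types listed, the Gaussian case is the simplest to sketch. A minimal stand-alone version, where `sigma` is an assumed noise-scale hyper-parameter:

```python
import random

def perturb_embedding(vec, sigma=0.1, rng=None):
    """Add zero-mean Gaussian noise to each embedding dimension,
    one of the perturbation types the report experiments with."""
    rng = rng or random.Random()
    return [v + rng.gauss(0.0, sigma) for v in vec]

emb = [0.2, -0.5, 1.0]
noisy = perturb_embedding(emb, sigma=0.1, rng=random.Random(0))
print(len(noisy) == len(emb), noisy != emb)  # True True
```

The Bernoulli and adversarial variants differ only in how the noise vector is drawn; the constraints the report mentions (e.g. bounding the noise magnitude) would be applied to the noise before it is added.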

Discovering Bayesian Market Views for Intelligent Asset Allocation

Title Discovering Bayesian Market Views for Intelligent Asset Allocation
Authors Frank Z. Xing, Erik Cambria, Lorenzo Malandri, Carlo Vercellis
Abstract Along with the advance of opinion mining techniques, public mood has been found to be a key element for stock market prediction. However, how market participants’ behavior is affected by public mood has rarely been discussed. Consequently, there has been little progress in leveraging public mood for the asset allocation problem, where trusted and interpretable methods are preferred. In order to incorporate public mood analyzed from social media, we propose to formalize public mood into market views, because market views can be integrated into the modern portfolio theory. In our framework, the optimal market views will maximize returns in each period with a Bayesian asset allocation model. We train two neural models to generate the market views, and benchmark the model performance against other popular asset allocation strategies. Our experimental results suggest that the formalization of market views significantly increases the profitability (5% to 10% annually) of the simulated portfolio at a given risk level.
Tasks Opinion Mining, Stock Market Prediction
Published 2018-02-27
URL http://arxiv.org/abs/1802.09911v2
PDF http://arxiv.org/pdf/1802.09911v2.pdf
PWC https://paperswithcode.com/paper/discovering-bayesian-market-views-for
Repo https://github.com/fxing79/ibaa
Framework none

CNN+CNN: Convolutional Decoders for Image Captioning

Title CNN+CNN: Convolutional Decoders for Image Captioning
Authors Qingzhong Wang, Antoni B. Chan
Abstract Image captioning is a challenging task that combines the fields of computer vision and natural language processing. A variety of approaches have been proposed to achieve the goal of automatically describing an image, and recurrent neural network (RNN) or long short-term memory (LSTM) based models dominate this field. However, RNNs or LSTMs cannot be calculated in parallel and ignore the underlying hierarchical structure of a sentence. In this paper, we propose a framework that only employs convolutional neural networks (CNNs) to generate captions. Owing to parallel computing, our basic model is around 3 times faster than NIC (an LSTM-based model) during training, while also providing better results. We conduct extensive experiments on MSCOCO and investigate the influence of the model width and depth. Compared with LSTM-based models that apply similar attention mechanisms, our proposed models achieve comparable BLEU-1, -2, -3, -4 and METEOR scores and higher CIDEr scores. We also test our model on the paragraph annotation dataset, and obtain a higher CIDEr score than hierarchical LSTMs.
Tasks Image Captioning
Published 2018-05-23
URL http://arxiv.org/abs/1805.09019v1
PDF http://arxiv.org/pdf/1805.09019v1.pdf
PWC https://paperswithcode.com/paper/cnncnn-convolutional-decoders-for-image
Repo https://github.com/qingzwang/GHA-ImageCaptioning
Framework pytorch

Cross-relation Cross-bag Attention for Distantly-supervised Relation Extraction

Title Cross-relation Cross-bag Attention for Distantly-supervised Relation Extraction
Authors Yujin Yuan, Liyuan Liu, Siliang Tang, Zhongfei Zhang, Yueting Zhuang, Shiliang Pu, Fei Wu, Xiang Ren
Abstract Distant supervision leverages knowledge bases to automatically label instances, allowing us to train relation extractors without human annotations. However, the generated training data typically contain massive noise, and may result in poor performance with vanilla supervised learning. In this paper, we propose to conduct multi-instance learning with a novel Cross-relation Cross-bag Selective Attention (C$^2$SA), which leads to noise-robust training of distantly supervised relation extractors. Specifically, we employ sentence-level selective attention to reduce the effect of noisy or mismatched sentences, while the correlation among relations is captured to improve the quality of the attention weights. Moreover, instead of treating all entity pairs equally, we try to pay more attention to entity pairs of higher quality, again using the selective attention mechanism. Experiments with two types of relation extractors demonstrate the superiority of the proposed approach over the state-of-the-art, while further ablation studies verify our intuitions and demonstrate the effectiveness of our two proposed techniques.
Tasks Relation Extraction
Published 2018-12-27
URL http://arxiv.org/abs/1812.10604v1
PDF http://arxiv.org/pdf/1812.10604v1.pdf
PWC https://paperswithcode.com/paper/cross-relation-cross-bag-attention-for
Repo https://github.com/yuanyu255/PCNN_C2SA
Framework pytorch
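Sentence-level selective attention, the building block both proposed techniques reuse, reduces to a softmax over per-sentence relevance scores followed by a weighted average of sentence representations. A minimal sketch (in the actual model the scores come from a learned relation query; here they are given directly):

```python
import math

def selective_attention(sent_vecs, scores):
    """Softmax the relevance scores, then return the attention-weighted
    average of the sentence vectors, down-weighting sentences that look
    noisy or mismatched for the target relation."""
    m = max(scores)                        # subtract max for stability
    exp = [math.exp(s - m) for s in scores]
    z = sum(exp)
    weights = [e / z for e in exp]
    dim = len(sent_vecs[0])
    return [sum(w * v[i] for w, v in zip(weights, sent_vecs))
            for i in range(dim)]

# two sentence vectors with equal scores -> plain mean
print(selective_attention([[0.0, 0.0], [2.0, 2.0]], [1.0, 1.0]))  # [1.0, 1.0]
```

The cross-bag extension applies the same mechanism one level up, weighting entire entity-pair bags rather than individual sentences.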

Revisiting the Softmax Bellman Operator: New Benefits and New Perspective

Title Revisiting the Softmax Bellman Operator: New Benefits and New Perspective
Authors Zhao Song, Ronald E. Parr, Lawrence Carin
Abstract The impact of softmax on the value function itself in reinforcement learning (RL) is often viewed as problematic because it leads to sub-optimal value (or Q) functions and interferes with the contraction properties of the Bellman operator. Surprisingly, despite these concerns, and independent of its effect on exploration, the softmax Bellman operator when combined with Deep Q-learning, leads to Q-functions with superior policies in practice, even outperforming its double Q-learning counterpart. To better understand how and why this occurs, we revisit theoretical properties of the softmax Bellman operator, and prove that $(i)$ it converges to the standard Bellman operator exponentially fast in the inverse temperature parameter, and $(ii)$ the distance of its Q function from the optimal one can be bounded. These alone do not explain its superior performance, so we also show that the softmax operator can reduce the overestimation error, which may give some insight into why a sub-optimal operator leads to better performance in the presence of value function approximation. A comparison among different Bellman operators is then presented, showing the trade-offs when selecting them.
Tasks Atari Games, Q-Learning
Published 2018-12-02
URL https://arxiv.org/abs/1812.00456v2
PDF https://arxiv.org/pdf/1812.00456v2.pdf
PWC https://paperswithcode.com/paper/revisiting-the-softmax-bellman-operator
Repo https://github.com/zhao-song/Softmax-DQN
Framework none
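The operator under study is easy to state concretely: a Boltzmann-weighted backup that interpolates between the mean of the Q-values (at inverse temperature 0) and the standard max backup (as the inverse temperature grows), matching the exponential convergence result in the abstract. A minimal sketch:

```python
import math

def softmax_backup(q_values, beta):
    """Softmax Bellman backup: expectation of Q under a Boltzmann
    distribution with inverse temperature beta. Recovers the mean at
    beta = 0 and approaches max(Q) as beta -> infinity."""
    m = max(q_values)                            # numerical stability
    w = [math.exp(beta * (q - m)) for q in q_values]
    z = sum(w)
    return sum(wi * q for wi, q in zip(w, q_values)) / z

print(softmax_backup([1.0, 2.0], beta=0.0))   # 1.5 (mean)
print(softmax_backup([1.0, 2.0], beta=50.0))  # ~2.0 (max)
```

Because the backup averages over actions instead of always taking the max, it is less prone to propagating upward-biased value estimates, which is the overestimation-reduction effect the paper analyzes.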

Reinforcement Learning with A* and a Deep Heuristic

Title Reinforcement Learning with A* and a Deep Heuristic
Authors Ariel Keselman, Sergey Ten, Adham Ghazali, Majed Jubeh
Abstract A* is a popular path-finding algorithm, but it can only be applied to domains where a good heuristic function is known. Inspired by recent methods combining Deep Neural Networks (DNNs) and trees, this study demonstrates how to train a heuristic represented by a DNN and combine it with A*. This new algorithm, which we call aleph-star, can be used efficiently in domains where the input to the heuristic can be processed by a neural network. We compare aleph-star to N-Step Deep Q-Learning (DQN; Mnih et al., 2013) in a driving simulation with pixel-based input, and demonstrate significantly better performance in this scenario.
Tasks Q-Learning
Published 2018-11-19
URL http://arxiv.org/abs/1811.07745v1
PDF http://arxiv.org/pdf/1811.07745v1.pdf
PWC https://paperswithcode.com/paper/reinforcement-learning-with-a-and-a-deep
Repo https://github.com/imagry/aleph_star
Framework pytorch
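A* itself is standard; the paper's contribution is where the heuristic comes from. The sketch below keeps the heuristic as a plain function argument, which is exactly the slot a trained DNN would fill (the 1-D integer world and its exact-distance heuristic are illustrative only):

```python
import heapq

def a_star(start, goal, neighbors, heuristic):
    """A* search with a pluggable heuristic; in aleph-star the heuristic
    would be a trained DNN evaluated on the node's observation."""
    frontier = [(heuristic(start), 0, start, [start])]
    best_g = {}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in best_g and best_g[node] <= g:
            continue                       # already expanded more cheaply
        best_g[node] = g
        for nxt, cost in neighbors(node):
            g2 = g + cost
            heapq.heappush(frontier, (g2 + heuristic(nxt), g2, nxt, path + [nxt]))
    return None

# toy 1-D world: integer states, unit-cost moves, exact-distance heuristic
nbrs = lambda n: [(n - 1, 1), (n + 1, 1)]
print(a_star(0, 3, nbrs, lambda n: abs(3 - n)))  # [0, 1, 2, 3]
```

Note that a learned heuristic is generally not admissible, so the guarantees of classical A* weaken to "guided best-first search"; the paper's empirical comparison against N-Step DQN is what justifies the trade.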

Assessing the Potential of Classical Q-learning in General Game Playing

Title Assessing the Potential of Classical Q-learning in General Game Playing
Authors Hui Wang, Michael Emmerich, Aske Plaat
Abstract After the recent groundbreaking results of AlphaGo and AlphaZero, we have seen strong interest in deep reinforcement learning and artificial general intelligence (AGI) in game playing. However, deep learning is resource-intensive and the theory is not yet well developed. For small games, simple classical table-based Q-learning might still be the algorithm of choice. General Game Playing (GGP) provides a good testbed for reinforcement learning research toward AGI. Q-learning is one of the canonical reinforcement learning methods, and has been used by (Banerjee & Stone, IJCAI 2007) in GGP. In this paper we implement Q-learning in GGP for three small-board games (Tic-Tac-Toe, Connect Four, Hex)\footnote{source code: https://github.com/wh1992v/ggp-rl}, to allow comparison to Banerjee et al. We find that Q-learning converges to a high win rate in GGP. For the $\epsilon$-greedy strategy, we propose a first enhancement, the dynamic $\epsilon$ algorithm. In addition, inspired by (Gelly & Silver, ICML 2007), we use online search (Monte Carlo Search) to enhance offline learning, and propose QM-learning for GGP. Both enhancements improve the performance of classical Q-learning. In this work, GGP allows us to show that, if augmented by appropriate enhancements, classical table-based Q-learning can perform well in small games.
Tasks Board Games, Q-Learning
Published 2018-10-14
URL http://arxiv.org/abs/1810.06078v1
PDF http://arxiv.org/pdf/1810.06078v1.pdf
PWC https://paperswithcode.com/paper/assessing-the-potential-of-classical-q
Repo https://github.com/wh1992v/ggp-rl
Framework none
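For reference, the classical tabular update at the heart of the study is the canonical Q-learning rule; the GGP-specific state encoding and the proposed dynamic-$\epsilon$ and QM enhancements are omitted here:

```python
def q_update(Q, s, a, r, s2, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    Unseen state-action pairs default to 0."""
    best_next = max(Q.get((s2, a2), 0.0) for a2 in actions)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return Q[(s, a)]

Q = {}
print(q_update(Q, "s0", "a0", 1.0, "s1", ["a0", "a1"]))  # 0.1
```

Everything the paper adds, dynamic exploration schedules and Monte Carlo search as a teacher, plugs into how `(s, a, r, s2)` transitions are generated, not into this update itself.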