July 26, 2019

Paper Group NAWR 13


Co-attending Regions and Detections with Multi-modal Multiplicative Embedding for VQA

Title Co-attending Regions and Detections with Multi-modal Multiplicative Embedding for VQA
Authors Pan Lu, Hongsheng Li, Wei Zhang, Jianyong Wang, Xiaogang Wang
Abstract Recently, the Visual Question Answering (VQA) task has gained increasing attention in artificial intelligence. Existing VQA methods mainly adopt the visual attention mechanism to associate the input question with corresponding image regions for effective question answering. The free-form region based and the detection-based visual attention mechanisms are mostly investigated, with the former ones attending free-form image regions and the latter ones attending pre-specified detection-box regions. We argue that the two attention mechanisms are able to provide complementary information and should be effectively integrated to better solve the VQA problem. In this paper, we propose a novel deep neural network for VQA that integrates both attention mechanisms. Our proposed framework effectively fuses features from free-form image regions, detection boxes, and question representations via a multi-modal multiplicative feature embedding scheme to jointly attend question-related free-form image regions and detection boxes for more accurate question answering. The proposed method is extensively evaluated on two publicly available datasets, COCO-QA and VQA, and outperforms state-of-the-art approaches. Source code is available at https://github.com/lupantech/dual-mfa-vqa.
Tasks Question Answering, Visual Question Answering
Published 2017-11-18
URL https://arxiv.org/abs/1711.06794
PDF https://lupantech.github.io/papers/aaa18_dualvqa.pdf
PWC https://paperswithcode.com/paper/co-attending-regions-and-detections-with
Repo https://github.com/lupantech/dual-mfa-vqa
Framework torch
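
To make the multiplicative embedding idea concrete, here is a minimal PyTorch sketch of MFB-style multiplicative fusion between a question vector and a visual feature vector: both modalities are projected into a shared factor space, multiplied element-wise, and sum-pooled. The dimensions, layer names, and normalization details are illustrative assumptions, not the paper's exact Dual-MFA architecture.

```python
# Minimal sketch of multiplicative multi-modal fusion (MFB-style); all
# dimensions and layer names are illustrative, not the paper's exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiplicativeFusion(nn.Module):
    def __init__(self, q_dim=1024, v_dim=2048, factor_dim=1000, k=5):
        super().__init__()
        self.k = k
        self.proj_q = nn.Linear(q_dim, factor_dim * k)  # project question features
        self.proj_v = nn.Linear(v_dim, factor_dim * k)  # project visual features

    def forward(self, q, v):
        # Element-wise product in the joint factor space ...
        joint = self.proj_q(q) * self.proj_v(v)
        # ... followed by sum pooling over the k factors.
        joint = joint.view(-1, joint.size(1) // self.k, self.k).sum(dim=2)
        # Signed square root and L2 normalization, common for bilinear pooling.
        joint = torch.sign(joint) * torch.sqrt(torch.abs(joint) + 1e-8)
        return F.normalize(joint, dim=1)

fusion = MultiplicativeFusion()
q = torch.randn(8, 1024)   # batch of question embeddings
v = torch.randn(8, 2048)   # batch of pooled visual features
print(fusion(q, v).shape)  # torch.Size([8, 1000])
```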

Jmp8 at SemEval-2017 Task 2: A simple and general distributional approach to estimate word similarity

Title Jmp8 at SemEval-2017 Task 2: A simple and general distributional approach to estimate word similarity
Authors Josué Melka, Gilles Bernard
Abstract We have built a simple corpus-based system to estimate word similarity in multiple languages with a count-based approach. After training on Wikipedia corpora, our system was evaluated on the multilingual subtask of SemEval-2017 Task 2 and achieved a good level of performance, despite its great simplicity. Our results tend to demonstrate the power of the distributional approach in semantic similarity tasks, even without knowledge of the underlying language. We also show that dimensionality reduction has a considerable impact on the results.
Tasks Dimensionality Reduction, Semantic Similarity, Semantic Textual Similarity
Published 2017-08-01
URL https://www.aclweb.org/anthology/S17-2035/
PDF https://www.aclweb.org/anthology/S17-2035
PWC https://paperswithcode.com/paper/jmp8-at-semeval-2017-task-2-a-simple-and
Repo https://github.com/yoch/jmp8
Framework none
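
As an illustration of the count-based pipeline the abstract describes, here is a toy sketch: build a co-occurrence matrix from a corpus, reduce its dimensionality with truncated SVD, and score word pairs by cosine similarity. The corpus, window choice, and number of dimensions are placeholders, not the system's actual settings.

```python
# Toy count-based distributional similarity: co-occurrence counts ->
# dimensionality reduction (truncated SVD) -> cosine similarity.
import numpy as np
from itertools import combinations

corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "a cat and a dog played".split(),
]
vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

# Symmetric sentence-window co-occurrence counts.
counts = np.zeros((len(vocab), len(vocab)))
for sent in corpus:
    for w1, w2 in combinations(sent, 2):
        counts[idx[w1], idx[w2]] += 1
        counts[idx[w2], idx[w1]] += 1

# Dimensionality reduction via truncated SVD (the abstract notes this
# has a considerable impact on results).
u, s, _ = np.linalg.svd(counts)
k = 3
vectors = u[:, :k] * s[:k]

def similarity(w1, w2):
    a, b = vectors[idx[w1]], vectors[idx[w2]]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

print(round(similarity("cat", "dog"), 3))
```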

ERFNet: Efficient Residual Factorized ConvNet for Real-time Semantic Segmentation

Title ERFNet: Efficient Residual Factorized ConvNet for Real-time Semantic Segmentation
Authors Eduardo Romera, José M. Álvarez, Luis M. Bergasa, Roberto Arroyo
Abstract Semantic segmentation is a challenging task that addresses most of the perception needs of Intelligent Vehicles (IV) in a unified way. Deep Neural Networks excel at this task, as they can be trained end-to-end to accurately classify multiple object categories in an image at pixel level. However, a good trade-off between high quality and computational resources is not yet present in state-of-the-art semantic segmentation approaches, limiting their application in real vehicles. In this paper, we propose a deep architecture that is able to run in real-time while providing accurate semantic segmentation. The core of our architecture is a novel layer that uses residual connections and factorized convolutions in order to remain efficient while retaining remarkable accuracy. Our approach is able to run at over 83 FPS on a single Titan X, and 7 FPS on a Jetson TX1 (embedded GPU). A comprehensive set of experiments on the publicly available Cityscapes dataset demonstrates that our system achieves an accuracy that is similar to the state of the art, while being orders of magnitude faster to compute than other architectures that achieve top precision. The resulting trade-off makes our model an ideal approach for scene understanding in IV applications. The code is publicly available at: https://github.com/Eromera/erfnet
Tasks Real-Time Semantic Segmentation, Scene Understanding, Semantic Segmentation
Published 2017-10-09
URL https://ieeexplore.ieee.org/abstract/document/8063438
PDF http://www.robesafe.uah.es/personal/eduardo.romera/pdfs/Romera17tits.pdf
PWC https://paperswithcode.com/paper/erfnet-efficient-residual-factorized-convnet
Repo https://github.com/Eromera/erfnet_pytorch
Framework pytorch
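
The "novel layer" at the core of ERFNet is a residual block whose 3x3 convolutions are factorized into 3x1 and 1x3 pairs (the "non-bottleneck-1D" block). The sketch below captures that structure in PyTorch, simplified from the official code: dilation and dropout are omitted here.

```python
# Simplified ERFNet non-bottleneck-1D block: each 3x3 convolution is
# factorized into a 3x1 followed by a 1x3 convolution, inside a residual.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NonBottleneck1D(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv3x1_1 = nn.Conv2d(channels, channels, (3, 1), padding=(1, 0))
        self.conv1x3_1 = nn.Conv2d(channels, channels, (1, 3), padding=(0, 1))
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv3x1_2 = nn.Conv2d(channels, channels, (3, 1), padding=(1, 0))
        self.conv1x3_2 = nn.Conv2d(channels, channels, (1, 3), padding=(0, 1))
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.conv3x1_1(x))
        out = F.relu(self.bn1(self.conv1x3_1(out)))
        out = F.relu(self.conv3x1_2(out))
        out = self.bn2(self.conv1x3_2(out))
        return F.relu(out + x)  # residual connection keeps gradients flowing

block = NonBottleneck1D(64)
print(block(torch.randn(1, 64, 128, 256)).shape)  # torch.Size([1, 64, 128, 256])
```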

Evaluation Metrics for Automatically Generated Metaphorical Expressions

Title Evaluation Metrics for Automatically Generated Metaphorical Expressions
Authors Akira Miyazawa, Yusuke Miyao
Abstract
Tasks
Published 2017-01-01
URL https://www.aclweb.org/anthology/W17-6929/
PDF https://www.aclweb.org/anthology/W17-6929
PWC https://paperswithcode.com/paper/evaluation-metrics-for-automatically
Repo https://github.com/pecorarista/metaphor-evaluation-result
Framework none

MR-based synthetic CT generation using a deep convolutional neural network method

Title MR-based synthetic CT generation using a deep convolutional neural network method
Authors Xiao Han
Abstract Purpose: Interest has been growing rapidly in the field of radiotherapy to replace CT with magnetic resonance imaging (MRI), due to the superior soft tissue contrast offered by MRI and the desire to reduce unnecessary radiation dose. MR-only radiotherapy also simplifies clinical workflow and avoids uncertainties in aligning MR with CT. Methods, however, are needed to derive CT-equivalent representations, often known as synthetic CT (sCT), from patient MR images for dose calculation and DRR-based patient positioning. Synthetic CT estimation is also important for PET attenuation correction in hybrid PET-MR systems. We propose in this work a novel deep convolutional neural network (DCNN) method for sCT generation and evaluate its performance on a set of brain tumor patient images. Methods: The proposed method builds upon recent developments of deep learning and convolutional neural networks in the computer vision literature. The proposed DCNN model has 27 convolutional layers interleaved with pooling and unpooling layers and 35 million free parameters, which can be trained to learn a direct end-to-end mapping from MR images to their corresponding CTs. Training such a large model on our limited data is made possible through the principle of transfer learning and by initializing model weights from a pretrained model. Eighteen brain tumor patients with both CT and T1-weighted MR images are used as experimental data and a sixfold cross-validation study is performed. Each sCT generated is compared against the real CT image of the same patient on a voxel-by-voxel basis. Comparison is also made with respect to an atlas-based approach that involves deformable atlas registration and patch-based atlas fusion. Results: The proposed DCNN method produced a mean absolute error (MAE) below 85 HU for 13 of the 18 test subjects. The overall average MAE was 84.8 ± 17.3 HU for all subjects, which was found to be significantly better than the average MAE of 94.5 ± 17.8 HU for the atlas-based method. The DCNN method also provided significantly better accuracy when evaluated using two other metrics: the mean squared error (188.6 ± 33.7 versus 198.3 ± 33.0) and the Pearson correlation coefficient (0.906 ± 0.03 versus 0.896 ± 0.03). Although training a DCNN model can be slow, training need only be done once. Applying a trained model to generate a complete sCT volume for each new patient MR image only took 9 s, which was much faster than the atlas-based approach. Conclusions: A DCNN-based method was developed and shown to produce highly accurate sCT estimations from conventional, single-sequence MR images in near real time. Quantitative results also showed that the proposed method competed favorably with an atlas-based method, in terms of both accuracy and computation speed at test time. Further validation on dose computation accuracy and on a larger patient cohort is warranted. Extensions of the method are also possible to further improve accuracy or to handle multi-sequence MR images.
Tasks Transfer Learning
Published 2017-02-13
URL https://aapm.onlinelibrary.wiley.com/doi/full/10.1002/mp.12155
PDF https://aapm.onlinelibrary.wiley.com/doi/full/10.1002/mp.12155
PWC https://paperswithcode.com/paper/mr-based-synthetic-ct-generation-using-a-deep
Repo https://github.com/ChengBinJin/MRI-to-CT-DCNN-TensorFlow
Framework tf
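
The evaluation in the abstract is voxel-wise: each synthetic CT is compared with the real CT via mean absolute error (in HU), mean squared error, and the Pearson correlation coefficient. A small sketch of those comparisons, using random volumes as stand-ins for real data:

```python
# Voxel-wise sCT-vs-CT evaluation metrics, as reported in the abstract.
# The arrays here are random placeholders, not real imaging data.
import numpy as np

rng = np.random.default_rng(0)
real_ct = rng.normal(0, 300, size=(64, 64, 64))  # placeholder HU volume
synthetic_ct = real_ct + rng.normal(0, 80, size=real_ct.shape)

mae = np.mean(np.abs(synthetic_ct - real_ct))    # mean absolute error in HU
mse = np.mean((synthetic_ct - real_ct) ** 2)     # mean squared error
pearson = np.corrcoef(synthetic_ct.ravel(), real_ct.ravel())[0, 1]

print(f"MAE={mae:.1f} HU, MSE={mse:.1f}, r={pearson:.3f}")
```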

beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework

Title beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework
Authors Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, Alexander Lerchner
Abstract Learning an interpretable factorised representation of the independent data generative factors of the world without supervision is an important precursor for the development of artificial intelligence that is able to learn and reason in the same way that humans do. We introduce beta-VAE, a new state-of-the-art framework for automated discovery of interpretable factorised latent representations from raw image data in a completely unsupervised manner. Our approach is a modification of the variational autoencoder (VAE) framework. We introduce an adjustable hyperparameter beta that balances latent channel capacity and independence constraints with reconstruction accuracy. We demonstrate that beta-VAE with appropriately tuned beta > 1 qualitatively outperforms VAE (beta = 1), as well as state-of-the-art unsupervised (InfoGAN) and semi-supervised (DC-IGN) approaches to disentangled factor learning on a variety of datasets (celebA, faces and chairs). Furthermore, we devise a protocol to quantitatively compare the degree of disentanglement learnt by different models, and show that our approach also significantly outperforms all baselines quantitatively. Unlike InfoGAN, beta-VAE is stable to train, makes few assumptions about the data and relies on tuning a single hyperparameter, which can be directly optimised through a hyperparameter search using weakly labelled data or through heuristic visual inspection for purely unsupervised data.
Tasks
Published 2017-04-26
URL https://openreview.net/forum?id=Sy2fzU9gl
PDF https://pdfs.semanticscholar.org/a902/26c41b79f8b06007609f39f82757073641e2.pdf
PWC https://paperswithcode.com/paper/beta-vae-learning-basic-visual-concepts-with
Repo https://github.com/AntixK/PyTorch-VAE
Framework pytorch
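
The key modification over a plain VAE is a single scalar: the KL term in the evidence lower bound is weighted by beta, with beta > 1 trading reconstruction accuracy for disentanglement. A minimal PyTorch sketch with a toy linear encoder/decoder and illustrative sizes:

```python
# Minimal beta-VAE objective: standard VAE loss with the KL term scaled
# by beta. Encoder/decoder are toy single layers; sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BetaVAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=10, beta=4.0):
        super().__init__()
        self.beta = beta
        self.enc = nn.Linear(x_dim, 2 * z_dim)  # outputs mu and log-variance
        self.dec = nn.Linear(z_dim, x_dim)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        recon = torch.sigmoid(self.dec(z))
        recon_loss = F.binary_cross_entropy(recon, x, reduction="sum")
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return recon_loss + self.beta * kl  # beta = 1 recovers the plain VAE

model = BetaVAE()
x = torch.rand(16, 784)  # batch of flattened images in [0, 1]
loss = model(x)
loss.backward()
```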

Automatic Discovery of the Statistical Types of Variables in a Dataset

Title Automatic Discovery of the Statistical Types of Variables in a Dataset
Authors Isabel Valera, Zoubin Ghahramani
Abstract A common practice in statistics and machine learning is to assume that the statistical data types (e.g., ordinal, categorical or real-valued) of variables, and usually also the likelihood model, are known. However, as the availability of real-world data increases, this assumption becomes too restrictive. Data are often heterogeneous, complex, and improperly or incompletely documented. Surprisingly, despite their practical importance, there is still a lack of tools to automatically discover the statistical types of, as well as appropriate likelihood (noise) models for, the variables in a dataset. In this paper, we fill this gap by proposing a Bayesian method, which accurately discovers the statistical data types in both synthetic and real data.
Tasks
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=541
PDF http://proceedings.mlr.press/v70/valera17a/valera17a.pdf
PWC https://paperswithcode.com/paper/automatic-discovery-of-the-statistical-types
Repo https://github.com/ivaleraM/DataTypes
Framework none
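
The paper's contribution is a Bayesian model, which is beyond a short snippet; the toy heuristic below is only meant to make the underlying problem concrete (guessing whether a column is categorical, integer-valued, or real-valued) and is explicitly not the paper's method.

```python
# A crude heuristic type-guesser, purely to illustrate the task the paper
# addresses with a proper Bayesian model. Thresholds are arbitrary.
import numpy as np

def guess_type(column, max_categories=10):
    values = np.asarray(column)
    if not np.issubdtype(values.dtype, np.number):
        return "categorical"
    if len(np.unique(values)) <= max_categories:
        return "categorical"
    if np.allclose(values, np.round(values)):
        return "count/ordinal (integer-valued)"
    return "real-valued"

rng = np.random.default_rng(0)
print(guess_type(rng.normal(size=50)))  # real-valued
print(guess_type(np.arange(100)))       # count/ordinal (integer-valued)
print(guess_type(["a", "b", "a"]))      # categorical
```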

A Context-Aware Approach for Detecting Worth-Checking Claims in Political Debates

Title A Context-Aware Approach for Detecting Worth-Checking Claims in Political Debates
Authors Pepa Gencheva, Preslav Nakov, Lluís Màrquez, Alberto Barrón-Cedeño, Ivan Koychev
Abstract In the context of investigative journalism, we address the problem of automatically identifying which claims in a given document are most worthy and should be prioritized for fact-checking. Despite its importance, this is a relatively understudied problem. Thus, we create a new corpus of political debates, containing statements that have been fact-checked by nine reputable sources, and we train machine learning models to predict which claims should be prioritized for fact-checking, i.e., we model the problem as a ranking task. Unlike previous work, which has looked primarily at sentences in isolation, in this paper we focus on a rich input representation modeling the context: relationship between the target statement and the larger context of the debate, interaction between the opponents, and reaction by the moderator and by the public. Our experiments show state-of-the-art results, outperforming a strong rivaling system by a margin, while also confirming the importance of the contextual information.
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/R17-1037/
PDF https://doi.org/10.26615/978-954-452-049-6_037
PWC https://paperswithcode.com/paper/a-context-aware-approach-for-detecting-worth
Repo https://github.com/pgencheva/claim-rank
Framework none
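
Since the problem is modeled as a ranking task, the basic loop is: score every sentence with a trained classifier, then sort by score. The toy sketch below uses bag-of-words features and logistic regression purely for illustration; the paper's system relies on a much richer contextual representation.

```python
# Toy check-worthiness ranking: score each debate sentence and sort.
# Sentences, labels, and features are placeholders for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

sentences = [
    "My opponent voted to raise taxes forty times.",
    "Thank you all for coming tonight.",
    "Unemployment has doubled under this administration.",
    "Let's move on to the next question.",
]
labels = [1, 0, 1, 0]  # 1 = fact-checked by at least one source (toy labels)

vec = TfidfVectorizer()
X = vec.fit_transform(sentences)
clf = LogisticRegression().fit(X, labels)

# Rank claims by predicted check-worthiness.
scores = clf.predict_proba(X)[:, 1]
for score, sent in sorted(zip(scores, sentences), reverse=True):
    print(f"{score:.2f}  {sent}")
```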

Phonemic Transcription of Low-Resource Tonal Languages

Title Phonemic Transcription of Low-Resource Tonal Languages
Authors Oliver Adams, Trevor Cohn, Graham Neubig, Alexis Michaud
Abstract
Tasks Acoustic Modelling, Language Modelling, Speech Recognition
Published 2017-12-01
URL https://www.aclweb.org/anthology/U17-1006/
PDF https://www.aclweb.org/anthology/U17-1006
PWC https://paperswithcode.com/paper/phonemic-transcription-of-low-resource-tonal
Repo https://github.com/oadams/mam
Framework tf

SST: Single-Stream Temporal Action Proposals

Title SST: Single-Stream Temporal Action Proposals
Authors Shyamal Buch, Victor Escorcia, Chuanqi Shen, Bernard Ghanem, Juan Carlos Niebles
Abstract Our paper presents a new approach for temporal detection of human actions in long, untrimmed video sequences. We introduce Single-Stream Temporal Action Proposals (SST), a new effective and efficient deep architecture for the generation of temporal action proposals. Our network can run continuously in a single stream over very long input video sequences, without the need to divide input into short overlapping clips or temporal windows for batch processing. We demonstrate empirically that our model outperforms the state-of-the-art on the task of temporal action proposal generation, while achieving some of the fastest processing speeds in the literature. Finally, we demonstrate that using SST proposals in conjunction with existing action classifiers results in improved state-of-the-art temporal action detection performance.
Tasks Action Detection, Temporal Action Proposal Generation
Published 2017-07-01
URL http://openaccess.thecvf.com/content_cvpr_2017/html/Buch_SST_Single-Stream_Temporal_CVPR_2017_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2017/papers/Buch_SST_Single-Stream_Temporal_CVPR_2017_paper.pdf
PWC https://paperswithcode.com/paper/sst-single-stream-temporal-action-proposals
Repo https://github.com/shyamal-b/sst
Framework none
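
The single-stream idea can be sketched compactly: a recurrent encoder consumes the whole feature sequence once, and at every time step a linear head emits K confidence scores, one per anchor length, for proposals ending at that step. Feature and hidden sizes below are illustrative, not the paper's exact configuration.

```python
# Sketch of the SST idea: one recurrent pass over a long, untrimmed feature
# sequence, with K proposal-confidence outputs per time step.
import torch
import torch.nn as nn

class SSTSketch(nn.Module):
    def __init__(self, feat_dim=500, hidden=256, k_anchors=32):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True)
        self.score = nn.Linear(hidden, k_anchors)  # K proposal scales per step

    def forward(self, features):
        # features: (batch, time, feat_dim), e.g. clip-level visual features
        h, _ = self.gru(features)
        # Confidence that a proposal of each anchor length ends at each step.
        return torch.sigmoid(self.score(h))

model = SSTSketch()
video = torch.randn(1, 1000, 500)  # one long, untrimmed sequence
print(model(video).shape)          # torch.Size([1, 1000, 32])
```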

A Two-Layer Dialogue Framework For Authoring Social Bots

Title A Two-Layer Dialogue Framework For Authoring Social Bots
Authors Jieming Ji, Qingyun Wang, Zev Battad, Jiashun Gou, Jingfei Zhou, Rahul Divekar, Craig Carlson, Mei Si
Abstract In this work, we explored creating a social bot for casual conversations. One of the major challenges in designing social bots is how to keep the user engaged. We experimented with a range of conversational activities, such as providing news and playing games, and strategies for controlling the dialogue flow. To support these experiments, we proposed a two-layer dialogue framework which allows for flexible reuse and reorganization of individual dialogue modules. The chatbot was deployed as an Amazon Alexa Skill and participated in the Alexa social bot competition. Over 20k Alexa users interacted with and rated our bot between 4/1/2017 and 8/26/2017. We found that, in general, supporting a richer set of conversational activities is desirable, and that users are more in favor of having natural conversations than menu-based conversations. Our results also indicate that the lengths of interactions with the entertainment-oriented modules positively correspond to the users' ratings of the bot. In contrast, for modules that serve as information providers, i.e., news and news comments, the lengths of the interactions do not predict the ratings.
Tasks Chatbot
Published 2017-11-01
URL https://m.media-amazon.com/images/G/01/mobile-apps/dex/alexa/alexaprize/assets/pdf/2017/Wisemacaw.pdf
PDF https://m.media-amazon.com/images/G/01/mobile-apps/dex/alexa/alexaprize/assets/pdf/2017/Wisemacaw.pdf
PWC https://paperswithcode.com/paper/a-two-layer-dialogue-framework-for-authoring
Repo https://github.com/dk000000000/WiseMacawAI
Framework none
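
A toy rendering of the two-layer design: a top-level flow controller routes each user turn to one of several self-contained conversational modules, so modules can be added, reused, or reordered independently. The module names and keyword routing below are invented for illustration, not the paper's implementation.

```python
# Toy two-layer dialogue framework: a top-layer controller dispatching to
# interchangeable bottom-layer modules. Routing rule is a placeholder.
class NewsModule:
    def respond(self, utterance):
        return "Here is a headline I found interesting today."

class GameModule:
    def respond(self, utterance):
        return "Let's play twenty questions! Think of an object."

class ChitChatModule:
    def respond(self, utterance):
        return "That's interesting, tell me more."

class FlowController:
    """Top layer: routes each turn to one bottom-layer module."""
    def __init__(self):
        self.modules = {"news": NewsModule(), "game": GameModule()}
        self.fallback = ChitChatModule()

    def respond(self, utterance):
        for keyword, module in self.modules.items():
            if keyword in utterance.lower():
                return module.respond(utterance)
        return self.fallback.respond(utterance)

bot = FlowController()
print(bot.respond("Can we talk about the news?"))
```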

A Graph Regularized Deep Neural Network for Unsupervised Image Representation Learning

Title A Graph Regularized Deep Neural Network for Unsupervised Image Representation Learning
Authors Shijie Yang, Liang Li, Shuhui Wang, Weigang Zhang, Qingming Huang
Abstract Deep Auto-Encoder (DAE) has shown its promising power in high-level representation learning. From the perspective of manifold learning, we propose a graph regularized deep neural network (GR-DNN) to endow traditional DAEs with the ability to retain local geometric structure. A deep-structured regularizer is formulated upon multi-layer perceptrons to capture this structure. The robust and discriminative embedding space is learned to simultaneously preserve the high-level semantics and the geometric structure within the local manifold tangent space. Theoretical analysis presents the close relationship between the proposed graph regularizer and the graph Laplacian regularizer in terms of the optimization objective. We also alleviate the growth of the network complexity by introducing an anchor-based bipartite graph, which guarantees good scalability for large-scale data. Experiments on four datasets show that the proposed GR-DNN achieves results comparable to state-of-the-art methods.
Tasks Representation Learning
Published 2017-07-01
URL http://openaccess.thecvf.com/content_cvpr_2017/html/Yang_A_Graph_Regularized_CVPR_2017_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2017/papers/Yang_A_Graph_Regularized_CVPR_2017_paper.pdf
PWC https://paperswithcode.com/paper/a-graph-regularized-deep-neural-network-for
Repo https://github.com/ysjakking/GR-DNN
Framework none
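
The graph regularizer can be sketched as the familiar Laplacian penalty trace(H^T L H), added to an autoencoder's reconstruction loss so that neighbouring inputs keep nearby embeddings. The tiny network, k-NN graph, and weight 0.1 below are illustrative; the paper's deep-structured regularizer and anchor-based bipartite graph are more elaborate.

```python
# Sketch of a graph-regularized autoencoder loss: reconstruction error plus
# a graph Laplacian penalty trace(H^T L H). All sizes are illustrative.
import torch
import torch.nn as nn

x = torch.randn(32, 100)  # a batch of inputs
enc = nn.Linear(100, 16)
dec = nn.Linear(16, 100)

# Build a simple k-NN affinity matrix W and its Laplacian L = D - W.
dists = torch.cdist(x, x)
knn = dists.topk(k=5, largest=False).indices
W = torch.zeros(32, 32)
W.scatter_(1, knn, 1.0)
W = ((W + W.t()) > 0).float()
L = torch.diag(W.sum(dim=1)) - W

h = enc(x)  # latent embeddings
recon_loss = nn.functional.mse_loss(dec(h), x)
graph_loss = torch.trace(h.t() @ L @ h) / x.size(0)  # keeps neighbours close
loss = recon_loss + 0.1 * graph_loss  # 0.1 is an arbitrary trade-off weight
loss.backward()
print(float(loss))
```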

WAT-SL: A Customizable Web Annotation Tool for Segment Labeling

Title WAT-SL: A Customizable Web Annotation Tool for Segment Labeling
Authors Johannes Kiesel, Henning Wachsmuth, Khalid Al-Khatib, Benno Stein
Abstract A frequent type of annotations in text corpora are labeled text segments. General-purpose annotation tools tend to be overly comprehensive, often making the annotation process slower and more error-prone. We present WAT-SL, a new web-based tool that is dedicated to segment labeling and highly customizable to the labeling task at hand. We outline its main features and exemplify how we used it for a crowdsourced corpus with labeled argument units.
Tasks
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-3004/
PDF https://www.aclweb.org/anthology/E17-3004
PWC https://paperswithcode.com/paper/wat-sl-a-customizable-web-annotation-tool-for
Repo https://github.com/webis-de/wat
Framework none

Cost efficient gradient boosting

Title Cost efficient gradient boosting
Authors Sven Peter, Ferran Diego, Fred A. Hamprecht, Boaz Nadler
Abstract Many applications require learning classifiers or regressors that are both accurate and cheap to evaluate. Prediction cost can be drastically reduced if the learned predictor is constructed such that on the majority of the inputs, it uses cheap features and fast evaluations. The main challenge is to do so with little loss in accuracy. In this work we propose a budget-aware strategy based on deep boosted regression trees. In contrast to previous approaches to learning with cost penalties, our method can grow very deep trees that on average are nonetheless cheap to compute. We evaluate our method on a number of datasets and find that it outperforms the current state of the art by a large margin. Our algorithm is easy to implement and its learning time is comparable to that of the original gradient boosting. Source code is made available at http://github.com/svenpeter42/LightGBM-CEGB.
Tasks
Published 2017-12-01
URL http://papers.nips.cc/paper/6753-cost-efficient-gradient-boosting
PDF http://papers.nips.cc/paper/6753-cost-efficient-gradient-boosting.pdf
PWC https://paperswithcode.com/paper/cost-efficient-gradient-boosting
Repo https://github.com/svenpeter42/LightGBM-CEGB
Framework none
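
To the best of my knowledge the method was later merged into mainline LightGBM behind cegb_* parameters; the hedged sketch below shows how such a configuration might look. Treat the parameter names and values as assumptions to verify against the LightGBM documentation for your version.

```python
# Hedged sketch of cost-efficient gradient boosting via LightGBM's cegb_*
# parameters (assumed names; check your LightGBM version). Data is synthetic.
import lightgbm as lgb
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

params = {
    "objective": "binary",
    "verbosity": -1,
    # Cost-penalty knobs (assumed; per-feature penalties also exist):
    "cegb_tradeoff": 1.0,        # weight of prediction cost against loss
    "cegb_penalty_split": 0.01,  # cost charged for each additional split
}
booster = lgb.train(params, lgb.Dataset(X, label=y), num_boost_round=20)
print(booster.predict(X[:5]))
```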

Meet Spinky: An Open-Source Spindle and K-Complex Detection Toolbox Validated on the Open-Access Montreal Archive of Sleep Studies (MASS)

Title Meet Spinky: An Open-Source Spindle and K-Complex Detection Toolbox Validated on the Open-Access Montreal Archive of Sleep Studies (MASS)
Authors Tarek Lajnef, Christian O’Reilly, Etienne Combrisson, Sahbi Chaibi, Jean-Baptiste Eichenlaub, Perrine Ruby, Pierre-Emmanuel Aguera, Mounir Samet, Abdennaceur Kachouri, Sonia Frenette, Julie Carrier, Karim Jerbi
Abstract Sleep spindles and K-complexes are among the most prominent micro-events observed in electroencephalographic (EEG) recordings during sleep. These EEG microstructures are thought to be hallmarks of sleep-related cognitive processes. Although tedious and time-consuming, their identification and quantification is important for sleep studies in both healthy subjects and patients with sleep disorders. Therefore, procedures for automatic detection of spindles and K-complexes could provide valuable assistance to researchers and clinicians in the field. Recently, we proposed a framework for joint spindle and K-complex detection (Lajnef et al., 2015a) based on a Tunable Q-factor Wavelet Transform (TQWT; Selesnick, 2011a) and morphological component analysis (MCA). Using a wide range of performance metrics, the present article provides critical validation and benchmarking of the proposed approach by applying it to open-access EEG data from the Montreal Archive of Sleep Studies (MASS; O’Reilly et al., 2014). Importantly, the obtained scores were compared to alternative methods that were previously tested on the same database. With respect to spindle detection, our method achieved higher performance than most of the alternative methods. This was corroborated with statistical tests that took into account both sensitivity and precision (i.e., the Matthews correlation coefficient (MCC), F1, and Cohen's κ). Our proposed method has been made available to the community via an open-source tool named Spinky (for spindle and K-complex detection). Thanks to a GUI implementation and access to Matlab and Python resources, Spinky is expected to contribute to an open-science approach that will enhance replicability and reliable comparisons of classifier performances for the detection of sleep EEG microstructure in both healthy and patient populations.
Tasks EEG, K-complex detection, Spindle Detection
Published 2017-03-02
URL https://doi.org/10.3389/fninf.2017.00015
PDF https://www.frontiersin.org/articles/10.3389/fninf.2017.00015/pdf
PWC https://paperswithcode.com/paper/meet-spinky-an-open-source-spindle-and-k
Repo https://github.com/TarekLaj/SPINKY
Framework none
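
The scoring the abstract describes boils down to comparing detector output against expert annotations with metrics that account for both sensitivity and precision. A small sketch with scikit-learn, using random labels as stand-ins for real spindle/K-complex annotations:

```python
# Detector-vs-expert scoring with the metrics named in the abstract
# (MCC, F1, Cohen's kappa). Labels here are random placeholders.
import numpy as np
from sklearn.metrics import matthews_corrcoef, f1_score, cohen_kappa_score

rng = np.random.default_rng(0)
expert = rng.integers(0, 2, size=200)                            # gold annotations
detector = np.where(rng.random(200) < 0.85, expert, 1 - expert)  # noisy detector

print("MCC:  ", round(matthews_corrcoef(expert, detector), 3))
print("F1:   ", round(f1_score(expert, detector), 3))
print("kappa:", round(cohen_kappa_score(expert, detector), 3))
```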