Paper Group ANR 543
MMFNet: A Multi-modality MRI Fusion Network for Segmentation of Nasopharyngeal Carcinoma. Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks. Unary and Binary Classification Approaches and their Implications for Authorship Verification. Natural Language Multitasking: Analyzing and Improvi …
MMFNet: A Multi-modality MRI Fusion Network for Segmentation of Nasopharyngeal Carcinoma
Title | MMFNet: A Multi-modality MRI Fusion Network for Segmentation of Nasopharyngeal Carcinoma |
Authors | Huai Chen, Yuxiao Qi, Yong Yin, Tengxiang Li, Xiaoqing Liu, Xiuli Li, Guanzhong Gong, Lisheng Wang |
Abstract | Segmentation of nasopharyngeal carcinoma (NPC) from Magnetic Resonance Images (MRI) is a crucial prerequisite for NPC radiotherapy. However, manual segmentation of NPC is time-consuming and labor-intensive. Additionally, single-modality MRI generally cannot provide enough information for accurate delineation. Therefore, a multi-modality MRI fusion network (MMFNet) based on three modalities of MRI (T1, T2 and contrast-enhanced T1) is proposed to achieve accurate segmentation of NPC. The backbone of MMFNet is designed as a multi-encoder-based network, consisting of several encoders to capture modality-specific features and a single decoder to fuse them and obtain high-level features for NPC segmentation. A fusion block is presented to effectively fuse features from multi-modality MRI. It first recalibrates low-level features captured from modality-specific encoders to highlight both informative features and regions of interest, then fuses the weighted features with a residual fusion block to keep a balance between the fused features and the high-level features from the decoder. Moreover, a training strategy named self-transfer, which uses pre-trained modality-specific encoders to initialize the multi-encoder-based network, is proposed to fully mine information from the different MRI modalities. The proposed method based on multi-modality MRI can effectively segment NPC, and its advantages are validated by extensive experiments. |
Tasks | |
Published | 2018-12-25 |
URL | https://arxiv.org/abs/1812.10033v6 |
https://arxiv.org/pdf/1812.10033v6.pdf | |
PWC | https://paperswithcode.com/paper/mmfnet-a-multi-modality-mri-fusion-network |
Repo | |
Framework | |
Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks
Title | Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks |
Authors | Michelle A. Lee, Yuke Zhu, Krishnan Srinivasan, Parth Shah, Silvio Savarese, Li Fei-Fei, Animesh Garg, Jeannette Bohg |
Abstract | Contact-rich manipulation tasks in unstructured environments often require both haptic and visual feedback. However, it is non-trivial to manually design a robot controller that combines modalities with very different characteristics. While deep reinforcement learning has shown success in learning control policies for high-dimensional inputs, these algorithms are generally intractable to deploy on real robots due to sample complexity. We use self-supervision to learn a compact and multimodal representation of our sensory inputs, which can then be used to improve the sample efficiency of our policy learning. We evaluate our method on a peg insertion task, generalizing over different geometry, configurations, and clearances, while being robust to external perturbations. Results for simulated and real robot experiments are presented. |
Tasks | |
Published | 2018-10-24 |
URL | http://arxiv.org/abs/1810.10191v2 |
http://arxiv.org/pdf/1810.10191v2.pdf | |
PWC | https://paperswithcode.com/paper/making-sense-of-vision-and-touch-self |
Repo | |
Framework | |
Unary and Binary Classification Approaches and their Implications for Authorship Verification
Title | Unary and Binary Classification Approaches and their Implications for Authorship Verification |
Authors | Oren Halvani, Christian Winter, Lukas Graner |
Abstract | Retrieving indexed documents not by their topical content but by their writing style opens the door for a number of applications in information retrieval (IR). One application is to retrieve textual content of a certain author X, where the queried IR system is provided beforehand with a set of reference texts of X. Authorship verification (AV), which is a research subject in the field of digital text forensics, is suitable for this purpose. The task of AV is to determine if two documents (i.e. an indexed and a reference document) have been written by the same author X. Even though AV represents a unary classification problem, a number of existing approaches consider it as a binary classification task. However, the underlying classification model of an AV method has a number of serious implications regarding its prerequisites, evaluability, and applicability. In our comprehensive literature review, we observed several misunderstandings regarding the differentiation of unary and binary AV approaches that require consideration. The objective of this paper is, therefore, to clarify these by proposing clear criteria and new properties that aim to improve the characterization of existing and future AV approaches. Given both, we investigate the applicability of eleven existing unary and binary AV methods as well as four generic unary classification algorithms on two self-compiled corpora. Furthermore, we highlight an important issue concerning the evaluation of AV methods based on fixed decision criteria, which has not received attention in previous AV studies. |
Tasks | Information Retrieval |
Published | 2018-12-31 |
URL | http://arxiv.org/abs/1901.00399v1 |
http://arxiv.org/pdf/1901.00399v1.pdf | |
PWC | https://paperswithcode.com/paper/unary-and-binary-classification-approaches |
Repo | |
Framework | |
Natural Language Multitasking: Analyzing and Improving Syntactic Saliency of Hidden Representations
Title | Natural Language Multitasking: Analyzing and Improving Syntactic Saliency of Hidden Representations |
Authors | Gino Brunner, Yuyi Wang, Roger Wattenhofer, Michael Weigelt |
Abstract | We train multi-task autoencoders on linguistic tasks and analyze the learned hidden sentence representations. The representations change significantly when translation and part-of-speech decoders are added. The more decoders a model employs, the better it clusters sentences according to their syntactic similarity, as the representation space becomes less entangled. We explore the structure of the representation space by interpolating between sentences, which yields interesting pseudo-English sentences, many of which have recognizable syntactic structure. Lastly, we point out an interesting property of our models: The difference-vector between two sentences can be added to change a third sentence with similar features in a meaningful way. |
Tasks | |
Published | 2018-01-18 |
URL | http://arxiv.org/abs/1801.06024v1 |
http://arxiv.org/pdf/1801.06024v1.pdf | |
PWC | https://paperswithcode.com/paper/natural-language-multitasking-analyzing-and |
Repo | |
Framework | |
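The interpolation and difference-vector operations mentioned in the abstract above can be sketched on plain vectors. This is a minimal illustration only: the names `interpolate` and `apply_difference` are hypothetical, the vectors stand in for hidden sentence representations, and the paper's encoders and decoders (which would map sentences to and from such vectors) are not reproduced here.

```python
from typing import List

Vector = List[float]


def interpolate(a: Vector, b: Vector, steps: int) -> List[Vector]:
    """Return steps + 1 evenly spaced points on the straight line from a to b."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    points = []
    for t in range(steps + 1):
        w = t / steps  # interpolation weight, 0.0 at a and 1.0 at b
        points.append([(1 - w) * x + w * y for x, y in zip(a, b)])
    return points


def apply_difference(source: Vector, target: Vector, third: Vector) -> Vector:
    """Add the difference vector (target - source) to a third vector."""
    return [z + (y - x) for x, y, z in zip(source, target, third)]
```

In a full pipeline each resulting vector would be passed through a decoder to obtain a sentence; that step depends on the trained model and is left out.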
How You See Me
Title | How You See Me |
Authors | Rohit Gandikota, Deepak Mishra |
Abstract | Convolutional Neural Networks (CNNs) are among the most powerful tools in the present era of science. A lot of research has been done to improve their performance and robustness, while their internal workings have been left largely unexplored. They are often described as black boxes that can map non-linear data very effectively. This paper tries to show how a CNN has learned to look at an image. The proposed algorithm exploits the basic math of a CNN to backtrack the important pixels it considers when making a prediction. It is a simple algorithm that requires no training of its own on top of a pre-trained classification CNN. |
Tasks | |
Published | 2018-11-20 |
URL | http://arxiv.org/abs/1811.08152v1 |
http://arxiv.org/pdf/1811.08152v1.pdf | |
PWC | https://paperswithcode.com/paper/how-you-see-me |
Repo | |
Framework | |
Escaping Saddle Points in Constrained Optimization
Title | Escaping Saddle Points in Constrained Optimization |
Authors | Aryan Mokhtari, Asuman Ozdaglar, Ali Jadbabaie |
Abstract | In this paper, we study the problem of escaping from saddle points in smooth nonconvex optimization problems subject to a convex set $\mathcal{C}$. We propose a generic framework that yields convergence to a second-order stationary point of the problem, if the convex set $\mathcal{C}$ is simple for a quadratic objective function. Specifically, our results hold if one can find a $\rho$-approximate solution of a quadratic program subject to $\mathcal{C}$ in polynomial time, where $\rho<1$ is a positive constant that depends on the structure of the set $\mathcal{C}$. Under this condition, we show that the sequence of iterates generated by the proposed framework reaches an $(\epsilon,\gamma)$-second order stationary point (SOSP) in at most $\mathcal{O}(\max\{\epsilon^{-2},\rho^{-3}\gamma^{-3}\})$ iterations. We further characterize the overall complexity of reaching an SOSP when the convex set $\mathcal{C}$ can be written as a set of quadratic constraints and the objective function Hessian has a specific structure over the convex set $\mathcal{C}$. Finally, we extend our results to the stochastic setting and characterize the number of stochastic gradient and Hessian evaluations to reach an $(\epsilon,\gamma)$-SOSP. |
Tasks | |
Published | 2018-09-06 |
URL | http://arxiv.org/abs/1809.02162v2 |
http://arxiv.org/pdf/1809.02162v2.pdf | |
PWC | https://paperswithcode.com/paper/escaping-saddle-points-in-constrained |
Repo | |
Framework | |
Pyramid Person Matching Network for Person Re-identification
Title | Pyramid Person Matching Network for Person Re-identification |
Authors | Chaojie Mao, Yingming Li, Zhongfei Zhang, Yaqing Zhang, Xi Li |
Abstract | In this work, we present a deep convolutional pyramid person matching network (PPMN) with a specially designed Pyramid Matching Module to address the problem of person re-identification. The architecture takes a pair of RGB images as input, and outputs a similarity value indicating whether the two input images represent the same person or not. Based on deep convolutional neural networks, our approach first learns the discriminative semantic representation with the semantic-component-aware features for persons and then employs the Pyramid Matching Module to match the common semantic-components of persons, which is robust to the variation of spatial scales and misalignment of locations posed by viewpoint changes. The above two processes are jointly optimized via a unified end-to-end deep learning scheme. Extensive experiments on several benchmark datasets demonstrate the effectiveness of our approach against the state-of-the-art approaches, especially on the rank-1 recognition rate. |
Tasks | Person Re-Identification |
Published | 2018-03-07 |
URL | http://arxiv.org/abs/1803.02547v1 |
http://arxiv.org/pdf/1803.02547v1.pdf | |
PWC | https://paperswithcode.com/paper/pyramid-person-matching-network-for-person-re |
Repo | |
Framework | |
Robust Maximization of Non-Submodular Objectives
Title | Robust Maximization of Non-Submodular Objectives |
Authors | Ilija Bogunovic, Junyao Zhao, Volkan Cevher |
Abstract | We study the problem of maximizing a monotone set function subject to a cardinality constraint $k$ in the setting where some number of elements $\tau$ is deleted from the returned set. The focus of this work is on the worst-case adversarial setting. While there exist constant-factor guarantees when the function is submodular, there are no guarantees for non-submodular objectives. In this work, we present a new algorithm Oblivious-Greedy and prove the first constant-factor approximation guarantees for a wider class of non-submodular objectives. The obtained theoretical bounds are the first constant-factor bounds that also hold in the linear regime, i.e. when the number of deletions $\tau$ is linear in $k$. Our bounds depend on established parameters such as the submodularity ratio and some novel ones such as the inverse curvature. We bound these parameters for two important objectives including support selection and variance reduction. Finally, we numerically demonstrate the robust performance of Oblivious-Greedy for these two objectives on various datasets. |
Tasks | |
Published | 2018-02-20 |
URL | http://arxiv.org/abs/1802.07073v2 |
http://arxiv.org/pdf/1802.07073v2.pdf | |
PWC | https://paperswithcode.com/paper/robust-maximization-of-non-submodular |
Repo | |
Framework | |
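The problem setting in the abstract above (pick at most $k$ elements, then an adversary deletes $\tau$ of them) can be made concrete with a small sketch. Note that this is the standard greedy baseline plus a brute-force worst-case evaluation, not the paper's Oblivious-Greedy algorithm, which the abstract does not specify. The function names and the toy coverage objective are assumptions made for illustration.

```python
from itertools import combinations
from typing import Callable, Dict, Hashable, List, Set

SetFunction = Callable[[Set[Hashable]], float]


def greedy(ground: List[Hashable], f: SetFunction, k: int) -> Set[Hashable]:
    """Standard greedy: repeatedly add the element with the largest marginal gain."""
    selected: Set[Hashable] = set()
    for _ in range(min(k, len(ground))):
        candidates = [e for e in ground if e not in selected]
        best = max(candidates, key=lambda e: f(selected | {e}) - f(selected))
        selected.add(best)
    return selected


def worst_case_value(s: Set[Hashable], f: SetFunction, tau: int) -> float:
    """Value of s after an adversary removes the most damaging tau elements.

    Brute force over all deletions, so only suitable for tiny examples.
    """
    if tau > len(s):
        raise ValueError("tau cannot exceed the size of the set")
    return min(f(s - set(removed)) for removed in combinations(list(s), tau))


# Toy monotone objective (hypothetical data): number of items covered.
SETS: Dict[str, Set[int]] = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5}, "d": {1, 2}}


def coverage(chosen: Set[Hashable]) -> float:
    return float(len(set().union(*(SETS[e] for e in chosen))))
```

Comparing `coverage(greedy(...))` with `worst_case_value(...)` on the same set shows how much a solution can lose under deletions, which is the gap a robust algorithm aims to control.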
Generative Adversarial Networks and Probabilistic Graph Models for Hyperspectral Image Classification
Title | Generative Adversarial Networks and Probabilistic Graph Models for Hyperspectral Image Classification |
Authors | Zilong Zhong, Jonathan Li |
Abstract | High spectral dimensionality and the shortage of annotations make hyperspectral image (HSI) classification a challenging problem. Recent studies suggest that convolutional neural networks can learn discriminative spatial features, which play a paramount role in HSI interpretation. However, most of these methods ignore the distinctive spectral-spatial characteristic of hyperspectral data. In addition, a large amount of unlabeled data remains an unexploited gold mine for efficient data use. Therefore, we proposed an integration of generative adversarial networks (GANs) and probabilistic graphical models for HSI classification. Specifically, we used a spectral-spatial generator and a discriminator to identify land cover categories of hyperspectral cubes. Moreover, to take advantage of a large amount of unlabeled data, we adopted a conditional random field to refine the preliminary classification results generated by GANs. Experimental results obtained using two commonly studied datasets demonstrate that the proposed framework achieved encouraging classification accuracy using a small amount of training data. |
Tasks | Hyperspectral Image Classification, Image Classification |
Published | 2018-02-10 |
URL | http://arxiv.org/abs/1802.03495v1 |
http://arxiv.org/pdf/1802.03495v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-adversarial-networks-and-1 |
Repo | |
Framework | |
Ontology Alignment in the Biomedical Domain Using Entity Definitions and Context
Title | Ontology Alignment in the Biomedical Domain Using Entity Definitions and Context |
Authors | Lucy Lu Wang, Chandra Bhagavatula, Mark Neumann, Kyle Lo, Chris Wilhelm, Waleed Ammar |
Abstract | Ontology alignment is the task of identifying semantically equivalent entities from two given ontologies. Different ontologies have different representations of the same entity, resulting in a need to de-duplicate entities when merging ontologies. We propose a method for enriching entities in an ontology with external definition and context information, and use this additional information for ontology alignment. We develop a neural architecture capable of encoding the additional information when available, and show that the addition of external data results in an F1-score of 0.69 on the Ontology Alignment Evaluation Initiative (OAEI) largebio SNOMED-NCI subtask, comparable with the entity-level matchers in a SOTA system. |
Tasks | |
Published | 2018-06-20 |
URL | http://arxiv.org/abs/1806.07976v1 |
http://arxiv.org/pdf/1806.07976v1.pdf | |
PWC | https://paperswithcode.com/paper/ontology-alignment-in-the-biomedical-domain |
Repo | |
Framework | |
Multiagent Soft Q-Learning
Title | Multiagent Soft Q-Learning |
Authors | Ermo Wei, Drew Wicke, David Freelan, Sean Luke |
Abstract | Policy gradient methods are often applied to reinforcement learning in continuous multiagent games. These methods perform local search in the joint-action space, and as we show, they are susceptible to a game-theoretic pathology known as relative overgeneralization. To resolve this issue, we propose Multiagent Soft Q-learning, which can be seen as the analogue of applying Q-learning to continuous controls. We compare our method to MADDPG, a state-of-the-art approach, and show that our method achieves better coordination in multiagent cooperative tasks, converging to better local optima in the joint action space. |
Tasks | Policy Gradient Methods, Q-Learning |
Published | 2018-04-25 |
URL | http://arxiv.org/abs/1804.09817v1 |
http://arxiv.org/pdf/1804.09817v1.pdf | |
PWC | https://paperswithcode.com/paper/multiagent-soft-q-learning |
Repo | |
Framework | |
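The abstract above does not state the update rule, so the following is only a sketch of the quantities usually associated with single-agent soft Q-learning, restricted to a finite action set for readability: the log-sum-exp "soft" value and the corresponding Boltzmann policy. The temperature name `alpha` and both function names are assumptions, and the multiagent, continuous-action method of the paper is not reproduced.

```python
import math
from typing import List


def soft_value(q_values: List[float], alpha: float) -> float:
    """Soft value: alpha * log(sum_a exp(Q(a) / alpha)), computed stably."""
    m = max(q_values)  # subtract the max before exponentiating to avoid overflow
    return m + alpha * math.log(sum(math.exp((q - m) / alpha) for q in q_values))


def soft_policy(q_values: List[float], alpha: float) -> List[float]:
    """Boltzmann policy: pi(a) = exp((Q(a) - V) / alpha), which sums to 1."""
    v = soft_value(q_values, alpha)
    return [math.exp((q - v) / alpha) for q in q_values]
```

As `alpha` shrinks, the soft value approaches the ordinary maximum and the policy concentrates on the best action; larger values of `alpha` keep probability mass on lower-valued actions, which encourages exploration.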
Zipf’s law in 50 languages: its structural pattern, linguistic interpretation, and cognitive motivation
Title | Zipf’s law in 50 languages: its structural pattern, linguistic interpretation, and cognitive motivation |
Authors | Shuiyuan Yu, Chunshan Xu, Haitao Liu |
Abstract | Zipf’s law has been found in many human-related fields, including language, where the frequency of a word is persistently found to be a power-law function of its frequency rank. However, there is much dispute over whether it is a universal law or a statistical artifact, and little is known about what mechanisms may have shaped it. To answer these questions, this study conducted a large-scale cross-language investigation into Zipf’s law. The statistical results show that Zipf’s laws in 50 languages all share a 3-segment structural pattern, with each segment demonstrating distinctive linguistic properties and the lower segment invariably bending downwards to deviate from theoretical expectation. This finding indicates that this deviation is a fundamental and universal feature of word frequency distributions in natural languages, not the statistical error of low frequency words. A computer simulation based on the dual-process theory yields Zipf’s law with the same structural pattern, suggesting that Zipf’s law of natural languages is motivated by common cognitive mechanisms. These results show that Zipf’s law in languages is motivated by cognitive mechanisms like dual-processing that govern human verbal behaviors. |
Tasks | |
Published | 2018-07-05 |
URL | http://arxiv.org/abs/1807.01855v1 |
http://arxiv.org/pdf/1807.01855v1.pdf | |
PWC | https://paperswithcode.com/paper/zipfs-law-in-50-languages-its-structural |
Repo | |
Framework | |
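The power-law relation between word frequency and frequency rank described in the abstract above can be checked on any token list with a short sketch. The function names are hypothetical, the fit is an ordinary least-squares line in log-log space (one common and fairly crude estimator), and the paper's 3-segment analysis is not reproduced.

```python
import math
from collections import Counter
from typing import List, Sequence


def rank_frequency(tokens: Sequence[str]) -> List[int]:
    """Word frequencies sorted from most to least frequent (rank 1 first)."""
    return sorted(Counter(tokens).values(), reverse=True)


def fit_zipf_exponent(freqs: Sequence[float]) -> float:
    """Least-squares slope of log(frequency) against log(rank), sign-flipped.

    For an ideal Zipf distribution f(r) = C / r**s this returns s.
    """
    if len(freqs) < 2:
        raise ValueError("need at least two ranks to fit a slope")
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx
```

A single global slope hides exactly the kind of segment-wise deviation the abstract reports; fitting separate rank ranges would be one way to look for it.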
Semi-Semantic Line-Cluster Assisted Monocular SLAM for Indoor Environments
Title | Semi-Semantic Line-Cluster Assisted Monocular SLAM for Indoor Environments |
Authors | Ting Sun, Dezhen Song, Dit-Yan Yeung, Ming Liu |
Abstract | This paper presents a novel method to reduce the scale drift for indoor monocular simultaneous localization and mapping (SLAM). We leverage the prior knowledge that in the indoor environment, the line segments form tight clusters, e.g. many door frames in a straight corridor are of the same shape, size and orientation, so the same edges of these door frames form a tight line segment cluster. We implement our method in the popular ORB-SLAM2, which also serves as our baseline. In the front end we detect the line segments in each frame and incrementally cluster them in the 3D space. In the back end, we optimize the map imposing the constraint that the line segments of the same cluster should be the same. Experimental results show that our proposed method successfully reduces the scale drift for indoor monocular SLAM. |
Tasks | Simultaneous Localization and Mapping |
Published | 2018-11-05 |
URL | http://arxiv.org/abs/1811.01592v1 |
http://arxiv.org/pdf/1811.01592v1.pdf | |
PWC | https://paperswithcode.com/paper/semi-semantic-line-cluster-assisted-monocular |
Repo | |
Framework | |
Action Anticipation By Predicting Future Dynamic Images
Title | Action Anticipation By Predicting Future Dynamic Images |
Authors | Cristian Rodriguez, Basura Fernando, Hongdong Li |
Abstract | Human action-anticipation methods predict the future action by observing only a small portion of an action in progress. This is critical for applications where computers have to react to human actions as early as possible, such as autonomous driving, human-robot interaction, and assistive robotics, among others. In this paper, we present a method for human action anticipation by predicting the most plausible future human motion. We represent human motion using Dynamic Images and make use of tailored loss functions to encourage a generative model to produce accurate future motion prediction. Our method outperforms the currently best performing action-anticipation methods by 4% on JHMDB-21, 5.2% on UT-Interaction and 5.1% on UCF 101-24 benchmarks. |
Tasks | Autonomous Driving, Motion Prediction |
Published | 2018-08-01 |
URL | http://arxiv.org/abs/1808.00141v1 |
http://arxiv.org/pdf/1808.00141v1.pdf | |
PWC | https://paperswithcode.com/paper/action-anticipation-by-predicting-future |
Repo | |
Framework | |
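The abstract above relies on Dynamic Images but does not say how they are computed. One simple linear approximation used in the dynamic-image literature weights frame $t$ of a $T$-frame clip by $\alpha_t = 2t - T - 1$ and sums the weighted frames. The sketch below assumes that approximation and represents each frame as a flat list of pixel values; the function name is hypothetical, and the paper's generative model and loss functions are not shown.

```python
from typing import List, Sequence


def dynamic_image(frames: Sequence[Sequence[float]]) -> List[float]:
    """Collapse a clip into one image using weights alpha_t = 2t - T - 1.

    Later frames get positive weights and earlier frames negative ones, so
    the result encodes the direction of temporal change at each pixel.
    """
    if not frames:
        raise ValueError("need at least one frame")
    total = len(frames)
    weights = [2 * t - total - 1 for t in range(1, total + 1)]
    n_pixels = len(frames[0])
    return [
        sum(w * frame[i] for w, frame in zip(weights, frames))
        for i in range(n_pixels)
    ]
```

Because the weights sum to zero, pixels that never change contribute nothing, which is why such an image emphasizes motion over static appearance.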
Simultaneous Localization and Layout Model Selection in Manhattan Worlds
Title | Simultaneous Localization and Layout Model Selection in Manhattan Worlds |
Authors | Armon Shariati, Bernd Pfrommer, Camillo J. Taylor |
Abstract | In this paper, we will demonstrate how Manhattan structure can be exploited to transform the Simultaneous Localization and Mapping (SLAM) problem, which is typically solved by a nonlinear optimization over feature positions, into a model selection problem solved by a convex optimization over higher order layout structures, namely walls, floors, and ceilings. Furthermore, we show how our novel formulation leads to an optimization procedure that automatically performs data association and loop closure and which ultimately produces the simplest model of the environment that is consistent with the available measurements. We verify our method on real world data sets collected with various sensing modalities. |
Tasks | Model Selection, Simultaneous Localization and Mapping |
Published | 2018-09-11 |
URL | http://arxiv.org/abs/1809.04135v3 |
http://arxiv.org/pdf/1809.04135v3.pdf | |
PWC | https://paperswithcode.com/paper/simultaneous-localization-and-layout-model |
Repo | |
Framework | |