Paper Group ANR 150
CT synthesis from MR images for orthopedic applications in the lower arm using a conditional generative adversarial network
Title | CT synthesis from MR images for orthopedic applications in the lower arm using a conditional generative adversarial network |
Authors | Frank Zijlstra, Koen Willemsen, Mateusz C. Florkow, Ralph J. B. Sakkers, Harrie H. Weinans, Bart C. H. van der Wal, Marijn van Stralen, Peter R. Seevinck |
Abstract | Purpose: To assess the feasibility of deep learning-based high-resolution synthetic CT generation from MRI scans of the lower arm for orthopedic applications. Methods: A conditional Generative Adversarial Network was trained to synthesize CT images from multi-echo MR images. A training set of MRI and CT scans of 9 ex vivo lower arms was acquired and the CT images were registered to the MRI images. Three-fold cross-validation was applied to generate independent results for the entire dataset. The synthetic CT images were quantitatively evaluated with the mean absolute error metric, and Dice similarity and surface-to-surface distance on cortical bone segmentations. Results: The mean absolute error was 63.5 HU on the overall tissue volume and 144.2 HU on the cortical bone. The mean Dice similarity of the cortical bone segmentations was 0.86. The average surface-to-surface distance between bone on real and synthetic CT was 0.48 mm. Qualitatively, the synthetic CT images corresponded well with the real CT scans and partially maintained high resolution structures in the trabecular bone. The bone segmentations on synthetic CT images showed some false positives on tendons, but the general shape of the bone was accurately reconstructed. Conclusions: This study demonstrates that high quality synthetic CT can be generated from MRI scans of the lower arm. The good correspondence of the bone segmentations demonstrates that synthetic CT could be competitive with real CT in applications that depend on such segmentations, such as planning of orthopedic surgery and 3D printing. |
Tasks | |
Published | 2019-01-24 |
URL | http://arxiv.org/abs/1901.08449v1 |
http://arxiv.org/pdf/1901.08449v1.pdf | |
PWC | https://paperswithcode.com/paper/ct-synthesis-from-mr-images-for-orthopedic |
Repo | |
Framework | |
Mix and Match: Markov Chains & Mixing Times for Matching in Rideshare
Title | Mix and Match: Markov Chains & Mixing Times for Matching in Rideshare |
Authors | Michael J. Curry, John P. Dickerson, Karthik Abinav Sankararaman, Aravind Srinivasan, Yuhao Wan, Pan Xu |
Abstract | Rideshare platforms such as Uber and Lyft dynamically dispatch drivers to match riders’ requests. We model the dispatching process in rideshare as a Markov chain that takes into account the geographic mobility of both drivers and riders over time. Prior work explores dispatch policies in the limit of such Markov chains; we characterize when this limit assumption is valid, under a variety of natural dispatch policies. We give explicit bounds on convergence in general, and exact (including constants) convergence rates for special cases. Then, on simulated and real transit data, we show that our bounds characterize convergence rates – even when the necessary theoretical assumptions are relaxed. Additionally, these policies compare well against a standard reinforcement learning algorithm which optimizes for profit without any convergence properties. |
Tasks | |
Published | 2019-11-30 |
URL | https://arxiv.org/abs/1912.00225v1 |
https://arxiv.org/pdf/1912.00225v1.pdf | |
PWC | https://paperswithcode.com/paper/mix-and-match-markov-chains-mixing-times-for |
Repo | |
Framework | |
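The paper's central object — convergence of a dispatch Markov chain to its stationary limit — can be illustrated with a minimal numpy sketch. The 3-location transition matrix below is invented for illustration; the (epsilon-)mixing time is measured in total variation distance, as is standard:

```python
import numpy as np

# Hypothetical 3-location transition matrix for driver movement
# (rows sum to 1; all entries positive, so the chain is ergodic).
P = np.array([
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.3, 0.3, 0.4],
])

# Stationary distribution: the left eigenvector of P for eigenvalue 1.
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
pi = pi / pi.sum()

def tv_distance(mu, nu):
    """Total variation distance between two distributions."""
    return 0.5 * np.abs(mu - nu).sum()

# Starting from a point mass, the distribution converges to pi; the
# epsilon-mixing time is the first t with TV distance below epsilon.
mu = np.array([1.0, 0.0, 0.0])
eps = 1e-3
t = 0
while tv_distance(mu, pi) > eps:
    mu = mu @ P
    t += 1
print(t, tv_distance(mu, pi))
```

The paper's bounds concern how this `t` scales for realistic dispatch policies, where the chain's state also tracks driver/rider matchings rather than locations alone.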
To complete or to estimate, that is the question: A Multi-Task Approach to Depth Completion and Monocular Depth Estimation
Title | To complete or to estimate, that is the question: A Multi-Task Approach to Depth Completion and Monocular Depth Estimation |
Authors | Amir Atapour-Abarghouei, Toby P. Breckon |
Abstract | Robust three-dimensional scene understanding is now an ever-growing area of research highly relevant in many real-world applications such as autonomous driving and robotic navigation. In this paper, we propose a multi-task learning-based model capable of performing two tasks: sparse depth completion (i.e. generating complete dense scene depth given a sparse depth image as the input) and monocular depth estimation (i.e. predicting scene depth from a single RGB image) via two sub-networks jointly trained end to end using data randomly sampled from a publicly available corpus of synthetic and real-world images. The first sub-network generates a sparse depth image by learning lower level features from the scene and the second predicts a full dense depth image of the entire scene, leading to a better geometric and contextual understanding of the scene and, as a result, superior performance of the approach. The entire model can be used to infer complete scene depth from a single RGB image, or the second network can be used alone to perform depth completion given a sparse depth input. Using adversarial training, a robust objective function, a deep architecture relying on skip connections and a blend of synthetic and real-world training data, our approach is capable of producing superior high quality scene depth. Extensive experimental evaluation demonstrates the efficacy of our approach compared to contemporary state-of-the-art techniques across both problem domains. |
Tasks | Autonomous Driving, Depth Completion, Depth Estimation, Monocular Depth Estimation, Multi-Task Learning, Scene Understanding |
Published | 2019-08-15 |
URL | https://arxiv.org/abs/1908.05540v1 |
https://arxiv.org/pdf/1908.05540v1.pdf | |
PWC | https://paperswithcode.com/paper/to-complete-or-to-estimate-that-is-the |
Repo | |
Framework | |
Option Comparison Network for Multiple-choice Reading Comprehension
Title | Option Comparison Network for Multiple-choice Reading Comprehension |
Authors | Qiu Ran, Peng Li, Weiwei Hu, Jie Zhou |
Abstract | Multiple-choice reading comprehension (MCRC) is the task of selecting the correct answer from multiple options given a question and an article. Existing MCRC models typically either read each option independently or compute a fixed-length representation for each option before comparing them. However, humans typically compare the options at multiple granularity levels before reading the article in detail to make reasoning more efficient. Mimicking humans, we propose an option comparison network (OCN) for MCRC which compares options at word level to better identify their correlations and help reasoning. Specifically, each option is encoded into a vector sequence using a skimmer to retain as much fine-grained information as possible. An attention mechanism is leveraged to compare these sequences vector-by-vector to identify more subtle correlations between options, which is potentially valuable for reasoning. Experimental results on the human English exam MCRC dataset RACE show that our model outperforms existing methods significantly. Moreover, it is also the first model to surpass Amazon Mechanical Turker performance on the whole dataset. |
Tasks | Reading Comprehension |
Published | 2019-03-07 |
URL | http://arxiv.org/abs/1903.03033v1 |
http://arxiv.org/pdf/1903.03033v1.pdf | |
PWC | https://paperswithcode.com/paper/option-comparison-network-for-multiple-choice |
Repo | |
Framework | |
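The word-level option comparison described above can be sketched with plain numpy. The encoder ("skimmer") outputs are faked with random vectors, and the comparison function (attention followed by element-wise difference and product) is one common choice in the matching literature, not necessarily the paper's exact formulation:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # hidden size (assumed)

# Hypothetical skimmer outputs: each option encoded as a vector sequence.
option_a = rng.normal(size=(5, d))   # 5 tokens
option_b = rng.normal(size=(7, d))   # 7 tokens

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def compare(opt, other):
    """Attend from each vector of `opt` over `other`, then combine each
    vector with its attended summary -- a word-level comparison feature."""
    scores = opt @ other.T                 # (len_opt, len_other)
    attn = softmax(scores, axis=-1)        # each row is a distribution
    summary = attn @ other                 # attended view of the other option
    # Element-wise difference and product are common comparison functions.
    return np.concatenate([opt - summary, opt * summary], axis=-1)

features_a = compare(option_a, option_b)
print(features_a.shape)  # one comparison feature per word of option A
```

In the full model these per-word features would be fed, together with the article encoding, into the answer-scoring layers.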
Solving Electrical Impedance Tomography with Deep Learning
Title | Solving Electrical Impedance Tomography with Deep Learning |
Authors | Yuwei Fan, Lexing Ying |
Abstract | This paper introduces a new approach for solving electrical impedance tomography (EIT) problems using deep neural networks. The mathematical problem of EIT is to invert the electrical conductivity from the Dirichlet-to-Neumann (DtN) map. Both the forward map from the electrical conductivity to the DtN map and the inverse map are high-dimensional and nonlinear. Motivated by the linear perturbative analysis of the forward map and based on a numerically low-rank property, we propose compact neural network architectures for the forward and inverse maps for both 2D and 3D problems. Numerical results demonstrate the efficiency of the proposed neural networks. |
Tasks | |
Published | 2019-06-06 |
URL | https://arxiv.org/abs/1906.03944v2 |
https://arxiv.org/pdf/1906.03944v2.pdf | |
PWC | https://paperswithcode.com/paper/solving-electrical-impedance-tomography-with |
Repo | |
Framework | |
COSET: A Benchmark for Evaluating Neural Program Embeddings
Title | COSET: A Benchmark for Evaluating Neural Program Embeddings |
Authors | Ke Wang, Mihai Christodorescu |
Abstract | Neural program embedding can be helpful in analyzing large software, a task that is challenging for traditional logic-based program analyses due to their limited scalability. A key focus of recent machine-learning advances in this area is on modeling program semantics instead of just syntax. Unfortunately, evaluating such advances is not obvious, as program semantics does not lend itself to straightforward metrics. In this paper, we introduce a benchmarking framework called COSET for standardizing the evaluation of neural program embeddings. COSET consists of a diverse dataset of programs in source-code format, labeled by human experts according to a number of program properties of interest. A point of novelty is a suite of program transformations included in COSET. These transformations, when applied to the base dataset, can simulate natural changes to program code due to optimization and refactoring and can serve as a “debugging” tool for classification mistakes. We conducted a pilot study on four prominent models: TreeLSTM, gated graph neural network (GGNN), AST-Path neural network (APNN), and DYPRO. We found that COSET is useful in identifying the strengths and limitations of each model and in pinpointing specific syntactic and semantic characteristics of programs that pose challenges. |
Tasks | |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.11445v1 |
https://arxiv.org/pdf/1905.11445v1.pdf | |
PWC | https://paperswithcode.com/paper/coset-a-benchmark-for-evaluating-neural |
Repo | |
Framework | |
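A semantics-preserving transformation of the kind COSET's suite contains can be sketched with Python's `ast` module (Python 3.9+ for `ast.unparse`). The renamer below is illustrative only — it rewrites every `Name` node, so it assumes the snippet uses no globals or builtins:

```python
import ast

class RenameVariables(ast.NodeTransformer):
    """Consistently rename local variables -- a transformation that
    changes the syntax of a program while preserving its semantics."""
    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        node.id = self.mapping.get(node.id, node.id)
        return node

    def visit_arg(self, node):
        node.arg = self.mapping.get(node.arg, node.arg)
        return node

src = "def add(a, b):\n    total = a + b\n    return total\n"
tree = RenameVariables({"a": "x", "b": "y", "total": "s"}).visit(ast.parse(src))
renamed = ast.unparse(tree)

# Both versions compute the same function; a semantics-focused embedding
# should assign them very similar representations.
env1, env2 = {}, {}
exec(src, env1)
exec(renamed, env2)
print(env1["add"](2, 3), env2["add"](2, 3))
```

Applying such transformations to a labeled base dataset and checking whether a model's predictions flip is exactly the "debugging" use case the abstract describes.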
Learning Hierarchically Structured Concepts
Title | Learning Hierarchically Structured Concepts |
Authors | Nancy Lynch, Frederik Mallmann-Trenn |
Abstract | We study the question of how concepts that have structure get represented in the brain. Specifically, we introduce a model for hierarchically structured concepts and we show how a biologically plausible neural network can recognize these concepts, and how it can learn them in the first place. Our main goal is to introduce a general framework for these tasks and prove formally how both (recognition and learning) can be achieved. We show that both tasks can be accomplished even in the presence of noise. For learning, we formally analyze Oja’s rule, a well-known biologically plausible rule for adjusting the weights of synapses. We complement the learning results with lower bounds asserting that, in order to recognize concepts of a certain hierarchical depth, neural networks must have a corresponding number of layers. |
Tasks | |
Published | 2019-09-10 |
URL | https://arxiv.org/abs/1909.04559v2 |
https://arxiv.org/pdf/1909.04559v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-hierarchically-structured-concepts |
Repo | |
Framework | |
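Oja's rule, which the paper analyzes formally, is compact enough to demonstrate directly: on anisotropic data the weight vector converges to the top principal direction while its norm stays bounded. The data, learning rate, and sample count below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)

# Anisotropic data whose dominant direction is true_dir: Oja's rule
# should recover the top principal component of the input covariance.
true_dir = np.array([0.8, 0.6])
X = rng.normal(size=(5000, 2)) * np.array([3.0, 0.5])  # std 3 vs std 0.5
R = np.array([[0.8, -0.6], [0.6, 0.8]])                # rotation: e1 -> true_dir
X = X @ R.T

w = rng.normal(size=2)
eta = 0.01
for x in X:
    y = w @ x
    # Oja's rule: a Hebbian term (eta * y * x) plus a decay (-eta * y^2 * w)
    # that keeps the synaptic weight vector from blowing up.
    w += eta * y * (x - y * w)

w_hat = w / np.linalg.norm(w)
print(abs(w_hat @ true_dir))  # cosine similarity with the true direction
```

The decay term is what distinguishes Oja's rule from plain Hebbian learning: without it, `w` grows without bound instead of settling on the unit principal eigenvector.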
Integer Discrete Flows and Lossless Compression
Title | Integer Discrete Flows and Lossless Compression |
Authors | Emiel Hoogeboom, Jorn W. T. Peters, Rianne van den Berg, Max Welling |
Abstract | Lossless compression methods shorten the expected representation size of data without loss of information, using a statistical model. Flow-based models are attractive in this setting because they admit exact likelihood optimization, which is equivalent to minimizing the expected number of bits per message. However, conventional flows assume continuous data, which may lead to reconstruction errors when quantized for compression. For that reason, we introduce a flow-based generative model for ordinal discrete data called Integer Discrete Flow (IDF): a bijective integer map that can learn rich transformations on high-dimensional data. As building blocks for IDFs, we introduce a flexible transformation layer called integer discrete coupling. Our experiments show that IDFs are competitive with other flow-based generative models. Furthermore, we demonstrate that IDF-based compression achieves state-of-the-art lossless compression rates on CIFAR10, ImageNet32, and ImageNet64. To the best of our knowledge, this is the first lossless compression method that uses invertible neural networks. |
Tasks | |
Published | 2019-05-17 |
URL | https://arxiv.org/abs/1905.07376v4 |
https://arxiv.org/pdf/1905.07376v4.pdf | |
PWC | https://paperswithcode.com/paper/integer-discrete-flows-and-lossless |
Repo | |
Framework | |
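The integer discrete coupling layer described in the abstract admits a tiny numpy sketch: split the input, shift one half by a rounded function of the other, and the map is exactly invertible on the integers. Here `t` is a stand-in for the rounded neural network the paper uses:

```python
import numpy as np

def coupling_forward(x, t):
    """Integer discrete coupling: split x, then shift the second half by
    an integer-valued function of the first. Because the shift depends
    only on x1, the layer is trivially invertible."""
    x1, x2 = np.split(x, 2)
    return np.concatenate([x1, x2 + t(x1)])

def coupling_inverse(y, t):
    y1, y2 = np.split(y, 2)
    return np.concatenate([y1, y2 - t(y1)])

# A toy "network": rounding keeps the output on the integers, which is
# what makes the flow suitable for exact lossless compression.
t = lambda z: np.round(1.5 * z).astype(np.int64)

x = np.array([3, -1, 4, 7], dtype=np.int64)
y = coupling_forward(x, t)
x_back = coupling_inverse(y, t)
print(y, x_back)
```

Stacking many such layers (with permutations between them) yields the bijective integer map the paper calls an IDF; no quantization error is ever introduced because the data never leaves the integers.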
Correlation Clustering with Adaptive Similarity Queries
Title | Correlation Clustering with Adaptive Similarity Queries |
Authors | Marco Bressan, Nicolò Cesa-Bianchi, Andrea Paudice, Fabio Vitale |
Abstract | In correlation clustering, we are given $n$ objects together with a binary similarity score between each pair of them. The goal is to partition the objects into clusters so as to minimise the disagreements with the scores. In this work we investigate correlation clustering as an active learning problem: each similarity score can be learned by making a query, and the goal is to minimise both the disagreements and the total number of queries. On the one hand, we describe simple active learning algorithms, which provably achieve an almost optimal trade-off while giving cluster recovery guarantees, and we test them on different datasets. On the other hand, we prove information-theoretical bounds on the number of queries necessary to guarantee a prescribed disagreement bound. These results give a rich characterization of the trade-off between queries and clustering error. |
Tasks | Active Learning |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.11902v3 |
https://arxiv.org/pdf/1905.11902v3.pdf | |
PWC | https://paperswithcode.com/paper/correlation-clustering-with-adaptive |
Repo | |
Framework | |
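The active-learning view — pay one query per revealed similarity score — can be illustrated with a pivot-based scheme in the spirit of KwikCluster. This is a simplified noiseless sketch, not the paper's algorithm; its point is only that far fewer than all $\binom{n}{2}$ scores need to be queried:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hidden ground-truth clustering of n objects (the oracle's answers).
n = 30
labels = rng.integers(0, 3, size=n)

queries = 0
def query(i, j):
    """Similarity oracle: one query reveals one pairwise score."""
    global queries
    queries += 1
    return labels[i] == labels[j]

# Pivot-based active clustering: pick a random pivot, query it against
# the remaining objects, and group the matches into one cluster.
unassigned = list(range(n))
clusters = []
while unassigned:
    pivot = unassigned.pop(rng.integers(len(unassigned)))
    members = [pivot] + [j for j in unassigned if query(pivot, j)]
    unassigned = [j for j in unassigned if j not in members]
    clusters.append(members)

print(len(clusters), queries)  # queries grow like n * #clusters, not n^2
```

With a noiseless oracle this recovers the hidden partition exactly; the paper's analysis concerns the harder regime where scores disagree with any clustering and the query budget must be traded off against disagreements.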
Bridging Dialogue Generation and Facial Expression Synthesis
Title | Bridging Dialogue Generation and Facial Expression Synthesis |
Authors | Shang-Yu Su, Yun-Nung Chen |
Abstract | Spoken dialogue systems that assist users to solve complex tasks such as movie ticket booking have become an emerging research topic in artificial intelligence and natural language processing areas. With a well-designed dialogue system as an intelligent personal assistant, people can accomplish certain tasks more easily via natural language interactions. Today there are several virtual intelligent assistants in the market; however, most systems focus on only a single modality, such as textual or vocal interaction. A multimodal interface has various advantages: (1) allowing humans to communicate with machines in a natural and concise form, using the mixture of modalities that most precisely conveys the intention to satisfy communication needs; and (2) providing a more engaging experience through natural and human-like feedback. This paper explores a brand new research direction, which aims at bridging dialogue generation and facial expression synthesis for better multimodal interaction. The goal is to generate dialogue responses and simultaneously synthesize corresponding visual expressions on faces, which is also an ultimate step toward more human-like virtual assistants. |
Tasks | Dialogue Generation, Spoken Dialogue Systems |
Published | 2019-05-24 |
URL | https://arxiv.org/abs/1905.11240v2 |
https://arxiv.org/pdf/1905.11240v2.pdf | |
PWC | https://paperswithcode.com/paper/bridging-dialogue-generation-and-facial |
Repo | |
Framework | |
Theme-aware generation model for chinese lyrics
Title | Theme-aware generation model for chinese lyrics |
Authors | Jie Wang, Xinyan Zhao |
Abstract | With the rapid development of neural networks, deep learning has been extended to various natural language generation fields, such as machine translation, dialogue generation and even literature creation. In this paper, we propose a theme-aware language generation model for Chinese music lyrics, which greatly improves the theme-connectivity and coherence of generated paragraphs. A multi-channel sequence-to-sequence (seq2seq) model encodes themes and previous sentences as global and local contextual information. Moreover, an attention mechanism is incorporated for sequence decoding, enabling the decoder to fuse context into the predicted text. To prepare an appropriate training corpus, LDA (Latent Dirichlet Allocation) is applied for theme extraction. The generated lyrics are grammatically correct and semantically coherent with the selected themes, offering a valuable modelling method for other fields including multi-turn chatbots and long paragraph generation. |
Tasks | Dialogue Generation, Machine Translation, Text Generation |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1906.02134v1 |
https://arxiv.org/pdf/1906.02134v1.pdf | |
PWC | https://paperswithcode.com/paper/190602134 |
Repo | |
Framework | |
A General Optimization-based Framework for Local Odometry Estimation with Multiple Sensors
Title | A General Optimization-based Framework for Local Odometry Estimation with Multiple Sensors |
Authors | Tong Qin, Jie Pan, Shaozu Cao, Shaojie Shen |
Abstract | Nowadays, robots are equipped with more and more sensors to increase robustness and autonomy. We have seen various sensor suites on different platforms, such as stereo cameras on ground vehicles, a monocular camera with an IMU (Inertial Measurement Unit) on mobile phones, and stereo cameras with an IMU on aerial robots. Although many algorithms for state estimation have been proposed in the past, they are usually applied to a single sensor or a specific sensor suite. Few of them can be employed with multiple sensor choices. In this paper, we propose a general optimization-based framework for odometry estimation which supports multiple sensor sets. Every sensor is treated as a general factor in our framework. Factors which share common state variables are summed together to build the optimization problem. We further demonstrate the generality with visual and inertial sensors, which form three sensor suites (stereo cameras, a monocular camera with an IMU, and stereo cameras with an IMU). We validate the performance of our system on public datasets and through real-world experiments with multiple sensors. Results are compared against other state-of-the-art algorithms. We highlight that our system is a general framework, which can easily fuse various sensors in a pose graph optimization. Our implementations are open source\footnote{https://github.com/HKUST-Aerial-Robotics/VINS-Fusion}. |
Tasks | Visual Odometry |
Published | 2019-01-11 |
URL | http://arxiv.org/abs/1901.03638v1 |
http://arxiv.org/pdf/1901.03638v1.pdf | |
PWC | https://paperswithcode.com/paper/a-general-optimization-based-framework-for |
Repo | |
Framework | |
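The "every sensor is a general factor" idea reduces, in the linear 1D case, to a weighted least-squares problem assembled from per-measurement rows. The toy below mixes a prior factor, odometry factors, and a loop-closure-style factor; all numbers are invented for illustration and the real system optimizes 6-DoF poses with nonlinear factors:

```python
import numpy as np

# Each measurement is a factor: a residual tying a subset of the states
# together. Summing the (here linear) factors yields one least-squares
# problem -- a scalar toy version of pose-graph optimization.
n = 4                      # 1D poses x0..x3
A_rows, b_rows, w_rows = [], [], []

def add_factor(row, meas, weight):
    A_rows.append(row); b_rows.append(meas); w_rows.append(weight)

# Prior factor anchoring x0 at 0 (e.g. an absolute sensor).
add_factor([1, 0, 0, 0], 0.0, 100.0)
# Odometry factors: x_{i+1} - x_i = measured displacement.
for i, d in enumerate([1.0, 1.1, 0.9]):
    row = [0] * n
    row[i], row[i + 1] = -1, 1
    add_factor(row, d, 1.0)
# Loop-closure-style factor: x3 - x0 measured directly.
add_factor([-1, 0, 0, 1], 3.05, 1.0)

A = np.array(A_rows, dtype=float)
b = np.array(b_rows)
W = np.diag(w_rows)
# Weighted normal equations: (A^T W A) x = A^T W b
x = np.linalg.solve(A.T @ W @ A, A.T @ W @ b)
print(x)
```

Adding a new sensor amounts to appending more factor rows; no factor needs to know which other sensors exist, which is exactly the modularity the framework advertises.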
A Study of Multilingual Neural Machine Translation
Title | A Study of Multilingual Neural Machine Translation |
Authors | Xu Tan, Yichong Leng, Jiale Chen, Yi Ren, Tao Qin, Tie-Yan Liu |
Abstract | Multilingual neural machine translation (NMT) has recently been investigated from different aspects (e.g., pivot translation, zero-shot translation, fine-tuning, or training from scratch) and in different settings (e.g., rich resource and low resource, one-to-many, and many-to-one translation). This paper concentrates on a deep understanding of multilingual NMT and conducts a comprehensive study on a multilingual dataset with more than 20 languages. Our results show that (1) low-resource language pairs benefit much from multilingual training, while rich-resource language pairs may get hurt under limited model capacity and training with similar languages benefits more than dissimilar languages; (2) fine-tuning performs better than training from scratch in the one-to-many setting while training from scratch performs better in the many-to-one setting; (3) the bottom layers of the encoder and top layers of the decoder capture more language-specific information, and just fine-tuning these parts can achieve good accuracy for low-resource language pairs; (4) direct translation is better than pivot translation when the source language is similar to the target language (e.g., in the same language branch), even when the size of direct training data is much smaller; (5) given a fixed training data budget, it is better to introduce more languages into multilingual training for zero-shot translation. |
Tasks | Machine Translation |
Published | 2019-12-25 |
URL | https://arxiv.org/abs/1912.11625v1 |
https://arxiv.org/pdf/1912.11625v1.pdf | |
PWC | https://paperswithcode.com/paper/a-study-of-multilingual-neural-machine |
Repo | |
Framework | |
Projective Quadratic Regression for Online Learning
Title | Projective Quadratic Regression for Online Learning |
Authors | Wenye Ma |
Abstract | This paper considers online convex optimization (OCO) problems - the paramount framework for online learning algorithm design. The loss function of a learning task in the OCO setting is based on streaming data, so OCO is a powerful tool for modelling large-scale applications such as online recommender systems. Meanwhile, real-world data are usually extremely high-dimensional due to modern feature engineering techniques, making quadratic regression impractical. Factorization Machines and their variants are efficient models for capturing feature interactions with a low-rank matrix model, but they cannot fulfill the OCO setting due to their non-convexity. In this paper, we propose a projective quadratic regression (PQR) model. First, it can capture important second-order feature information. Second, it is a convex model, so the requirements of OCO are fulfilled and the global optimal solution can be achieved. Moreover, existing online optimization methods such as Online Gradient Descent (OGD) or Follow-The-Regularized-Leader (FTRL) can be applied directly. In addition, by choosing a proper hyper-parameter, we show that it has the same order of space and time complexity as the linear model and can thus handle high-dimensional data. Experimental results demonstrate the performance of the proposed PQR model in terms of accuracy and efficiency in comparison with state-of-the-art methods. |
Tasks | Feature Engineering, Recommendation Systems |
Published | 2019-11-25 |
URL | https://arxiv.org/abs/1911.10658v1 |
https://arxiv.org/pdf/1911.10658v1.pdf | |
PWC | https://paperswithcode.com/paper/projective-quadratic-regression-for-online |
Repo | |
Framework | |
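The abstract's recipe — keep second-order feature information while staying convex — can be imitated with a fixed random projection: quadratic terms are computed on the projected input, so the model remains linear (hence convex) in its trainable parameters while the feature count grows by only k(k+1)/2 instead of d(d+1)/2. This sketch is only loosely inspired by the paper; the actual PQR construction may differ:

```python
import numpy as np

rng = np.random.default_rng(3)

d, k = 50, 5                                # input dim, projection dim (k << d)
P = rng.normal(size=(k, d)) / np.sqrt(d)    # fixed (non-trainable) projection

def features(x):
    """Linear features plus quadratic terms of the projected input."""
    z = P @ x
    quad = np.outer(z, z)[np.triu_indices(k)]   # k*(k+1)/2 second-order terms
    return np.concatenate([x, quad])

def predict(w, x):
    return w @ features(x)

w = np.zeros(d + k * (k + 1) // 2)

# Online gradient descent on the squared loss -- a standard OCO update.
# The target is chosen to be exactly representable by one quadratic feature.
eta = 0.005
for _ in range(4000):
    x = rng.normal(size=d)
    y = (P[0] @ x) ** 2
    w -= eta * (predict(w, x) - y) * features(x)

x_test = rng.normal(size=d)
print(predict(w, x_test), (P[0] @ x_test) ** 2)
```

Because the parameter vector enters linearly, the squared loss is convex in `w`, so OGD (or FTRL) enjoys its usual regret guarantees while the per-step cost stays close to that of a linear model.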
Personalized Dialogue Generation with Diversified Traits
Title | Personalized Dialogue Generation with Diversified Traits |
Authors | Yinhe Zheng, Guanyi Chen, Minlie Huang, Song Liu, Xuan Zhu |
Abstract | Endowing a dialogue system with particular personality traits is essential to deliver more human-like conversations. However, due to the challenge of embodying personality via language expression and the lack of large-scale persona-labeled dialogue data, this research problem is still far from well studied. In this paper, we investigate the problem of incorporating explicit personality traits in dialogue generation to deliver personalized dialogues. To this end, firstly, we construct PersonalDialog, a large-scale multi-turn dialogue dataset containing various traits from a large number of speakers. The dataset consists of 20.83M sessions and 56.25M utterances from 8.47M speakers. Each utterance is associated with a speaker who is marked with traits like Age, Gender, Location, Interest Tags, etc. Several anonymization schemes are designed to protect the privacy of each speaker. This large-scale dataset will facilitate not only the study of personalized dialogue generation, but also other research in sociolinguistics and social science. Secondly, to study how personality traits can be captured and addressed in dialogue generation, we propose persona-aware dialogue generation models within the sequence-to-sequence learning framework. Explicit personality traits (structured as key-value pairs) are embedded using a trait fusion module. During the decoding process, two techniques, namely persona-aware attention and persona-aware bias, are devised to capture and address trait-related information. Experiments demonstrate that our model is able to address proper traits in different contexts. Case studies also show interesting results for this challenging research problem. |
Tasks | Dialogue Generation |
Published | 2019-01-28 |
URL | https://arxiv.org/abs/1901.09672v2 |
https://arxiv.org/pdf/1901.09672v2.pdf | |
PWC | https://paperswithcode.com/paper/personalized-dialogue-generation-with |
Repo | |
Framework | |