Paper Group ANR 886
Batch Face Alignment using a Low-rank GAN
Title | Batch Face Alignment using a Low-rank GAN |
Authors | Jiabo Huang, Xiaohua Xie, Wei-Shi Zheng |
Abstract | This paper studies the problem of aligning a set of face images of the same individual into a normalized image while removing outliers such as partial occlusion, extreme facial expression, and significant illumination variation. Our model seeks an optimal image domain transformation such that the matrix of misaligned images can be decomposed as the sum of a sparse matrix of noise and a rank-one matrix of aligned images. The image transformation is learned in an unsupervised manner, which means that ground-truth aligned images are unnecessary for our model. Specifically, we make use of the remarkable non-linear transforming ability of a generative adversarial network (GAN) and guide it with low-rank generation as well as a sparse noise constraint to achieve the face alignment. We verify the efficacy of the proposed model with extensive experiments on real-world face databases, demonstrating higher accuracy and efficiency than existing methods. |
Tasks | Face Alignment |
Published | 2019-10-21 |
URL | https://arxiv.org/abs/1910.09244v1 |
https://arxiv.org/pdf/1910.09244v1.pdf | |
PWC | https://paperswithcode.com/paper/batch-face-alignment-using-a-low-rank-gan |
Repo | |
Framework | |
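The decomposition named in the abstract above (observed images = rank-one aligned part + sparse noise) can be illustrated on toy data. The following is only a minimal sketch under strong assumptions: the images are already aligned, and a per-pixel median stands in for the paper's GAN-based model, which is not reproduced here. The function name `split_rank_one_and_sparse` and the toy pixel values are hypothetical.

```python
from statistics import median

def split_rank_one_and_sparse(images):
    """images: list of equal-length pixel lists, one per (already aligned) image.

    Returns (template, residuals): the per-pixel median across images, which
    spans the rank-one part, and the per-image residuals, which hold the outliers.
    NOTE: a median is an illustrative stand-in, not the method of the paper.
    """
    n_pixels = len(images[0])
    template = [median(img[p] for img in images) for p in range(n_pixels)]
    residuals = [[img[p] - template[p] for p in range(n_pixels)] for img in images]
    return template, residuals

# Toy data: five copies of a 4-pixel "face", two of them with one corrupted pixel.
face = [0.2, 0.5, 0.9, 0.4]
images = [list(face) for _ in range(5)]
images[1][2] = 0.0   # simulated occlusion
images[3][0] = 1.0   # simulated illumination outlier

template, residuals = split_rank_one_and_sparse(images)
nonzero = sum(1 for row in residuals for r in row if abs(r) > 1e-12)
print(template)  # estimate of the shared face
print(nonzero)   # number of entries assigned to the sparse part
```

Every image shares the same template, so the stacked aligned part has rank one, and only the corrupted pixels end up in the residual.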
A Survey of Crowdsourcing in Medical Image Analysis
Title | A Survey of Crowdsourcing in Medical Image Analysis |
Authors | Silas Ørting, Andrew Doyle, Arno van Hilten, Matthias Hirth, Oana Inel, Christopher R. Madan, Panagiotis Mavridis, Helen Spiers, Veronika Cheplygina |
Abstract | Rapid advances in image processing capabilities have been seen across many domains, fostered by the application of machine learning algorithms to “big-data”. However, within the realm of medical image analysis, advances have been curtailed, in part, due to the limited availability of large-scale, well-annotated datasets. One of the main reasons for this is the high cost often associated with producing large amounts of high-quality meta-data. Recently, there has been growing interest in the application of crowdsourcing for this purpose; a technique that has proven effective for creating large-scale datasets across a range of disciplines, from computer vision to astrophysics. Despite the growing popularity of this approach, there has not yet been a comprehensive literature review to provide guidance to researchers considering using crowdsourcing methodologies in their own medical imaging analysis. In this survey, we review studies applying crowdsourcing to the analysis of medical images, published prior to July 2018. We identify common approaches, challenges and considerations, providing guidance of utility to researchers adopting this approach. Finally, we discuss future opportunities for development within this emerging domain. |
Tasks | |
Published | 2019-02-25 |
URL | https://arxiv.org/abs/1902.09159v2 |
https://arxiv.org/pdf/1902.09159v2.pdf | |
PWC | https://paperswithcode.com/paper/a-survey-of-crowdsourcing-in-medical-image |
Repo | |
Framework | |
An evolutionary model that satisfies detailed balance
Title | An evolutionary model that satisfies detailed balance |
Authors | Jüri Lember, Chris Watkins |
Abstract | We propose a class of evolutionary models that involves an arbitrary exchangeable process as the breeding process and different selection schemes. In those models, a new genome is born according to the breeding process, and then a genome is removed according to the selection scheme that involves fitness. Thus the population size remains constant. The process evolves according to a Markov chain, and, unlike in many other existing models, the stationary distribution – so called mutation-selection equilibrium – can be easily found and studied. The behaviour of the stationary distribution when the population size increases is our main object of interest. Several phase-transition theorems are proved. |
Tasks | |
Published | 2019-02-27 |
URL | http://arxiv.org/abs/1902.10834v1 |
http://arxiv.org/pdf/1902.10834v1.pdf | |
PWC | https://paperswithcode.com/paper/an-evolutionary-model-that-satisfies-detailed |
Repo | |
Framework | |
How to define co-occurrence in different domains of study?
Title | How to define co-occurrence in different domains of study? |
Authors | Mathieu Roche |
Abstract | This position paper presents a comparative study of co-occurrences. Some similarities and differences in the definition exist depending on the research domain (e.g. linguistics, NLP, computer science). This paper discusses these points, and deals with the methodological aspects in order to identify co-occurrences in a multidisciplinary paradigm. |
Tasks | |
Published | 2019-04-16 |
URL | http://arxiv.org/abs/1904.08010v1 |
http://arxiv.org/pdf/1904.08010v1.pdf | |
PWC | https://paperswithcode.com/paper/how-to-define-co-occurrence-in-different |
Repo | |
Framework | |
3D Deep Affine-Invariant Shape Learning for Brain MR Image Segmentation
Title | 3D Deep Affine-Invariant Shape Learning for Brain MR Image Segmentation |
Authors | Zhou He, Siqi Bao, Albert Chung |
Abstract | Recent advancements in medical image segmentation techniques have achieved compelling results. However, most of the widely used approaches do not take into account any prior knowledge about the shape of the biomedical structures being segmented. More recently, some works have presented approaches to incorporate shape information. However, many of them in fact introduce more parameters to the segmentation network to learn general features, which any segmentation network is able to learn, rather than shape-specific features. In this paper, we present a novel approach that seamlessly integrates the shape information into the segmentation network. Experiments on human brain MRI segmentation demonstrate that our approach can achieve a lower Hausdorff distance and higher Dice coefficient than the state-of-the-art approaches. |
Tasks | Medical Image Segmentation, Semantic Segmentation |
Published | 2019-09-14 |
URL | https://arxiv.org/abs/1909.06629v2 |
https://arxiv.org/pdf/1909.06629v2.pdf | |
PWC | https://paperswithcode.com/paper/3d-deep-affine-invariant-shape-learning-for |
Repo | |
Framework | |
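The two evaluation metrics named in the abstract above have standard definitions, sketched here for segmentations represented as sets of voxel coordinates. The function names and the toy voxel sets are assumptions for illustration; the paper's own evaluation code is not shown in the source.

```python
import math

def dice_coefficient(a, b):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two sets of voxel coordinates."""
    if not a and not b:
        return 1.0  # convention: two empty masks agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))

def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    def directed(src, dst):
        # Largest distance from a point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)
    return max(directed(a, b), directed(b, a))

# Hypothetical toy masks (3 voxels each, 2 shared).
prediction = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
ground_truth = {(0, 0, 0), (0, 0, 1), (0, 0, 3)}

dice = dice_coefficient(prediction, ground_truth)
hd = hausdorff_distance(prediction, ground_truth)
print(dice, hd)
```

Higher Dice means more overlap, while a lower Hausdorff distance means the worst-case boundary error is smaller, which is why the abstract reports them in opposite directions.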
Efficient Object Annotation via Speaking and Pointing
Title | Efficient Object Annotation via Speaking and Pointing |
Authors | Michael Gygli, Vittorio Ferrari |
Abstract | Deep neural networks deliver state-of-the-art visual recognition, but they rely on large datasets, which are time-consuming to annotate. These datasets are typically annotated in two stages: (1) determining the presence of object classes at the image level and (2) marking the spatial extent for all objects of these classes. In this work we use speech, together with mouse inputs, to speed up this process. We first improve stage one, by letting annotators indicate object class presence via speech. We then combine the two stages: annotators draw an object bounding box via the mouse and simultaneously provide its class label via speech. Using speech has distinct advantages over relying on mouse inputs alone. First, it is fast and allows for direct access to the class name, by simply saying it. Second, annotators can simultaneously speak and mark an object location. Finally, speech-based interfaces can be kept extremely simple, hence using them requires less mouse movement compared to existing approaches. Through extensive experiments on the COCO and ILSVRC datasets we show that our approach yields high-quality annotations at significant speed gains. Stage one takes 2.3x - 14.9x less annotation time than existing methods based on a hierarchical organization of the classes to be annotated. Moreover, when combining the two stages, we find that object class labels come for free: annotating them at the same time as bounding boxes has zero additional cost. On COCO, this makes the overall process 1.9x faster than the two-stage approach. |
Tasks | |
Published | 2019-05-25 |
URL | https://arxiv.org/abs/1905.10576v4 |
https://arxiv.org/pdf/1905.10576v4.pdf | |
PWC | https://paperswithcode.com/paper/efficient-object-annotation-via-speaking-and |
Repo | |
Framework | |
Learning Multi-agent Communication under Limited-bandwidth Restriction for Internet Packet Routing
Title | Learning Multi-agent Communication under Limited-bandwidth Restriction for Internet Packet Routing |
Authors | Hangyu Mao, Zhibo Gong, Zhengchao Zhang, Zhen Xiao, Yan Ni |
Abstract | Communication is an important factor for the big multi-agent world to stay organized and productive. Recently, the AI community has applied Deep Reinforcement Learning (DRL) to learn the communication strategy and the control policy for multiple agents. However, when implementing the communication for real-world multi-agent applications, there is a more practical limited-bandwidth restriction, which has been largely ignored by the existing DRL-based methods. Specifically, agents trained by most previous methods keep sending messages incessantly in every control cycle; because they emit too many messages, these methods are unsuitable for real-world systems that have a limited bandwidth to transmit the messages. To handle this problem, we propose a gating mechanism to adaptively prune unprofitable messages. Results show that the gating mechanism can prune more than 80% of messages with little damage to the performance. Moreover, our method outperforms several state-of-the-art DRL-based and rule-based methods by a large margin in both the real-world packet routing tasks and four benchmark tasks. |
Tasks | |
Published | 2019-02-26 |
URL | http://arxiv.org/abs/1903.05561v1 |
http://arxiv.org/pdf/1903.05561v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-multi-agent-communication-under |
Repo | |
Framework | |
From PAC to Instance-Optimal Sample Complexity in the Plackett-Luce Model
Title | From PAC to Instance-Optimal Sample Complexity in the Plackett-Luce Model |
Authors | Aadirupa Saha, Aditya Gopalan |
Abstract | We consider PAC-learning a good item from $k$-subsetwise feedback information sampled from a Plackett-Luce probability model, with instance-dependent sample complexity performance. In the setting where subsets of a fixed size can be tested and top-ranked feedback is made available to the learner, we give an algorithm with optimal instance-dependent sample complexity, for PAC best arm identification, of $O\bigg(\frac{\theta_{[k]}}{k}\sum_{i = 2}^n\max\Big(1,\frac{1}{\Delta_i^2}\Big) \ln\frac{k}{\delta}\Big(\ln \frac{1}{\Delta_i}\Big)\bigg)$, $\Delta_i$ being the Plackett-Luce parameter gap between the best and the $i^{th}$ best item, and $\theta_{[k]}$ being the sum of the Plackett-Luce parameters for the top-$k$ items. The algorithm is based on a wrapper around a PAC winner-finding algorithm with weaker performance guarantees to adapt to the hardness of the input instance. The sample complexity is also shown to be multiplicatively better depending on the length of rank-ordered feedback available in each subset-wise play. We show optimality of our algorithms with matching sample complexity lower bounds. We next address the winner-finding problem in Plackett-Luce models in the fixed-budget setting with instance dependent upper and lower bounds on the misidentification probability, of $\Omega\left(\exp(-2 \tilde \Delta Q) \right)$ for a given budget $Q$, where $\tilde \Delta$ is an explicit instance-dependent problem complexity parameter. Numerical performance results are also reported. |
Tasks | |
Published | 2019-03-01 |
URL | https://arxiv.org/abs/1903.00558v2 |
https://arxiv.org/pdf/1903.00558v2.pdf | |
PWC | https://paperswithcode.com/paper/from-pac-to-instance-optimal-sample |
Repo | |
Framework | |
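For readers unfamiliar with the feedback model in the abstract above, the sketch below shows how top-ranked and full-ranking feedback are generated under a Plackett-Luce model: an item wins a subset with probability proportional to its parameter. This only illustrates the sampling model, not the paper's learning algorithm; the function names and parameter values are hypothetical.

```python
import random

def winner_probabilities(theta, subset):
    """P(i wins | subset) = theta[i] / sum of theta[j] over j in subset."""
    total = sum(theta[i] for i in subset)
    return {i: theta[i] / total for i in subset}

def sample_ranking(theta, subset, rng):
    """Draw a full ranking by repeatedly sampling a winner from the remaining items."""
    remaining = list(subset)
    ranking = []
    while remaining:
        weights = [theta[i] for i in remaining]
        winner = rng.choices(remaining, weights=weights, k=1)[0]
        ranking.append(winner)
        remaining.remove(winner)
    return ranking

# Hypothetical Plackett-Luce parameters for four items; item 0 is the best.
theta = [0.5, 0.3, 0.2, 0.1]
probs = winner_probabilities(theta, [0, 2, 3])
ranking = sample_ranking(theta, [0, 2, 3], random.Random(0))
print(probs)
print(ranking)
```

Top-ranked feedback corresponds to observing only the first element of `ranking`; longer rank-ordered feedback reveals a longer prefix.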
Document-Level $N$-ary Relation Extraction with Multiscale Representation Learning
Title | Document-Level $N$-ary Relation Extraction with Multiscale Representation Learning |
Authors | Robin Jia, Cliff Wong, Hoifung Poon |
Abstract | Most information extraction methods focus on binary relations expressed within single sentences. In high-value domains, however, $n$-ary relations are of great demand (e.g., drug-gene-mutation interactions in precision oncology). Such relations often involve entity mentions that are far apart in the document, yet existing work on cross-sentence relation extraction is generally confined to small text spans (e.g., three consecutive sentences), which severely limits recall. In this paper, we propose a novel multiscale neural architecture for document-level $n$-ary relation extraction. Our system combines representations learned over various text spans throughout the document and across the subrelation hierarchy. Widening the system’s purview to the entire document maximizes potential recall. Moreover, by integrating weak signals across the document, multiscale modeling increases precision, even in the presence of noisy labels from distant supervision. Experiments on biomedical machine reading show that our approach substantially outperforms previous $n$-ary relation extraction methods. |
Tasks | Reading Comprehension, Relation Extraction, Representation Learning |
Published | 2019-04-04 |
URL | https://arxiv.org/abs/1904.02347v3 |
https://arxiv.org/pdf/1904.02347v3.pdf | |
PWC | https://paperswithcode.com/paper/document-level-n-ary-relation-extraction-with |
Repo | |
Framework | |
Personalizing ASR for Dysarthric and Accented Speech with Limited Data
Title | Personalizing ASR for Dysarthric and Accented Speech with Limited Data |
Authors | Joel Shor, Dotan Emanuel, Oran Lang, Omry Tuval, Michael Brenner, Julie Cattiau, Fernando Vieira, Maeve McNally, Taylor Charbonneau, Melissa Nollstadt, Avinatan Hassidim, Yossi Matias |
Abstract | Automatic speech recognition (ASR) systems have dramatically improved over the last few years. ASR systems are most often trained from ‘typical’ speech, which means that underrepresented groups don’t experience the same level of improvement. In this paper, we present and evaluate finetuning techniques to improve ASR for users with non-standard speech. We focus on two types of non-standard speech: speech from people with amyotrophic lateral sclerosis (ALS) and accented speech. We train personalized models that achieve 62% and 35% relative WER improvement on these two groups, bringing the absolute WER for ALS speakers, on a test set of message bank phrases, down to 10% for mild dysarthria and 20% for more serious dysarthria. We show that 71% of the improvement comes from only 5 minutes of training data. Finetuning a particular subset of layers (with many fewer parameters) often gives better results than finetuning the entire model. This is the first step towards building state of the art ASR models for dysarthric speech. |
Tasks | Speech Recognition |
Published | 2019-07-31 |
URL | https://arxiv.org/abs/1907.13511v1 |
https://arxiv.org/pdf/1907.13511v1.pdf | |
PWC | https://paperswithcode.com/paper/personalizing-asr-for-dysarthric-and-accented |
Repo | |
Framework | |
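The abstract above reports results as absolute and relative word error rate (WER). As a reminder of how those quantities are conventionally defined, here is a minimal sketch: WER is the word-level edit distance divided by the reference length, and relative improvement is the fractional reduction from a baseline. The function names, example sentences, and WER values are hypothetical and are not taken from the paper; the reference is assumed to be non-empty.

```python
def word_error_rate(reference, hypothesis):
    """(substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] = edit distance between the ref words seen so far and the first j hyp words
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (r != h)))   # substitution or match
        prev = curr
    return prev[-1] / len(ref)

def relative_improvement(baseline_wer, new_wer):
    """Fraction of the baseline error removed by the new model."""
    return (baseline_wer - new_wer) / baseline_wer

wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
gain = relative_improvement(0.40, 0.30)
print(wer, gain)
```

A relative figure says how much of the baseline error was removed, so the same relative gain can correspond to very different absolute WERs.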
Using Word Embeddings for Visual Data Exploration with Ontodia and Wikidata
Title | Using Word Embeddings for Visual Data Exploration with Ontodia and Wikidata |
Authors | Gerhard Wohlgenannt, Nikolay Klimov, Dmitry Mouromtsev, Daniil Razdyakonov, Dmitry Pavlov, Yury Emelyanov |
Abstract | One of the big challenges in Linked Data consumption is to create visual and natural language interfaces to the data usable for non-technical users. Ontodia provides support for diagrammatic data exploration, showcased in this publication in combination with the Wikidata dataset. We present improvements to the natural language interface regarding exploring and querying Linked Data entities. The method uses models of distributional semantics to find and rank entity properties related to user input in Ontodia. Various word embedding types and model settings are evaluated, and the results show that user experience in visual data exploration benefits from the proposed approach. |
Tasks | Word Embeddings |
Published | 2019-03-04 |
URL | http://arxiv.org/abs/1903.01275v1 |
http://arxiv.org/pdf/1903.01275v1.pdf | |
PWC | https://paperswithcode.com/paper/using-word-embeddings-for-visual-data |
Repo | |
Framework | |
Factorized Inference in Deep Markov Models for Incomplete Multimodal Time Series
Title | Factorized Inference in Deep Markov Models for Incomplete Multimodal Time Series |
Authors | Tan Zhi-Xuan, Harold Soh, Desmond C. Ong |
Abstract | Integrating deep learning with latent state space models has the potential to yield temporal models that are powerful, yet tractable and interpretable. Unfortunately, current models are not designed to handle missing data or multiple data modalities, which are both prevalent in real-world data. In this work, we introduce a factorized inference method for Multimodal Deep Markov Models (MDMMs), allowing us to filter and smooth in the presence of missing data, while also performing uncertainty-aware multimodal fusion. We derive this method by factorizing the posterior p(z|x) for non-linear state space models, and develop a variational backward-forward algorithm for inference. Because our method handles incompleteness over both time and modalities, it is capable of interpolation, extrapolation, conditional generation, label prediction, and weakly supervised learning of multimodal time series. We demonstrate these capabilities on both synthetic and real-world multimodal data under high levels of data deletion. Our method performs well even with more than 50% missing data, and outperforms existing deep approaches to inference in latent time series. |
Tasks | Time Series |
Published | 2019-05-30 |
URL | https://arxiv.org/abs/1905.13570v3 |
https://arxiv.org/pdf/1905.13570v3.pdf | |
PWC | https://paperswithcode.com/paper/factorized-inference-in-deep-markov-models |
Repo | |
Framework | |
Tucker Decomposition Network: Expressive Power and Comparison
Title | Tucker Decomposition Network: Expressive Power and Comparison |
Authors | Ye Liu, Junjun Pan, Michael Ng |
Abstract | Deep neural networks have achieved great success in solving many machine learning and computer vision problems. The main contribution of this paper is to develop a deep network based on Tucker tensor decomposition, and analyze its expressive power. It is shown that a Tucker network is more expressive than a shallow network. In general, an exponential number of nodes is required in a shallow network in order to represent a Tucker network. Experimental results are also given to compare the performance of the proposed Tucker network with a hierarchical tensor network and a shallow network, and demonstrate the usefulness of the Tucker network in image classification problems. |
Tasks | Image Classification |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.09635v1 |
https://arxiv.org/pdf/1905.09635v1.pdf | |
PWC | https://paperswithcode.com/paper/tucker-decomposition-network-expressive-power |
Repo | |
Framework | |
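As background for the entry above, the sketch below shows the Tucker format itself for a third-order tensor: a small core multiplied along each mode by a factor matrix. It is not the network proposed in the paper; the function names, the tiny rank-(1,1,1) example, and the 32x32x32 shape are assumptions chosen only to keep the arithmetic easy to check.

```python
def tucker_reconstruct(core, U, V, W):
    """X[i][j][k] = sum over a,b,c of core[a][b][c] * U[i][a] * V[j][b] * W[k][c]."""
    I, J, K = len(U), len(V), len(W)
    A, B, C = len(core), len(core[0]), len(core[0][0])
    return [[[sum(core[a][b][c] * U[i][a] * V[j][b] * W[k][c]
                  for a in range(A) for b in range(B) for c in range(C))
              for k in range(K)] for j in range(J)] for i in range(I)]

def parameter_counts(shape, ranks):
    """Entries stored by the full tensor vs. by the Tucker core plus factor matrices."""
    full = shape[0] * shape[1] * shape[2]
    tucker = ranks[0] * ranks[1] * ranks[2] + sum(n * r for n, r in zip(shape, ranks))
    return full, tucker

# Hypothetical 2x2x2 tensor with a 1x1x1 core.
core = [[[2.0]]]
U = [[1.0], [2.0]]
V = [[1.0], [3.0]]
W = [[1.0], [0.5]]
X = tucker_reconstruct(core, U, V, W)
full, tucker = parameter_counts((32, 32, 32), (4, 4, 4))
print(X[1][1][1], full, tucker)
```

The parameter comparison illustrates why low multilinear ranks make the format compact.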
An Enhanced Ad Event-Prediction Method Based on Feature Engineering
Title | An Enhanced Ad Event-Prediction Method Based on Feature Engineering |
Authors | Saeid Soheily Khah, Yiming Wu |
Abstract | In digital advertising, Click-Through Rate (CTR) and Conversion Rate (CVR) are very important metrics for evaluating ad performance. As a result, ad event prediction systems are vital and widely used for sponsored search and display advertising as well as Real-Time Bidding (RTB). In this work, we introduce an enhanced method for ad event prediction (i.e. clicks, conversions) by proposing a new efficient feature engineering approach. A large real-world event-based dataset of a running marketing campaign is used to evaluate the efficiency of the proposed prediction algorithm. The results illustrate the benefits of the proposed ad event prediction approach, which significantly outperforms the alternative ones. |
Tasks | Feature Engineering |
Published | 2019-07-03 |
URL | https://arxiv.org/abs/1907.01959v1 |
https://arxiv.org/pdf/1907.01959v1.pdf | |
PWC | https://paperswithcode.com/paper/an-enhanced-ad-event-prediction-method-based |
Repo | |
Framework | |
Resource Optimized Neural Architecture Search for 3D Medical Image Segmentation
Title | Resource Optimized Neural Architecture Search for 3D Medical Image Segmentation |
Authors | Woong Bae, Seungho Lee, Yeha Lee, Beomhee Park, Minki Chung, Kyu-Hwan Jung |
Abstract | Neural Architecture Search (NAS), a framework which automates the task of designing neural networks, has recently been actively studied in the field of deep learning. However, there are only a few NAS methods suitable for 3D medical image segmentation. Medical 3D images are generally very large; thus it is difficult to apply previous NAS methods due to their GPU computational burden and long training time. We propose a resource-optimized neural architecture search method which can be applied to 3D medical segmentation tasks in a short training time (1.39 days for a 1GB dataset) using a small amount of computation power (one RTX 2080Ti, 10.8GB GPU memory). Excellent performance can also be achieved without retraining (fine-tuning), which is essential in most NAS methods. These advantages can be achieved by using a reinforcement learning-based controller with parameter sharing and focusing on the optimal search space configuration of macro search rather than micro search. Our experiments demonstrate that the proposed NAS method outperforms manually designed networks with state-of-the-art performance in 3D medical image segmentation. |
Tasks | Medical Image Segmentation, Neural Architecture Search, Semantic Segmentation |
Published | 2019-09-02 |
URL | https://arxiv.org/abs/1909.00548v1 |
https://arxiv.org/pdf/1909.00548v1.pdf | |
PWC | https://paperswithcode.com/paper/resource-optimized-neural-architecture-search |
Repo | |
Framework | |