Paper Group ANR 900
Amplification by Shuffling: From Local to Central Differential Privacy via Anonymity
Title | Amplification by Shuffling: From Local to Central Differential Privacy via Anonymity |
Authors | Úlfar Erlingsson, Vitaly Feldman, Ilya Mironov, Ananth Raghunathan, Kunal Talwar, Abhradeep Thakurta |
Abstract | Sensitive statistics are often collected across sets of users, with repeated collection of reports done over time. For example, trends in users’ private preferences or software usage may be monitored via such reports. We study the collection of such statistics in the local differential privacy (LDP) model, and describe an algorithm whose privacy cost is polylogarithmic in the number of changes to a user’s value. More fundamentally—by building on anonymity of the users’ reports—we also demonstrate how the privacy cost of our LDP algorithm can actually be much lower when viewed in the central model of differential privacy. We show, via a new and general privacy amplification technique, that any permutation-invariant algorithm satisfying $\varepsilon$-local differential privacy will satisfy $(O(\varepsilon \sqrt{\log(1/\delta)/n}), \delta)$-central differential privacy. By this, we explain how the high noise and $\sqrt{n}$ overhead of LDP protocols is a consequence of them being significantly more private in the central model. As a practical corollary, our results imply that several LDP-based industrial deployments may have much lower privacy cost than their advertised $\varepsilon$ would indicate—at least if reports are anonymized. |
Tasks | |
Published | 2018-11-29 |
URL | http://arxiv.org/abs/1811.12469v1 |
http://arxiv.org/pdf/1811.12469v1.pdf | |
PWC | https://paperswithcode.com/paper/amplification-by-shuffling-from-local-to |
Repo | |
Framework | |
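The headline guarantee in the abstract is easy to evaluate numerically. The sketch below plugs a local $\varepsilon$, a report count $n$, and a $\delta$ into the $O(\varepsilon \sqrt{\log(1/\delta)/n})$ form; the constant `c` and the function name are illustrative assumptions, since only the asymptotic rate is stated above, and this is not the authors' code.

```python
import math

def shuffled_central_epsilon(local_eps, n, delta, c=12.0):
    """Illustrative evaluation of the amplification-by-shuffling bound:
    an eps-LDP, permutation-invariant protocol over n anonymized reports
    satisfies roughly (c * eps * sqrt(log(1/delta) / n), delta)-central DP.
    The constant c is a placeholder; the paper derives the exact expression.
    """
    return c * local_eps * math.sqrt(math.log(1.0 / delta) / n)

# Example: a local eps of 2 over one million anonymized reports.
print(shuffled_central_epsilon(local_eps=2.0, n=1_000_000, delta=1e-6))
```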
Deformable Object Tracking with Gated Fusion
Title | Deformable Object Tracking with Gated Fusion |
Authors | Wenxi Liu, Yibing Song, Dengsheng Chen, Shengfeng He, Yuanlong Yu, Tao Yan, Gerhard P. Hancke, Rynson W. H. Lau |
Abstract | The tracking-by-detection framework has received growing attention through its integration with Convolutional Neural Networks (CNNs). Existing tracking-by-detection based methods, however, fail to track objects with severe appearance variations. This is because the traditional convolutional operation is performed on fixed grids, and thus may not be able to find the correct response while the object is changing pose or under varying environmental conditions. In this paper, we propose a deformable convolution layer to enrich the target appearance representations in the tracking-by-detection framework. We aim to capture the target appearance variations via deformable convolution, which adaptively enhances its original features. In addition, we also propose a gated fusion scheme to control how the variations captured by the deformable convolution affect the original appearance. The enriched feature representation through deformable convolution helps the CNN classifier discriminate the target object from the background. Extensive experiments on the standard benchmarks show that the proposed tracker performs favorably against state-of-the-art methods. |
Tasks | Object Tracking |
Published | 2018-09-27 |
URL | http://arxiv.org/abs/1809.10417v2 |
http://arxiv.org/pdf/1809.10417v2.pdf | |
PWC | https://paperswithcode.com/paper/deformable-object-tracking-with-gated-fusion |
Repo | |
Framework | |
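As a rough illustration of the gated fusion idea described above, here is a minimal PyTorch sketch in which a learned gate controls how much a second branch (standing in for the deformable convolution the paper actually uses) modifies the original appearance features; the channel sizes, gate design, and residual combination are assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Sketch of the gated fusion idea: a learned gate decides how much the
    deformable-convolution features modify the original appearance features.
    Channel sizes and the gate design are assumptions for illustration."""
    def __init__(self, channels=256):
        super().__init__()
        # Stand-in for the deformable branch; the paper uses deformable convolution.
        self.deform_branch = nn.Conv2d(channels, channels, 3, padding=1)
        self.gate = nn.Sequential(nn.Conv2d(2 * channels, channels, 1), nn.Sigmoid())

    def forward(self, feat):
        deform_feat = self.deform_branch(feat)
        g = self.gate(torch.cat([feat, deform_feat], dim=1))
        return feat + g * deform_feat  # gate controls the appearance update

x = torch.randn(1, 256, 31, 31)
print(GatedFusion()(x).shape)
```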
KTAN: Knowledge Transfer Adversarial Network
Title | KTAN: Knowledge Transfer Adversarial Network |
Authors | Peiye Liu, Wu Liu, Huadong Ma, Tao Mei, Mingoo Seok |
Abstract | To reduce the large computation and storage cost of a deep convolutional neural network, knowledge distillation based methods have pioneered the transfer of the generalization ability of a large (teacher) deep network to a light-weight (student) network. However, these methods mostly focus on transferring the probability distribution of the softmax layer in a teacher network and thus neglect the intermediate representations. In this paper, we propose a knowledge transfer adversarial network to better train a student network. Our technique holistically considers both intermediate representations and probability distributions of a teacher network. To transfer the knowledge of intermediate representations, we set high-level teacher feature maps as a target, toward which the student feature maps are trained. Specifically, we arrange a Teacher-to-Student layer to make our framework suitable for various student structures. The intermediate representation helps the student network better understand the transferred generalization ability, compared to the probability distribution alone. Furthermore, we infuse an adversarial learning process by employing a discriminator network, which can fully exploit the spatial correlation of feature maps in training a student network. The experimental results demonstrate that the proposed method can significantly improve the performance of a student network on both image classification and object detection tasks. |
Tasks | Image Classification, Object Detection, Transfer Learning |
Published | 2018-10-18 |
URL | http://arxiv.org/abs/1810.08126v1 |
http://arxiv.org/pdf/1810.08126v1.pdf | |
PWC | https://paperswithcode.com/paper/ktan-knowledge-transfer-adversarial-network |
Repo | |
Framework | |
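A hedged sketch of how the three signals named in the abstract (softened-logit distillation, intermediate feature matching via a Teacher-to-Student projection, and an adversarial term from a feature-map discriminator) might be combined into one training loss; the temperature, equal weighting, and tensor shapes are illustrative assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def ktan_style_losses(student_logits, teacher_logits,
                      student_feat, teacher_feat, disc_on_student, T=4.0):
    """Illustrative combination of the three signals the abstract mentions:
    softened-logit distillation, intermediate feature matching (after a
    Teacher-to-Student projection), and an adversarial term from a
    discriminator scoring student feature maps. Weights are assumptions."""
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * T * T
    feat = F.mse_loss(student_feat, teacher_feat)          # target: teacher feature maps
    adv = F.binary_cross_entropy(disc_on_student,
                                 torch.ones_like(disc_on_student))  # fool the discriminator
    return kd + feat + adv

s_logits, t_logits = torch.randn(8, 10), torch.randn(8, 10)
s_feat, t_feat = torch.randn(8, 64, 7, 7), torch.randn(8, 64, 7, 7)
d_out = torch.rand(8, 1)  # discriminator probabilities for student feature maps
print(ktan_style_losses(s_logits, t_logits, s_feat, t_feat, d_out))
```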
Endoscopic navigation in the absence of CT imaging
Title | Endoscopic navigation in the absence of CT imaging |
Authors | Ayushi Sinha, Xingtong Liu, Austin Reiter, Masaru Ishii, Gregory D. Hager, Russell H. Taylor |
Abstract | Clinical examinations that involve endoscopic exploration of the nasal cavity and sinuses often do not have a reference image to provide structural context to the clinician. In this paper, we present a system for navigation during clinical endoscopic exploration in the absence of computed tomography (CT) scans by making use of shape statistics from past CT scans. Using a deformable registration algorithm along with dense reconstructions from video, we show that we are able to achieve submillimeter registrations in in-vivo clinical data and are able to assign confidence to these registrations using confidence criteria established using simulated data. |
Tasks | Computed Tomography (CT) |
Published | 2018-06-08 |
URL | http://arxiv.org/abs/1806.03997v1 |
http://arxiv.org/pdf/1806.03997v1.pdf | |
PWC | https://paperswithcode.com/paper/endoscopic-navigation-in-the-absence-of-ct |
Repo | |
Framework | |
Dense Multimodal Fusion for Hierarchically Joint Representation
Title | Dense Multimodal Fusion for Hierarchically Joint Representation |
Authors | Di Hu, Feiping Nie, Xuelong Li |
Abstract | Multiple modalities can provide more valuable information than a single one by describing the same contents in various ways. Hence, it is highly desirable to learn an effective joint representation by fusing the features of different modalities. However, previous methods mainly focus on fusing the shallow features or high-level representations generated by unimodal deep networks, which only capture part of the hierarchical correlations across modalities. In this paper, we propose to densely integrate the representations by greedily stacking multiple shared layers between different modality-specific networks, which we name Dense Multimodal Fusion (DMF). The joint representations in different shared layers can capture the correlations at different levels, and the connection between shared layers also provides an efficient way to learn the dependence among hierarchical correlations. These two properties jointly contribute to the multiple learning paths in DMF, which results in faster convergence, lower training loss, and better performance. We evaluate our model on three typical multimodal learning tasks, including audiovisual speech recognition, cross-modal retrieval, and multimodal classification. The noticeable performance in the experiments demonstrates that our model can learn a more effective joint representation. |
Tasks | Cross-Modal Retrieval, Speech Recognition |
Published | 2018-10-08 |
URL | http://arxiv.org/abs/1810.03414v1 |
http://arxiv.org/pdf/1810.03414v1.pdf | |
PWC | https://paperswithcode.com/paper/dense-multimodal-fusion-for-hierarchically |
Repo | |
Framework | |
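The dense fusion idea above can be illustrated with a small sketch in which every level has modality-specific layers plus a shared layer that reads both modalities and the previous shared representation; the fully connected layers, dimensions, and number of levels are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class DenseMultimodalFusion(nn.Module):
    """Sketch of the densely fused architecture described above: each level has
    modality-specific layers plus a shared layer that reads both modalities and
    the previous shared representation. Dimensions are illustrative."""
    def __init__(self, d_a=128, d_b=128, d_shared=128, levels=3):
        super().__init__()
        self.a_layers = nn.ModuleList(nn.Linear(d_a, d_a) for _ in range(levels))
        self.b_layers = nn.ModuleList(nn.Linear(d_b, d_b) for _ in range(levels))
        self.shared = nn.ModuleList(
            nn.Linear(d_a + d_b + (d_shared if i > 0 else 0), d_shared)
            for i in range(levels))

    def forward(self, x_a, x_b):
        s = None
        for la, lb, ls in zip(self.a_layers, self.b_layers, self.shared):
            x_a, x_b = torch.relu(la(x_a)), torch.relu(lb(x_b))
            joint_in = torch.cat([x_a, x_b] + ([s] if s is not None else []), dim=-1)
            s = torch.relu(ls(joint_in))
        return s  # hierarchically joint representation

print(DenseMultimodalFusion()(torch.randn(4, 128), torch.randn(4, 128)).shape)
```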
Active image restoration
Title | Active image restoration |
Authors | Rongrong Xie, Shengfeng Deng, Weibing Deng, Armen E. Allahverdyan |
Abstract | We study active restoration of noise-corrupted images generated via the Gibbs probability of an Ising ferromagnet in an external magnetic field. Ferromagnetism accounts for the prior expectation of data smoothness, i.e. a positive correlation between neighbouring pixels (Ising spins), while the magnetic field refers to the bias. The restoration is actively supervised by requesting the true values of certain pixels after a noisy observation. This additional information improves restoration of other pixels. The optimal strategy of active inference is not known for realistic (two-dimensional) images. We determine this strategy for the mean-field version of the model and show that it amounts to supervising the values of spins (pixels) that do not agree with the sign of the average magnetization. The strategy leads to a transparent analytical expression for the minimal Bayesian risk, and shows that there is a maximal number of pixels beyond which supervision is useless. We show numerically that this strategy applies for two-dimensional images away from the critical regime. Within this regime the strategy is outperformed by its local (adaptive) version, which supervises pixels that do not agree with their Bayesian estimate. We show on transparent examples how active supervision can be essential in recovering noise-corrupted images and advocate for a wider usage of active methods in image restoration. |
Tasks | Image Restoration |
Published | 2018-09-22 |
URL | http://arxiv.org/abs/1809.08406v1 |
http://arxiv.org/pdf/1809.08406v1.pdf | |
PWC | https://paperswithcode.com/paper/active-image-restoration |
Repo | |
Framework | |
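The two supervision rules described in the abstract (the mean-field rule based on the sign of the average magnetization, and its local variant based on the Bayesian estimate) have a simple selection form; the sketch below shows one way to express them for ±1 pixels, with the tie-breaking and budget handling being assumptions.

```python
import numpy as np

def pixels_to_supervise(noisy_spins, bayes_estimate, budget, rule="local"):
    """Sketch of the two supervision rules described above, for +/-1 pixels.
    'global': pick pixels whose sign disagrees with the average magnetization.
    'local' : pick pixels that disagree with their Bayesian estimate.
    Tie-breaking and budget handling are assumptions."""
    if rule == "global":
        target = np.sign(noisy_spins.mean())
        disagree = np.flatnonzero(noisy_spins != target)
    else:
        disagree = np.flatnonzero(noisy_spins != np.sign(bayes_estimate))
    return disagree[:budget]  # indices whose true values would be requested

spins = np.random.choice([-1, 1], size=100)
print(pixels_to_supervise(spins, bayes_estimate=np.ones(100), budget=10))
```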
Finding Similar Medical Questions from Question Answering Websites
Title | Finding Similar Medical Questions from Question Answering Websites |
Authors | Yaliang Li, Liuyi Yao, Nan Du, Jing Gao, Qi Li, Chuishi Meng, Chenwei Zhang, Wei Fan |
Abstract | The past few years have witnessed the flourishing of crowdsourced medical question answering (Q&A) websites. Patients who have medical information demands tend to post questions about their health conditions on these crowdsourced Q&A websites and get answers from other users. However, we observe that a large portion of new medical questions cannot be answered in time or receive only a few answers from these websites. On the other hand, we notice that solved questions have great potential to address this challenge. Motivated by these observations, we propose an end-to-end system that can automatically find similar questions for unsolved medical questions. By learning the vector representations of unsolved questions and their candidate similar questions, the proposed system outputs similar questions according to the similarity between vector representations. Through the vector representation, similar questions are found at the question level, and the issue of diverse expression of medical questions can be addressed. Further, we handle two more important issues, i.e., the training data generation issue and the efficiency issue, associated with the LSTM training procedure and the retrieval of candidate similar questions. The effectiveness of the proposed system is validated on a large-scale real-world dataset collected from a crowdsourced maternal-infant Q&A website. |
Tasks | Question Answering |
Published | 2018-10-14 |
URL | http://arxiv.org/abs/1810.05983v1 |
http://arxiv.org/pdf/1810.05983v1.pdf | |
PWC | https://paperswithcode.com/paper/finding-similar-medical-questions-from |
Repo | |
Framework | |
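Once questions are encoded into vectors (the paper trains an LSTM-based encoder), the retrieval step reduces to ranking solved questions by representation similarity. The sketch below is a generic version of that step; cosine similarity and the cut-off k are assumptions, not details taken from the paper.

```python
import numpy as np

def top_k_similar(query_vec, candidate_vecs, k=5):
    """Sketch of the retrieval step described above: once questions are encoded
    into vectors, similar solved questions are ranked by similarity between
    representations. Cosine similarity and k are assumptions for illustration."""
    q = query_vec / np.linalg.norm(query_vec)
    c = candidate_vecs / np.linalg.norm(candidate_vecs, axis=1, keepdims=True)
    scores = c @ q
    return np.argsort(-scores)[:k], np.sort(scores)[::-1][:k]

vecs = np.random.randn(1000, 128)          # precomputed solved-question vectors
idx, scores = top_k_similar(np.random.randn(128), vecs)
print(idx, scores)
```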
Where’s YOUR focus: Personalized Attention
Title | Where’s YOUR focus: Personalized Attention |
Authors | Sikun Lin, Pan Hui |
Abstract | Human visual attention is subjective and biased according to the personal preference of the viewer; however, current works on saliency detection are general and objective, without accounting for the observer. This makes attention prediction for a particular person insufficiently accurate. In this work, we present the novel idea of personalized attention prediction and develop the Personalized Attention Network (PANet), a convolutional network that predicts saliency in images with personal preference. The model consists of two streams which share common feature extraction layers; one stream is responsible for saliency prediction, while the other is adapted from a detection model and used to fit user preference. We automatically collect user preferences from their albums and leave users free to define what and how many categories their preferences are divided into. To train PANet, we dynamically generate ground truth saliency maps upon existing detection labels and saliency labels, and the generation parameters are based upon our collected dataset of 1k images. We evaluate the model with saliency prediction metrics and test the trained model on different preference vectors. The results show that our system is much better than general models in personalized saliency prediction and is efficient to use for different preferences. |
Tasks | Saliency Detection, Saliency Prediction |
Published | 2018-02-22 |
URL | http://arxiv.org/abs/1802.07931v1 |
http://arxiv.org/pdf/1802.07931v1.pdf | |
PWC | https://paperswithcode.com/paper/wheres-your-focus-personalized-attention |
Repo | |
Framework | |
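A minimal sketch of the two-stream design outlined above: shared feature-extraction layers, a generic saliency stream, and a preference stream whose category maps are weighted by a per-user preference vector. The layer sizes, the number of preference categories, and the fusion rule are assumptions for illustration.

```python
import torch
import torch.nn as nn

class PANetSketch(nn.Module):
    """Sketch of the two-stream design described above: shared feature layers,
    a saliency-prediction stream, and a preference stream modulated by a
    per-user preference vector. Layer sizes and the fusion rule are assumptions."""
    def __init__(self, channels=64, n_prefs=8):
        super().__init__()
        self.shared = nn.Sequential(nn.Conv2d(3, channels, 3, padding=1), nn.ReLU())
        self.saliency_stream = nn.Conv2d(channels, 1, 1)
        self.pref_stream = nn.Conv2d(channels, n_prefs, 1)

    def forward(self, image, pref_vec):
        feat = self.shared(image)
        generic = self.saliency_stream(feat)
        # Weight preference-category maps by this user's preference vector.
        personal = (self.pref_stream(feat) * pref_vec[:, :, None, None]).sum(1, keepdim=True)
        return torch.sigmoid(generic + personal)

out = PANetSketch()(torch.randn(1, 3, 64, 64), torch.rand(1, 8))
print(out.shape)
```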
Webly Supervised Joint Embedding for Cross-Modal Image-Text Retrieval
Title | Webly Supervised Joint Embedding for Cross-Modal Image-Text Retrieval |
Authors | Niluthpol Chowdhury Mithun, Rameswar Panda, Evangelos E. Papalexakis, Amit K. Roy-Chowdhury |
Abstract | Cross-modal retrieval between visual data and natural language description remains a long-standing challenge in multimedia. While recent image-text retrieval methods offer great promise by learning deep representations aligned across modalities, most of these methods are plagued by the issue of training with small-scale datasets covering a limited number of images with ground-truth sentences. Moreover, it is extremely expensive to create a larger dataset by annotating millions of images with sentences, and doing so may lead to a biased model. Inspired by the recent success of webly supervised learning in deep neural networks, we capitalize on readily-available web images with noisy annotations to learn a robust image-text joint representation. Specifically, our main idea is to leverage web images and corresponding tags, along with fully annotated datasets, in training for learning the visual-semantic joint embedding. We propose a two-stage approach for the task that can augment a typical supervised pair-wise ranking loss based formulation with weakly-annotated web images to learn a more robust visual-semantic embedding. Experiments on two standard benchmark datasets demonstrate that our method achieves a significant performance gain in image-text retrieval compared to state-of-the-art approaches. |
Tasks | Cross-Modal Retrieval |
Published | 2018-08-23 |
URL | http://arxiv.org/abs/1808.07793v1 |
http://arxiv.org/pdf/1808.07793v1.pdf | |
PWC | https://paperswithcode.com/paper/webly-supervised-joint-embedding-for-cross |
Repo | |
Framework | |
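The formulation being augmented is a pairwise ranking loss over matched image and sentence embeddings. The sketch below shows a common hinge-based, bidirectional version of such a loss; the margin and the sum over negatives are assumptions, and in the two-stage approach the same form would also be applied to weakly annotated web image-tag pairs.

```python
import torch

def ranking_loss(img_emb, txt_emb, margin=0.2):
    """Sketch of a pairwise (triplet) ranking loss over a batch of matched
    image/text embeddings, the kind of objective the abstract builds on; the
    margin and sum-over-negatives choice are assumptions."""
    scores = img_emb @ txt_emb.t()                       # cosine scores if inputs are normalized
    pos = scores.diag().unsqueeze(1)
    cost_s = (margin + scores - pos).clamp(min=0)        # image as query, wrong sentences
    cost_im = (margin + scores - pos.t()).clamp(min=0)   # sentence as query, wrong images
    mask = torch.eye(scores.size(0), dtype=torch.bool)
    return cost_s.masked_fill(mask, 0).sum() + cost_im.masked_fill(mask, 0).sum()

print(ranking_loss(torch.nn.functional.normalize(torch.randn(8, 256), dim=1),
                   torch.nn.functional.normalize(torch.randn(8, 256), dim=1)))
```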
Learning Discriminative Hashing Codes for Cross-Modal Retrieval based on Multi-view Features
Title | Learning Discriminative Hashing Codes for Cross-Modal Retrieval based on Multi-view Features |
Authors | Jun Yu, Xiao-Jun Wu, Josef Kittler |
Abstract | Hashing techniques have been applied broadly in retrieval tasks due to their low storage requirements and high speed of processing. Many hashing methods based on a single view have been extensively studied for information retrieval. However, the representation capacity of a single view is insufficient and some discriminative information is not captured, which results in limited improvement. In this paper, we employ multiple views to represent images and texts for enriching the feature information. Our framework exploits the complementary information among multiple views to better learn the discriminative compact hash codes. A discrete hashing learning framework that jointly performs classifier learning and subspace learning is proposed to complete multiple search tasks simultaneously. Our framework includes two stages, namely a kernelization process and a quantization process. Kernelization aims to find a common subspace where multi-view features can be fused. The quantization stage is designed to learn discriminative unified hashing codes. Extensive experiments are performed on single-label datasets (WiKi and MMED) and multi-label datasets (MIRFlickr and NUS-WIDE) and the experimental results indicate the superiority of our method compared with the state-of-the-art methods. |
Tasks | Cross-Modal Retrieval, Information Retrieval, Quantization |
Published | 2018-08-13 |
URL | https://arxiv.org/abs/1808.04152v3 |
https://arxiv.org/pdf/1808.04152v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-discriminative-hashing-codes-for |
Repo | |
Framework | |
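A toy sketch of the two stages named above: fuse multi-view features in a common subspace, then quantize the fused representation to binary codes with the sign function. The linear fusion, equal view weights, and random projections are assumptions; the paper learns its projections jointly with a classifier and works with kernelized features.

```python
import numpy as np

def multiview_hash_codes(views, projections, weights=None):
    """Sketch of the two stages described above: fuse multi-view features in a
    common subspace, then quantize to binary codes with sign(). The linear
    fusion and equal view weights are assumptions for illustration."""
    weights = weights or [1.0 / len(views)] * len(views)
    fused = sum(w * (x @ p) for w, x, p in zip(weights, views, projections))
    return np.where(fused >= 0, 1, -1)  # discrete hash codes in {-1, +1}

rng = np.random.default_rng(0)
views = [rng.standard_normal((10, 64)), rng.standard_normal((10, 32))]
projs = [rng.standard_normal((64, 16)), rng.standard_normal((32, 16))]
print(multiview_hash_codes(views, projs).shape)
```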
Revisiting Cross Modal Retrieval
Title | Revisiting Cross Modal Retrieval |
Authors | Shah Nawaz, Muhammad Kamran Janjua, Alessandro Calefati, Ignazio Gallo |
Abstract | This paper proposes a cross-modal retrieval system that leverages image and text encoding. Most multimodal architectures employ separate networks for each modality to capture the semantic relationship between them. However, in our work image-text encoding can achieve comparable results in terms of cross-modal retrieval without having to use a separate network for each modality. We show that text encodings can capture semantic relationships between multiple modalities. To our knowledge, this work is the first of its kind in terms of employing a single network and a fused image-text embedding for cross-modal retrieval. We evaluate our approach on two well-known multimodal datasets: MS-COCO and Flickr30K. |
Tasks | Cross-Modal Retrieval |
Published | 2018-07-19 |
URL | http://arxiv.org/abs/1807.07364v1 |
http://arxiv.org/pdf/1807.07364v1.pdf | |
PWC | https://paperswithcode.com/paper/revisiting-cross-modal-retrieval |
Repo | |
Framework | |
Bayesian Deep Learning for Exoplanet Atmospheric Retrieval
Title | Bayesian Deep Learning for Exoplanet Atmospheric Retrieval |
Authors | Frank Soboczenski, Michael D. Himes, Molly D. O’Beirne, Simone Zorzan, Atilim Gunes Baydin, Adam D. Cobb, Yarin Gal, Daniel Angerhausen, Massimo Mascaro, Giada N. Arney, Shawn D. Domagal-Goldman |
Abstract | Over the past decade, the study of extrasolar planets has evolved rapidly from plain detection and identification to comprehensive categorization and characterization of exoplanet systems and their atmospheres. Atmospheric retrieval, the inverse modeling technique used to determine an exoplanetary atmosphere’s temperature structure and composition from an observed spectrum, is both time-consuming and compute-intensive, requiring complex algorithms that compare thousands to millions of atmospheric models to the observational data to find the most probable values and associated uncertainties for each model parameter. For rocky, terrestrial planets, the retrieved atmospheric composition can give insight into the surface fluxes of gaseous species necessary to maintain the stability of that atmosphere, which may in turn provide insight into the geological and/or biological processes active on the planet. These atmospheres contain many molecules, some of them biosignatures, spectral fingerprints indicative of biological activity, which will become observable with the next generation of telescopes. Runtimes of traditional retrieval models scale with the number of model parameters, so as more molecular species are considered, runtimes can become prohibitively long. Recent advances in machine learning (ML) and computer vision offer new ways to reduce the time to perform a retrieval by orders of magnitude, given a sufficient data set to train with. Here we present an ML-based retrieval framework called Intelligent exoplaNet Atmospheric RetrievAl (INARA) that consists of a Bayesian deep learning model for retrieval and a data set of 3,000,000 synthetic rocky exoplanetary spectra generated using the NASA Planetary Spectrum Generator. Our work represents the first ML retrieval model for rocky, terrestrial exoplanets and the first synthetic data set of terrestrial spectra generated at this scale. |
Tasks | |
Published | 2018-11-08 |
URL | http://arxiv.org/abs/1811.03390v2 |
http://arxiv.org/pdf/1811.03390v2.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-deep-learning-for-exoplanet |
Repo | |
Framework | |
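The abstract specifies a Bayesian deep learning model for retrieval but does not spell out its form here; as a stand-in, the sketch below uses Monte Carlo dropout, one common way to obtain predictive means and uncertainties from a deep retrieval network, with the toy network and sample count being assumptions.

```python
import torch
import torch.nn as nn

def mc_dropout_retrieval(model, spectrum, n_samples=100):
    """Illustrative uncertainty estimate for a retrieval network. The abstract
    specifies a Bayesian deep learning model but not its exact form; Monte
    Carlo dropout is used here purely as a stand-in: keep dropout active at
    test time and summarize repeated predictions."""
    model.train()  # keep dropout layers stochastic
    with torch.no_grad():
        draws = torch.stack([model(spectrum) for _ in range(n_samples)])
    return draws.mean(0), draws.std(0)  # per-parameter mean and uncertainty

# Hypothetical toy network: 500 spectral bins in, 12 atmospheric parameters out.
model = nn.Sequential(nn.Linear(500, 128), nn.ReLU(), nn.Dropout(0.2), nn.Linear(128, 12))
mean, std = mc_dropout_retrieval(model, torch.randn(1, 500))
print(mean.shape, std.shape)
```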
Weighted Linear Discriminant Analysis based on Class Saliency Information
Title | Weighted Linear Discriminant Analysis based on Class Saliency Information |
Authors | Lei Xu, Alexandros Iosifidis, Moncef Gabbouj |
Abstract | In this paper, we propose a new variant of Linear Discriminant Analysis (LDA) to overcome underlying drawbacks of traditional LDA and other LDA variants targeting problems involving imbalanced classes. Traditional LDA makes assumptions of Gaussian class distributions and neglects the influence of outlier classes, which might hurt performance. We exploit intuitions coming from a probabilistic interpretation of visual saliency estimation in order to define the saliency of a class in a multi-class setting. Such information is then used to redefine the between-class and within-class scatters in a more robust manner. Compared to traditional LDA and other weight-based LDA variants, the proposed method has shown certain improvements on facial image classification problems in publicly available datasets. |
Tasks | Image Classification, Saliency Prediction |
Published | 2018-02-19 |
URL | http://arxiv.org/abs/1802.06547v1 |
http://arxiv.org/pdf/1802.06547v1.pdf | |
PWC | https://paperswithcode.com/paper/weighted-linear-discriminant-analysis-based |
Repo | |
Framework | |
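The key step is to reweight the between-class and within-class scatter matrices using per-class saliency information. The sketch below computes weighted scatters given such weights; how the weights are derived from visual-saliency intuitions is the paper's contribution, and the simple weighting scheme shown here is an illustrative assumption.

```python
import numpy as np

def weighted_lda_scatters(X, y, class_weights):
    """Sketch of weighted between-class and within-class scatter matrices,
    where per-class weights (taken as given here) play the role of the class
    saliency information described above. The weighting scheme is an
    illustrative assumption, not the paper's exact definition."""
    classes = np.unique(y)
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sb, Sw = np.zeros((d, d)), np.zeros((d, d))
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        w = class_weights[c]
        Sb += w * len(Xc) * np.outer(mc - mean, mc - mean)   # weighted between-class scatter
        Sw += w * (Xc - mc).T @ (Xc - mc)                    # weighted within-class scatter
    return Sb, Sw

X, y = np.random.randn(60, 5), np.repeat([0, 1, 2], 20)
Sb, Sw = weighted_lda_scatters(X, y, {0: 1.0, 1: 0.5, 2: 0.8})
print(Sb.shape, Sw.shape)
```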
DAPPER: Scaling Dynamic Author Persona Topic Model to Billion Word Corpora
Title | DAPPER: Scaling Dynamic Author Persona Topic Model to Billion Word Corpora |
Authors | Robert Giaquinto, Arindam Banerjee |
Abstract | Extracting common narratives from multi-author dynamic text corpora requires complex models, such as the Dynamic Author Persona (DAP) topic model. However, such models can struggle to scale to large corpora, often because of challenging non-conjugate terms. To overcome such challenges, in this paper we adapt new ideas in approximate inference to the DAP model, resulting in the DAP Performed Exceedingly Rapidly (DAPPER) topic model. Specifically, we develop Conjugate-Computation Variational Inference (CVI) based variational Expectation-Maximization (EM) for learning the model, yielding fast, closed-form updates for each document, replacing the iterative optimization used in earlier work. Our results show significant improvements in model fit and training time without needing to compromise the model’s temporal structure or the application of Regularized Variational Inference (RVI). We demonstrate the scalability and effectiveness of the DAPPER model by extracting health journeys from the CaringBridge corpus — a collection of 9 million journals written by 200,000 authors during health crises. |
Tasks | |
Published | 2018-11-03 |
URL | http://arxiv.org/abs/1811.01931v1 |
http://arxiv.org/pdf/1811.01931v1.pdf | |
PWC | https://paperswithcode.com/paper/dapper-scaling-dynamic-author-persona-topic |
Repo | |
Framework | |
Solving Pictorial Jigsaw Puzzle by Stigmergy-inspired Internet-based Human Collective Intelligence
Title | Solving Pictorial Jigsaw Puzzle by Stigmergy-inspired Internet-based Human Collective Intelligence |
Authors | Bo Shen, Wei Zhang, Haiyan Zhao, Zhi Jin, Yanhong Wu |
Abstract | The pictorial jigsaw (PJ) puzzle is a well-known leisure game for humans. Usually, a PJ puzzle game is played by one or several human players face-to-face in the physical space. In this paper, we focus on how to solve PJ puzzles in cyberspace by a group of physically distributed human players. We propose an approach to solving the PJ puzzle by stigmergy-inspired Internet-based human collective intelligence. The core of the approach is a continuously executing loop, named the EIF loop, which consists of three activities: exploration, integration, and feedback. In exploration, each player tries to solve the PJ puzzle alone, without direct interactions with other players. At any time, the result of a player’s exploration is a partial solution to the PJ puzzle and a set of rejected neighboring relations between pieces. The results of all players’ exploration are integrated in real time through integration, with the output of a continuously updated collective opinion graph (COG). And through feedback, each player is provided with personalized feedback information based on the current COG and the player’s exploration result, in order to accelerate his/her puzzle-solving process. Exploratory experiments show that: (1) supported by this approach, the time to solve a PJ puzzle is nearly linear in the reciprocal of the number of players, and shows better scalability to puzzle size than that of face-to-face collaboration for 10-player groups; (2) for groups with 2 to 10 players, the puzzle-solving time decreases by 31.36%-64.57% on average, compared with the best single players in the experiments. |
Tasks | |
Published | 2018-11-28 |
URL | http://arxiv.org/abs/1812.02559v2 |
http://arxiv.org/pdf/1812.02559v2.pdf | |
PWC | https://paperswithcode.com/paper/solving-pictorial-jigsaw-puzzle-by-stigmergy |
Repo | |
Framework | |