Paper Group ANR 284
External Prior Guided Internal Prior Learning for Real-World Noisy Image Denoising. A Cascaded Convolutional Neural Network for X-ray Low-dose CT Image Denoising. Towards the Success Rate of One: Real-time Unconstrained Salient Object Detection. Personalized Age Progression with Bi-level Aging Dictionary Learning. Phone-aware Neural Language Identification …
External Prior Guided Internal Prior Learning for Real-World Noisy Image Denoising
Title | External Prior Guided Internal Prior Learning for Real-World Noisy Image Denoising |
Authors | Jun Xu, Lei Zhang, David Zhang |
Abstract | Most existing image denoising methods learn image priors from either external data or the noisy image itself to remove noise. However, priors learned from external data may not be adaptive to the image to be denoised, while priors learned from the given noisy image may not be accurate due to interference from the corrupting noise. Meanwhile, the noise in real-world noisy images is very complex and hard to describe with simple distributions such as a Gaussian, making real-world noisy image denoising a very challenging problem. We propose to exploit the information in both external data and the given noisy image, and develop an external prior guided internal prior learning method for real-world noisy image denoising. We first learn external priors from an independent set of clean natural images. With the aid of the learned external priors, we then learn internal priors from the given noisy image to refine the prior model. The external and internal priors are formulated as a set of orthogonal dictionaries to efficiently reconstruct the desired image. Extensive experiments are performed on several real-world noisy image datasets. The proposed method demonstrates highly competitive denoising performance, outperforming state-of-the-art denoising methods including those designed for real-world noisy images. |
Tasks | Denoising, Image Denoising |
Published | 2017-05-12 |
URL | http://arxiv.org/abs/1705.04505v2 |
PDF | http://arxiv.org/pdf/1705.04505v2.pdf |
PWC | https://paperswithcode.com/paper/external-prior-guided-internal-prior-learning |
Repo | |
Framework | |
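To make the orthogonal-dictionary formulation above concrete, here is a minimal numpy sketch (not the authors' algorithm): the external prior is approximated by a PCA basis learned from clean patches, and denoising reduces to hard-thresholding patch coefficients in that orthogonal basis.

```python
import numpy as np

def learn_orthogonal_dictionary(clean_patches):
    """Learn an orthonormal patch basis (a stand-in 'external prior') via SVD."""
    centered = clean_patches - clean_patches.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt.T  # columns are orthonormal atoms

def denoise_patches(noisy_patches, dictionary, threshold):
    """Project onto the orthogonal dictionary and shrink small coefficients."""
    coeffs = noisy_patches @ dictionary          # orthogonality makes coding a projection
    coeffs[np.abs(coeffs) < threshold] = 0.0     # hard thresholding
    return coeffs @ dictionary.T                 # reconstruct the patches

rng = np.random.default_rng(0)
clean = rng.standard_normal((1000, 64))          # stand-in for vectorized 8x8 clean patches
noisy = clean[:10] + 0.3 * rng.standard_normal((10, 64))
D = learn_orthogonal_dictionary(clean)
print(denoise_patches(noisy, D, threshold=0.5).shape)
```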
A Cascaded Convolutional Neural Network for X-ray Low-dose CT Image Denoising
Title | A Cascaded Convolutional Neural Network for X-ray Low-dose CT Image Denoising |
Authors | Dufan Wu, Kyungsang Kim, Georges El Fakhri, Quanzheng Li |
Abstract | Image denoising techniques are essential to reducing noise levels and enhancing diagnosis reliability in low-dose computed tomography (CT). Machine learning based denoising methods have shown great potential in removing the complex and spatially variant noise in CT images. However, residual artifacts can remain in the denoised image due to the complexity of the noise. A cascaded training network was proposed in this work, where the trained CNN is applied to the training dataset to initiate new training stages and remove the artifacts induced by denoising. A cascade of convolutional neural networks (CNNs) is thus built iteratively to achieve better performance with simple CNN structures. Experiments were carried out on the 2016 Low-dose CT Grand Challenge datasets to evaluate the method's performance. |
Tasks | Computed Tomography (CT), Denoising, Image Denoising |
Published | 2017-05-11 |
URL | http://arxiv.org/abs/1705.04267v2 |
PDF | http://arxiv.org/pdf/1705.04267v2.pdf |
PWC | https://paperswithcode.com/paper/a-cascaded-convolutional-neural-network-for-x |
Repo | |
Framework | |
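A minimal PyTorch sketch of the cascade idea for the paper above; the depth, channel width, and residual formulation are assumptions rather than the paper's exact configuration. Each stage is a small CNN, and stage k would be trained on the outputs of stage k-1.

```python
import torch
import torch.nn as nn

class SimpleDenoiser(nn.Module):
    def __init__(self, channels=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 3, padding=1),
        )

    def forward(self, x):
        return x - self.net(x)  # residual formulation: predict the noise, subtract it

def run_cascade(stages, low_dose_ct):
    """Apply the trained stages in sequence; each stage cleans residual artifacts."""
    x = low_dose_ct
    for stage in stages:
        x = stage(x)
    return x

cascade = [SimpleDenoiser() for _ in range(3)]
out = run_cascade(cascade, torch.randn(1, 1, 64, 64))
print(out.shape)
```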
Towards the Success Rate of One: Real-time Unconstrained Salient Object Detection
Title | Towards the Success Rate of One: Real-time Unconstrained Salient Object Detection |
Authors | Mahyar Najibi, Fan Yang, Qiaosong Wang, Robinson Piramuthu |
Abstract | In this work, we propose an efficient and effective approach for unconstrained salient object detection in images using deep convolutional neural networks. Instead of generating thousands of candidate bounding boxes and refining them, our network directly learns to generate the saliency map containing the exact number of salient objects. During training, we convert the ground-truth rectangular boxes into Gaussian distributions that better capture the region of interest of each salient object. During inference, the network predicts Gaussian distributions centered at salient objects with an appropriate covariance, from which bounding boxes are easily inferred. Notably, our network performs saliency map prediction without pixel-level annotations, salient object detection without object proposals, and salient object subitizing simultaneously, all in a single pass within a unified framework. Extensive experiments show that our approach outperforms existing methods on various datasets by a large margin, and achieves more than 100 fps with a VGG16 network on a single GPU during inference. |
Tasks | Object Detection, Salient Object Detection |
Published | 2017-07-31 |
URL | http://arxiv.org/abs/1708.00079v2 |
PDF | http://arxiv.org/pdf/1708.00079v2.pdf |
PWC | https://paperswithcode.com/paper/towards-the-success-rate-of-one-real-time |
Repo | |
Framework | |
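The box-to-Gaussian conversion used for training and the reverse step used at inference can be sketched as below; the scale factors relating box size to standard deviation are assumptions made for illustration, not the paper's exact parameters.

```python
import numpy as np

def box_to_gaussian(h, w, box):
    """box = (x1, y1, x2, y2); returns an h x w map with a Gaussian over the box."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    sx, sy = (x2 - x1) / 4.0, (y2 - y1) / 4.0   # assumed: box spans roughly 2 sigma per side
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-0.5 * (((xs - cx) / sx) ** 2 + ((ys - cy) / sy) ** 2))

def gaussian_to_box(heatmap):
    """Recover a box from the mean and spread of a predicted Gaussian blob."""
    ys, xs = np.mgrid[0:heatmap.shape[0], 0:heatmap.shape[1]]
    wsum = heatmap.sum()
    cx, cy = (xs * heatmap).sum() / wsum, (ys * heatmap).sum() / wsum
    sx = np.sqrt(((xs - cx) ** 2 * heatmap).sum() / wsum)
    sy = np.sqrt(((ys - cy) ** 2 * heatmap).sum() / wsum)
    return (cx - 2 * sx, cy - 2 * sy, cx + 2 * sx, cy + 2 * sy)

heat = box_to_gaussian(100, 100, (20, 30, 60, 80))
print(gaussian_to_box(heat))
```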
Personalized Age Progression with Bi-level Aging Dictionary Learning
Title | Personalized Age Progression with Bi-level Aging Dictionary Learning |
Authors | Xiangbo Shu, Jinhui Tang, Zechao Li, Hanjiang Lai, Liyan Zhang, Shuicheng Yan |
Abstract | Age progression is defined as aesthetically re-rendering an individual's face at any future age. In this work, we aim to automatically render aging faces in a personalized way. Basically, for each age group, we learn an aging dictionary to reveal its aging characteristics (e.g., wrinkles), where the dictionary bases corresponding to the same index yet from two neighboring aging dictionaries form a particular aging pattern across these two age groups, and a linear combination of all these patterns expresses a particular personalized aging process. Moreover, two factors are taken into consideration in the dictionary learning process. First, beyond the aging dictionaries, each person may have extra personalized facial characteristics, e.g., moles, which are invariant in the aging process. Second, it is challenging or even impossible to collect faces of all age groups for a particular person, yet much easier and more practical to get face pairs from neighboring age groups. To this end, we propose a novel Bi-level Dictionary Learning based Personalized Age Progression (BDL-PAP) method. Here, bi-level dictionary learning is formulated to learn the aging dictionaries based on face pairs from neighboring age groups. Extensive experiments demonstrate the advantages of the proposed BDL-PAP over other state-of-the-art methods in terms of personalized age progression, as well as the performance gain for cross-age face verification by synthesizing aging faces. |
Tasks | Dictionary Learning, Face Verification |
Published | 2017-06-04 |
URL | http://arxiv.org/abs/1706.01039v1 |
PDF | http://arxiv.org/pdf/1706.01039v1.pdf |
PWC | https://paperswithcode.com/paper/personalized-age-progression-with-bi-level |
Repo | |
Framework | |
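A toy sketch of the synthesis step implied by the abstract above (not the full BDL-PAP optimization): a face is sparsely coded over its current age group's dictionary, the unexplained residual is kept as the personalized detail, and the same coefficients are reused with the neighboring group's dictionary so that corresponding atoms act as a cross-group aging pattern.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
dim, n_atoms = 128, 32
D_young = rng.standard_normal((dim, n_atoms))   # stand-in aging dictionary, group g
D_old = rng.standard_normal((dim, n_atoms))     # stand-in aging dictionary, group g+1
face_young = rng.standard_normal(dim)           # stand-in face feature vector

# Sparse code over the current age group's dictionary.
coder = Lasso(alpha=0.1, max_iter=5000)
coder.fit(D_young, face_young)
alpha = coder.coef_

# Personalized details = the part of the face the aging dictionary cannot explain.
personal = face_young - D_young @ alpha

# Age progression: same coefficients, neighboring dictionary, personal details kept.
face_old = D_old @ alpha + personal
print(face_old.shape)
```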
Phone-aware Neural Language Identification
Title | Phone-aware Neural Language Identification |
Authors | Zhiyuan Tang, Dong Wang, Yixiang Chen, Ying Shi, Lantian Li |
Abstract | Pure acoustic neural models, particularly the LSTM-RNN model, have shown great potential in language identification (LID). However, phonetic information has been largely overlooked by most existing neural LID models, although this information has been used with great success in conventional phonetic LID systems. We present a phone-aware neural LID architecture, which is a deep LSTM-RNN LID system that accepts output from an RNN-based ASR system. By utilizing this phonetic knowledge, the LID performance can be significantly improved. Interestingly, even if the test language is not involved in the ASR training, the phonetic knowledge still makes a large contribution. Our experiments, conducted on four languages within the Babel corpus, demonstrate that the phone-aware approach is highly effective. |
Tasks | Language Identification |
Published | 2017-05-09 |
URL | http://arxiv.org/abs/1705.03152v2 |
PDF | http://arxiv.org/pdf/1705.03152v2.pdf |
PWC | https://paperswithcode.com/paper/phone-aware-neural-language-identification |
Repo | |
Framework | |
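A minimal PyTorch sketch of the phone-aware idea above; fusing the acoustic features with frame-level phonetic output from the ASR model by concatenation, and the specific feature dimensions, are assumptions for illustration.

```python
import torch
import torch.nn as nn

class PhoneAwareLID(nn.Module):
    def __init__(self, acoustic_dim=40, phonetic_dim=100, hidden=256, n_langs=4):
        super().__init__()
        self.lstm = nn.LSTM(acoustic_dim + phonetic_dim, hidden,
                            num_layers=2, batch_first=True)
        self.classifier = nn.Linear(hidden, n_langs)

    def forward(self, acoustic, phonetic):
        x = torch.cat([acoustic, phonetic], dim=-1)   # fuse the two feature streams per frame
        out, _ = self.lstm(x)
        return self.classifier(out[:, -1])            # language logits from the last frame

model = PhoneAwareLID()
logits = model(torch.randn(8, 200, 40), torch.randn(8, 200, 100))
print(logits.shape)  # (8, 4)
```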
An Adaptive Strategy for Active Learning with Smooth Decision Boundary
Title | An Adaptive Strategy for Active Learning with Smooth Decision Boundary |
Authors | Andrea Locatelli, Alexandra Carpentier, Samory Kpotufe |
Abstract | We present the first adaptive strategy for active learning in the setting of classification with a smooth decision boundary. The problem of adaptivity (to unknown distributional parameters) has remained open since the seminal work of Castro and Nowak (2007), which first established (active learning) rates for this setting. While some recent advances on this problem establish adaptive rates in the case of univariate data, adaptivity in the more practical setting of multivariate data has so far remained elusive. Combining insights from various recent works, we show that, for the multivariate case, a careful reduction to univariate-adaptive strategies yields near-optimal rates without prior knowledge of distributional parameters. |
Tasks | Active Learning |
Published | 2017-11-25 |
URL | http://arxiv.org/abs/1711.09294v1 |
PDF | http://arxiv.org/pdf/1711.09294v1.pdf |
PWC | https://paperswithcode.com/paper/an-adaptive-strategy-for-active-learning-with |
Repo | |
Framework | |
How compatible are our discourse annotations? Insights from mapping RST-DT and PDTB annotations
Title | How compatible are our discourse annotations? Insights from mapping RST-DT and PDTB annotations |
Authors | Vera Demberg, Fatemeh Torabi Asr, Merel Scholman |
Abstract | Discourse-annotated corpora are an important resource for the community, but they are often annotated according to different frameworks. This makes comparison of the annotations difficult, thereby also preventing researchers from searching the corpora in a unified way, or using all annotated data jointly to train computational systems. Several theoretical proposals have recently been made for mapping the relational labels of different frameworks to each other, but these proposals have so far not been validated against existing annotations. The two largest discourse relation annotated resources, the Penn Discourse Treebank and the Rhetorical Structure Theory Discourse Treebank, have however been annotated on the same text, allowing for a direct comparison of the annotation layers. We propose a method for automatically aligning the discourse segments, and then evaluate existing mapping proposals by comparing the empirically observed against the proposed mappings. Our analysis highlights the influence of segmentation on subsequent discourse relation labeling, and shows that while agreement between frameworks is reasonable for explicit relations, agreement on implicit relations is low. We identify several sources of systematic discrepancies between the two annotation schemes and discuss consequences of these discrepancies for future annotation and for the training of automatic discourse relation labellers. |
Tasks | |
Published | 2017-04-28 |
URL | http://arxiv.org/abs/1704.08893v2 |
PDF | http://arxiv.org/pdf/1704.08893v2.pdf |
PWC | https://paperswithcode.com/paper/how-compatible-are-our-discourse-annotations |
Repo | |
Framework | |
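Once segments are aligned across the two frameworks, the empirically observed mapping is essentially a contingency table of co-occurring relation labels, which can then be compared against a proposed mapping. A toy pandas sketch (with invented labels, not the real corpora) is shown below.

```python
import pandas as pd

# Toy aligned segments: each row is one aligned segment pair with its label in each framework.
aligned = pd.DataFrame({
    "rst_label":  ["Cause", "Cause", "Contrast", "Elaboration", "Contrast"],
    "pdtb_label": ["Contingency", "Contingency", "Comparison", "Expansion", "Expansion"],
})

# Empirically observed mapping: a contingency table of co-occurring labels.
observed = pd.crosstab(aligned["rst_label"], aligned["pdtb_label"])
print(observed)

# Agreement with a (hypothetical) proposed mapping between the two label sets.
proposed = {"Cause": "Contingency", "Contrast": "Comparison", "Elaboration": "Expansion"}
agreement = (aligned["pdtb_label"] == aligned["rst_label"].map(proposed)).mean()
print(f"agreement with proposed mapping: {agreement:.2f}")
```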
Toward Open-Set Face Recognition
Title | Toward Open-Set Face Recognition |
Authors | Manuel Günther, Steve Cruz, Ethan M. Rudd, Terrance E. Boult |
Abstract | Much research has been conducted on both face identification and face verification, with greater focus on the latter. Research on face identification has mostly focused on using closed-set protocols, which assume that all probe images used in evaluation contain identities of subjects that are enrolled in the gallery. Real systems, however, where only a fraction of probe sample identities are enrolled in the gallery, cannot make this closed-set assumption. Instead, they must assume an open set of probe samples and be able to reject/ignore those that correspond to unknown identities. In this paper, we address the widespread misconception that thresholding verification-like scores is a good way to solve the open-set face identification problem, by formulating an open-set face identification protocol and evaluating different strategies for assessing similarity. Our open-set identification protocol is based on the canonical Labeled Faces in the Wild (LFW) dataset. In addition to the known identities, we introduce the concepts of known unknowns (known, but uninteresting persons) and unknown unknowns (people never seen before) to the biometric community. We compare three algorithms for assessing similarity in a deep feature space under an open-set protocol: thresholded verification-like scores, linear discriminant analysis (LDA) scores, and extreme value machine (EVM) probabilities. Our findings suggest that thresholding EVM probabilities, which are open-set by design, outperforms thresholding verification-like scores. |
Tasks | Face Identification, Face Recognition, Face Verification |
Published | 2017-05-03 |
URL | http://arxiv.org/abs/1705.01567v2 |
PDF | http://arxiv.org/pdf/1705.01567v2.pdf |
PWC | https://paperswithcode.com/paper/toward-open-set-face-recognition |
Repo | |
Framework | |
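The baseline the paper argues against, thresholding verification-like scores for open-set identification, can be sketched as below using cosine similarity in an assumed deep feature space; the EVM alternative is not reproduced here.

```python
import numpy as np

def identify_open_set(probe, gallery_feats, gallery_ids, threshold=0.5):
    """Return the best-matching gallery identity, or None for an unknown probe."""
    probe = probe / np.linalg.norm(probe)
    gal = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    scores = gal @ probe                      # cosine similarities to all enrolled identities
    best = int(np.argmax(scores))
    return gallery_ids[best] if scores[best] >= threshold else None  # reject below threshold

rng = np.random.default_rng(0)
gallery = rng.standard_normal((5, 512))       # stand-in deep features for 5 enrolled subjects
probe = gallery[2] + 0.1 * rng.standard_normal(512)
print(identify_open_set(probe, gallery, ["id%d" % i for i in range(5)]))
```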
Robust Saliency Detection via Fusing Foreground and Background Priors
Title | Robust Saliency Detection via Fusing Foreground and Background Priors |
Authors | Kan Huang, Chunbiao Zhu, Ge Li |
Abstract | Automatic salient object detection has received tremendous attention from the research community and has become an increasingly important tool in many computer vision tasks. This paper proposes a novel bottom-up salient object detection framework that considers both foreground and background cues. First, a series of background and foreground seeds is reliably selected from an image and then used to compute separate saliency maps. Next, the foreground and background saliency maps are combined. Finally, a refinement step based on geodesic distance is applied to enhance salient regions and derive the final saliency map. In particular, we provide a robust seed selection scheme that contributes substantially to the accuracy of saliency detection. Extensive experimental evaluations demonstrate the effectiveness of the proposed method compared with other leading methods. |
Tasks | Object Detection, Saliency Detection, Salient Object Detection |
Published | 2017-11-01 |
URL | http://arxiv.org/abs/1711.00322v1 |
PDF | http://arxiv.org/pdf/1711.00322v1.pdf |
PWC | https://paperswithcode.com/paper/robust-saliency-detection-via-fusing |
Repo | |
Framework | |
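A toy sketch of the fusion step only (seed selection and the geodesic-distance refinement are omitted); both inputs are assumed to already be saliency maps in [0, 1], one computed from foreground seeds and one from background seeds, and the weighted combination is an assumption rather than the paper's exact formula.

```python
import numpy as np

def fuse_saliency(fg_map, bg_map, alpha=0.5):
    """Weighted combination of the two saliency maps, renormalized to [0, 1]."""
    fused = alpha * fg_map + (1.0 - alpha) * bg_map
    fused -= fused.min()
    return fused / (fused.max() + 1e-8)

rng = np.random.default_rng(0)
fg, bg = rng.random((64, 64)), rng.random((64, 64))   # stand-in foreground/background maps
print(fuse_saliency(fg, bg).shape)
```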
Self-Guiding Multimodal LSTM - when we do not have a perfect training dataset for image captioning
Title | Self-Guiding Multimodal LSTM - when we do not have a perfect training dataset for image captioning |
Authors | Yang Xian, Yingli Tian |
Abstract | In this paper, a self-guiding multimodal LSTM (sg-LSTM) image captioning model is proposed to handle an uncontrolled, imbalanced real-world image-sentence dataset. We collect the FlickrNYC dataset of 306,165 images from Flickr as our testbed, and the original text descriptions uploaded by the users are utilized as the ground truth for training. Descriptions in the FlickrNYC dataset vary dramatically, ranging from short term descriptions to long paragraph descriptions, and can describe any visual aspect or even refer to objects that are not depicted. To deal with this imbalanced and noisy situation and to fully exploit the dataset itself, we propose a novel guiding textual feature extracted with a multimodal LSTM (m-LSTM) model. Training of the m-LSTM is based on the portion of data in which the image content and the corresponding descriptions are strongly bonded. Afterwards, during the training of sg-LSTM on the remaining training data, this guiding information serves as additional input to the network along with the image representations and the ground-truth descriptions. By integrating these input components into a multimodal block, we aim to form a training scheme in which the textual information is tightly coupled with the image content. The experimental results demonstrate that the proposed sg-LSTM model outperforms the traditional state-of-the-art multimodal RNN captioning framework in successfully describing the key components of the input images. |
Tasks | Image Captioning |
Published | 2017-09-15 |
URL | http://arxiv.org/abs/1709.05038v1 |
PDF | http://arxiv.org/pdf/1709.05038v1.pdf |
PWC | https://paperswithcode.com/paper/self-guiding-multimodal-lstm-when-we-do-not |
Repo | |
Framework | |
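A condensed PyTorch sketch of the multimodal block described above; the feature dimensions and fusion by per-step concatenation are assumptions, not the paper's exact design. The image feature and the guiding textual feature from the pre-trained m-LSTM are fed to the captioning LSTM alongside the word embeddings at every step.

```python
import torch
import torch.nn as nn

class SGLSTMSketch(nn.Module):
    def __init__(self, vocab=10000, emb=256, img_dim=2048, guide_dim=256, hidden=512):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb + img_dim + guide_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, words, img_feat, guide_feat):
        w = self.embed(words)                                   # (B, T, emb)
        ctx = torch.cat([img_feat, guide_feat], dim=-1)         # (B, img_dim + guide_dim)
        ctx = ctx.unsqueeze(1).expand(-1, w.size(1), -1)        # repeat context per time step
        h, _ = self.lstm(torch.cat([w, ctx], dim=-1))
        return self.out(h)                                      # next-word logits

model = SGLSTMSketch()
logits = model(torch.randint(0, 10000, (4, 12)), torch.randn(4, 2048), torch.randn(4, 256))
print(logits.shape)  # (4, 12, 10000)
```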
Artificial Error Generation with Machine Translation and Syntactic Patterns
Title | Artificial Error Generation with Machine Translation and Syntactic Patterns |
Authors | Marek Rei, Mariano Felice, Zheng Yuan, Ted Briscoe |
Abstract | Shortage of available training data is holding back progress in the area of automated error detection. This paper investigates two alternative methods for artificially generating writing errors in order to create additional resources. We propose treating error generation as a machine translation task, where grammatically correct text is translated to contain errors. In addition, we explore a system for extracting textual patterns from an annotated corpus, which can then be used to insert errors into grammatically correct sentences. Our experiments show that the inclusion of artificially generated errors significantly improves error detection accuracy on both the FCE and CoNLL 2014 datasets. |
Tasks | Grammatical Error Detection, Machine Translation |
Published | 2017-07-17 |
URL | http://arxiv.org/abs/1707.05236v1 |
PDF | http://arxiv.org/pdf/1707.05236v1.pdf |
PWC | https://paperswithcode.com/paper/artificial-error-generation-with-machine |
Repo | |
Framework | |
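The pattern-based route can be sketched as below; the rewrite patterns are invented examples rather than those extracted from the annotated corpus, and each one rewrites a grammatically correct fragment into an erroneous one.

```python
import random
import re

# Hypothetical error patterns of the kind that could be extracted from learner data.
ERROR_PATTERNS = [
    (r"\bthe\s", ""),          # article deletion
    (r"\bhas\b", "have"),      # subject-verb agreement error
    (r"\bin\b", "on"),         # preposition confusion
]

def inject_errors(sentence, n_errors=1, seed=0):
    """Apply n_errors randomly chosen patterns, each at most once."""
    random.seed(seed)
    for pattern, replacement in random.sample(ERROR_PATTERNS, n_errors):
        sentence = re.sub(pattern, replacement, sentence, count=1)
    return sentence

print(inject_errors("She has lived in the city for years.", n_errors=2))
```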
Proceedings of NIPS 2017 Workshop on Machine Learning for the Developing World
Title | Proceedings of NIPS 2017 Workshop on Machine Learning for the Developing World |
Authors | Maria De-Arteaga, William Herlands |
Abstract | This volume contains the proceedings of the NIPS 2017 Workshop on Machine Learning for the Developing World, held in Long Beach, California, USA on December 8, 2017. |
Tasks | |
Published | 2017-11-27 |
URL | http://arxiv.org/abs/1711.09522v2 |
PDF | http://arxiv.org/pdf/1711.09522v2.pdf |
PWC | https://paperswithcode.com/paper/proceedings-of-nips-2017-workshop-on-machine |
Repo | |
Framework | |
Weight Initialization of Deep Neural Networks(DNNs) using Data Statistics
Title | Weight Initialization of Deep Neural Networks(DNNs) using Data Statistics |
Authors | Saiprasad Koturwar, Shabbir Merchant |
Abstract | Deep neural networks (DNNs) form the backbone of almost every state-of-the-art technique in fields such as computer vision, speech processing, and text analysis. Recent advances in computational technology have made the use of DNNs more practical. Despite the impressive performance of DNNs and these advances, very few researchers train their models from scratch; training DNNs remains a difficult and tedious job. The main challenges researchers face when training DNNs are the vanishing/exploding gradient problem and the highly non-convex nature of the objective function, which can have millions of variables. The He and Xavier schemes address the vanishing gradient problem by providing sophisticated initialization techniques. These approaches have been quite effective and have achieved good results on standard datasets, but they do not work as well on more practical datasets. We believe the reason is that they do not make use of data statistics when initializing the network weights. Optimizing such a high-dimensional loss function requires careful initialization of the network weights. In this work, we propose a data-dependent initialization and analyze its performance against standard initialization techniques such as He and Xavier. We performed our experiments on several practical datasets and the results show our algorithm's superior classification accuracy. |
Tasks | |
Published | 2017-10-29 |
URL | http://arxiv.org/abs/1710.10570v2 |
PDF | http://arxiv.org/pdf/1710.10570v2.pdf |
PWC | https://paperswithcode.com/paper/weight-initialization-of-deep-neural |
Repo | |
Framework | |
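One plausible data-dependent scheme in the spirit of the abstract above (an assumption about the general idea, not the paper's exact algorithm): forward a batch of real data through the network and rescale each layer's weights until its outputs have roughly unit variance.

```python
import torch
import torch.nn as nn

def data_dependent_init(model, data_batch, tol=0.05, max_iters=10):
    """Rescale each linear layer so its pre-activations over real data have ~unit variance."""
    x = data_batch
    for layer in model:
        if isinstance(layer, nn.Linear):
            for _ in range(max_iters):
                std = layer(x).std().item()
                if abs(std - 1.0) < tol:
                    break
                layer.weight.data /= std        # rescale toward unit output variance
        x = layer(x)                            # propagate the real data to the next layer

model = nn.Sequential(nn.Linear(100, 64), nn.ReLU(), nn.Linear(64, 10))
batch = 5.0 * torch.randn(256, 100) + 2.0       # stand-in for real, non-normalized data
with torch.no_grad():
    data_dependent_init(model, batch)
print(model[0](batch).std())                    # close to 1 after initialization
```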
Keypoint-based object tracking and localization using networks of low-power embedded smart cameras
Title | Keypoint-based object tracking and localization using networks of low-power embedded smart cameras |
Authors | Ibrahim Abdelkader, Yasser El-Sonbaty, Mohamed El-Habrouk |
Abstract | Object tracking and localization is a complex task that typically requires processing power beyond the capabilities of low-power embedded cameras. This paper presents a new approach to real-time object tracking and localization using a multi-view binary keypoint descriptor. The proposed approach offers a compromise between processing power, accuracy, and networking bandwidth, and has been tested using multiple distributed low-power smart cameras. Additionally, multiple optimization techniques are presented to improve the performance of the keypoint descriptor on low-power embedded systems. |
Tasks | Object Tracking |
Published | 2017-11-09 |
URL | http://arxiv.org/abs/1712.01635v1 |
PDF | http://arxiv.org/pdf/1712.01635v1.pdf |
PWC | https://paperswithcode.com/paper/keypoint-based-object-tracking-and |
Repo | |
Framework | |
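Binary-keypoint matching of the kind relied on above can be sketched with ORB from OpenCV as a stand-in detector/descriptor; the paper's multi-view descriptor and the camera-network layer are not reproduced here.

```python
import cv2
import numpy as np

def match_binary_keypoints(frame_a, frame_b, max_matches=50):
    orb = cv2.ORB_create(nfeatures=500)                          # binary descriptors, cheap to match
    kp_a, des_a = orb.detectAndCompute(frame_a, None)
    kp_b, des_b = orb.detectAndCompute(frame_b, None)
    if des_a is None or des_b is None:
        return []                                                # no keypoints found
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)   # Hamming distance for binary codes
    matches = sorted(matcher.match(des_a, des_b), key=lambda m: m.distance)
    return matches[:max_matches]

rng = np.random.default_rng(0)
frame = rng.integers(0, 255, (240, 320), dtype=np.uint8)         # stand-in grayscale frame
print(len(match_binary_keypoints(frame, frame)))
```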
Is a Data-Driven Approach still Better than Random Choice with Naive Bayes classifiers?
Title | Is a Data-Driven Approach still Better than Random Choice with Naive Bayes classifiers? |
Authors | Piotr Szymański, Tomasz Kajdanowicz |
Abstract | We study the performance of data-driven, a priori, and random approaches to label space partitioning for multi-label classification with a Gaussian Naive Bayes classifier. Experiments were performed on 12 benchmark data sets and evaluated on five established measures of classification quality: micro- and macro-averaged F1 score, Subset Accuracy, and Hamming loss. Data-driven methods are significantly better than an average run of the random baseline. For F1 scores and Subset Accuracy, data-driven approaches were more likely than not to outperform the random approaches in the worst case. There always exists a method that performs better than the a priori methods in the worst case. The advantage of data-driven methods over a priori methods is smaller with a weak classifier than when tree classifiers are used. |
Tasks | Multi-Label Classification |
Published | 2017-02-13 |
URL | http://arxiv.org/abs/1702.04013v1 |
PDF | http://arxiv.org/pdf/1702.04013v1.pdf |
PWC | https://paperswithcode.com/paper/is-a-data-driven-approach-still-better-than |
Repo | |
Framework | |
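A compact sketch of label-space partitioning with Gaussian Naive Bayes; the fixed two-way split below stands in for either the random baseline or a data-driven partition, and the label-powerset encoding within each partition is an assumption about the setup, not a reproduction of the paper's pipeline.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def train_partitioned(X, Y, partitions):
    """Train one label-powerset GaussianNB per label partition."""
    models = []
    for part in partitions:
        combos = [tuple(row) for row in Y[:, part]]            # label combinations as classes
        classes = {c: i for i, c in enumerate(sorted(set(combos)))}
        y = np.array([classes[c] for c in combos])
        models.append((part, GaussianNB().fit(X, y), {v: k for k, v in classes.items()}))
    return models

def predict_partitioned(models, X, n_labels):
    Y_hat = np.zeros((len(X), n_labels), dtype=int)
    for part, clf, inv in models:
        for i, c in enumerate(clf.predict(X)):
            Y_hat[i, part] = inv[c]                            # each partition fills its own slice
    return Y_hat

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
Y = rng.integers(0, 2, (200, 6))                               # toy multi-label targets
partitions = [[0, 1, 2], [3, 4, 5]]                            # e.g. a random two-way split
models = train_partitioned(X, Y, partitions)
print(predict_partitioned(models, X[:5], n_labels=6))
```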