Paper Group NAWR 14
User-Assisted Shadow Removal. Neural Separation of Observed and Unobserved Distribution. Fast Video Object Segmentation by Reference-Guided Mask Propagation. Neural Machine Translation for English-Tamil. CVM-Net: Cross-View Matching Network for Image-Based Ground-to-Aerial Geo-Localization. Auto-hMDS: Automatic Construction of a Large Heterogeneous Multilingual Multi-Document Summarization Corpus. Cartman: The low-cost Cartesian Manipulator that won the Amazon Robotics Challenge. Deep Neural Networks with Box Convolutions. Cross-lingual Lexical Sememe Prediction. Unsupervised Mining of Analogical Frames by Constraint Satisfaction. Gold Standard Annotations for Preposition and Verb Sense with Semantic Role Labels in Adult-Child Interactions. Trainable Calibration Measures for Neural Networks from Kernel Mean Embeddings. Where is Your Evidence: Improving Fact-checking by Justification Modeling. High Performance Visual Tracking With Siamese Region Proposal Network. Context Based Approach for Second Language Acquisition.
User-Assisted Shadow Removal
Title | User-Assisted Shadow Removal |
Authors | Han Gong; Darren Cosker |
Abstract | This paper presents a novel user-aided method for texture-preserving shadow removal from single images requiring only simple user input. Compared with the state-of-the-art, our algorithm offers the most flexible user interaction to date and produces more accurate and robust shadow removal under thorough quantitative evaluation. Shadow masks are first detected by analysing user-specified shadow feature strokes. Intensity profiles with variable interval and length are then sampled around the shadow boundary, which avoids artefacts arising from uneven boundaries. Texture noise in the samples is removed by local group bilateral filtering, and initial sparse shadow scales are estimated by fitting a piecewise curve to the intensity samples. Remaining errors in the estimated sparse scales are removed by local group smoothing. To relight the image, a dense scale field is produced by in-painting the sparse scales. Finally, a gradual colour correction is applied to remove artefacts due to image post-processing. Using state-of-the-art evaluation data, we quantitatively and qualitatively demonstrate that our method outperforms current leading shadow removal methods. |
Tasks | |
Published | 2018-04-18 |
URL | https://www.sciencedirect.com/science/article/pii/S0262885617300744?via%3Dihub |
PWC | https://paperswithcode.com/paper/user-assisted-shadow-removal |
Repo | https://github.com/hangong/deshadow-curve_solution |
Framework | none |
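The relighting step at the heart of the method — estimate multiplicative shadow scales from intensity samples across the boundary, then divide a dense scale field out of the image — can be illustrated in a few lines. This is a toy sketch under simplifying assumptions (uniform smoothing stands in for the paper's local group bilateral filtering and piecewise curve fitting), not the authors' implementation:

```python
import numpy as np

def scale_profile(profile, lit_level, smooth=5):
    """Estimate multiplicative shadow scales along a 1-D intensity
    profile sampled across the shadow boundary. A simple moving
    average stands in for the paper's bilateral filtering."""
    kernel = np.ones(smooth) / smooth
    smoothed = np.convolve(profile, kernel, mode="same")  # suppress texture noise
    return np.clip(smoothed / lit_level, 1e-3, 1.0)

def relight(image, scale_field):
    """Remove the shadow by dividing out a dense per-pixel scale field
    (obtained in the paper by in-painting the sparse scales).

    image: float RGB array in [0, 1], shape (H, W, 3).
    scale_field: per-pixel scales in (0, 1], shape (H, W); 1.0 = fully lit.
    """
    return np.clip(image / scale_field[..., None], 0.0, 1.0)
```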
Neural Separation of Observed and Unobserved Distribution
Title | Neural Separation of Observed and Unobserved Distribution |
Authors | Tavi Halperin, Ariel Ephrat, Yedid Hoshen |
Abstract | Separating mixed distributions is a long-standing challenge for machine learning and signal processing. Most current methods either rely on making strong assumptions about the source distributions or on having training samples of each source in the mixture. In this work, we introduce a new method—Neural Egg Separation—to tackle the scenario of extracting a signal from an unobserved distribution additively mixed with a signal from an observed distribution. Our method iteratively learns to separate the known distribution from progressively finer estimates of the unknown distribution. Because Neural Egg Separation is sensitive to initialization in some settings, we also introduce Latent Mixture Masking, which ensures a good initialization. Extensive experiments on audio and image separation tasks show that our method outperforms current methods that use the same level of supervision, and often achieves performance similar to full supervision. |
Tasks | |
Published | 2018-11-30 |
URL | https://arxiv.org/abs/1811.12739 |
PDF | https://arxiv.org/pdf/1811.12739.pdf |
PWC | https://paperswithcode.com/paper/neural-separation-of-observed-and-unobserved-2 |
Repo | https://github.com/tavihalperin/Neural-Egg-Seperation |
Framework | pytorch |
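The iterative scheme in the abstract — alternate between (i) training a separator on synthetic mixtures built from observed samples plus the current estimates of the unobserved component, and (ii) refining those estimates on the real mixtures — can be sketched as below. The network shape, optimizer settings, and loop counts are illustrative assumptions, not the paper's configuration:

```python
import torch
import torch.nn as nn

def neural_egg_separation(mixtures, observed, steps=5, epochs=100):
    """Toy sketch of the Neural Egg Separation loop (not the authors' code).

    mixtures: (N, D) tensor of mixed signals x + b, with x unobserved.
    observed: (M, D) tensor of samples from the observed distribution b.
    Returns a network f that predicts the observed component of a mixture.
    """
    dim = mixtures.shape[1]
    f = nn.Sequential(nn.Linear(dim, 256), nn.ReLU(), nn.Linear(256, dim))
    opt = torch.optim.Adam(f.parameters(), lr=1e-3)

    # Initial crude estimate of the unobserved component.
    x_hat = mixtures - observed.mean(0)

    for _ in range(steps):
        for _ in range(epochs):
            # Synthetic mixtures: current unobserved estimates plus
            # freshly drawn observed samples.
            b = observed[torch.randint(len(observed), (len(x_hat),))]
            synth = x_hat.detach() + b
            loss = ((f(synth) - b) ** 2).mean()  # predict the observed part
            opt.zero_grad(); loss.backward(); opt.step()
        # Refine the unobserved estimates on the real mixtures.
        with torch.no_grad():
            x_hat = mixtures - f(mixtures)
    return f
```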
Fast Video Object Segmentation by Reference-Guided Mask Propagation
Title | Fast Video Object Segmentation by Reference-Guided Mask Propagation |
Authors | Seoung Wug Oh, Joon-Young Lee, Kalyan Sunkavalli, Seon Joo Kim |
Abstract | We present an efficient method for semi-supervised video object segmentation. Our method achieves accuracy competitive with state-of-the-art methods while running in a fraction of the time required by others. To this end, we propose a deep Siamese encoder-decoder network that is designed to take advantage of mask propagation and object detection while avoiding the weaknesses of both approaches. Our network, learned through a two-stage training process that exploits both synthetic and real data, works robustly without any online learning or post-processing. We validate our method on four benchmark sets that cover single and multiple object segmentation. On all benchmark sets, our method shows comparable accuracy while running an order of magnitude faster. We also provide extensive ablation and add-on studies to analyze and evaluate our framework. |
Tasks | Object Detection, Semantic Segmentation, Semi-supervised Video Object Segmentation, Video Object Segmentation, Video Semantic Segmentation, Visual Object Tracking |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Oh_Fast_Video_Object_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Oh_Fast_Video_Object_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/fast-video-object-segmentation-by-reference |
Repo | https://github.com/xanderchf/RGMP |
Framework | pytorch |
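A minimal sketch of the two-stream idea: a shared (Siamese) encoder sees the reference frame with its given mask and the current frame with the previous frame's mask, and a decoder fuses both streams into the current mask. Layer sizes and the fusion scheme here are placeholders, not the paper's architecture:

```python
import torch
import torch.nn as nn

class RefGuidedSegNet(nn.Module):
    """Toy Siamese encoder-decoder in the spirit of RGMP. The shared
    encoder takes 4 channels: RGB plus a mask (the reference mask for
    the reference stream, the previous frame's mask for the target)."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(4, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(64, 1, 3, padding=1))

    def forward(self, ref_rgb, ref_mask, cur_rgb, prev_mask):
        ref = self.encoder(torch.cat([ref_rgb, ref_mask], dim=1))
        cur = self.encoder(torch.cat([cur_rgb, prev_mask], dim=1))
        # Fuse the two streams and decode mask logits for the current frame.
        return self.decoder(torch.cat([ref, cur], dim=1))
```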
Neural Machine Translation for English-Tamil
Title | Neural Machine Translation for English-Tamil |
Authors | Himanshu Choudhary, Aditya Kumar Pathak, Rajiv Ratan Saha, Ponnurangam Kumaraguru |
Abstract | A huge amount of valuable content is available on the web in English and is often translated into local languages to facilitate knowledge sharing among local people who are not very familiar with English. However, translating such content manually is a tedious, costly, and time-consuming process. To this end, machine translation is an efficient approach to translating text without any human involvement. Neural machine translation (NMT) is one of the most recent and effective translation techniques amongst all existing machine translation systems. In this paper, we apply NMT to the English-Tamil language pair. We propose a novel neural machine translation technique that uses word embeddings along with Byte-Pair Encoding (BPE) to develop an efficient translation system that overcomes the OOV (Out Of Vocabulary) problem for languages which do not have many translations available online. We use the BLEU score to evaluate system performance. Experimental results confirm that our proposed MIDAS translator (8.33 BLEU) outperforms Google Translate (3.75 BLEU). |
Tasks | Machine Translation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-6459/ |
PWC | https://paperswithcode.com/paper/neural-machine-translation-for-english-tamil |
Repo | https://github.com/precog-iiitd/MIDAS-NMT-English-Tamil |
Framework | none |
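Byte-Pair Encoding, which the paper uses to tame the OOV problem, greedily merges the most frequent adjacent symbol pairs so that rare words decompose into known subwords. A minimal merge learner in the style of Sennrich et al. (2016) — a sketch of the general technique, not the MIDAS pipeline itself:

```python
import re
from collections import Counter

def learn_bpe(words, num_merges):
    """Learn BPE merge operations from a token list."""
    # Represent each word as space-separated symbols plus an end marker.
    vocab = Counter(" ".join(w) + " </w>" for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            symbols = word.split()
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Merge the winning pair everywhere it occurs as whole symbols.
        pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(best)) + r"(?!\S)")
        new_vocab = Counter()
        for word, freq in vocab.items():
            new_vocab[pattern.sub("".join(best), word)] += freq
        vocab = new_vocab
    return merges

# e.g. learn_bpe(["low", "lower", "newest", "widest"], 10)
```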
CVM-Net: Cross-View Matching Network for Image-Based Ground-to-Aerial Geo-Localization
Title | CVM-Net: Cross-View Matching Network for Image-Based Ground-to-Aerial Geo-Localization |
Authors | Sixing Hu, Mengdan Feng, Rang M. H. Nguyen, Gim Hee Lee |
Abstract | The problem of localization on a geo-referenced aerial/satellite map given a query ground view image remains challenging due to the drastic change in viewpoint, which causes matching based on traditional image descriptors to fail. We leverage the recent success of deep learning to propose CVM-Net for the cross-view image-based ground-to-aerial geo-localization task. Specifically, our network is based on the Siamese architecture and performs metric learning for the matching task. We first use fully convolutional layers to extract local image features, which are then encoded into global image descriptors using the powerful NetVLAD. As part of the training procedure, we also introduce a simple yet effective weighted soft-margin ranking loss function that not only speeds up training convergence but also improves the final matching accuracy. Experimental results show that our proposed network significantly outperforms state-of-the-art approaches on two existing benchmark datasets. |
Tasks | Metric Learning |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Hu_CVM-Net_Cross-View_Matching_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Hu_CVM-Net_Cross-View_Matching_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/cvm-net-cross-view-matching-network-for-image |
Repo | https://github.com/david-husx/crossview_localisation |
Framework | tf |
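The weighted soft-margin ranking loss named in the abstract is commonly written as log(1 + e^{α(d_pos − d_neg)}), where α scales the gradient and speeds up convergence. Take the exact form and the α value below as assumptions of this sketch rather than the paper's verbatim definition:

```python
import torch
import torch.nn.functional as F

def weighted_soft_margin_loss(d_pos: torch.Tensor,
                              d_neg: torch.Tensor,
                              alpha: float = 10.0) -> torch.Tensor:
    """Soft-margin ranking loss: log(1 + exp(alpha * (d_pos - d_neg))).

    d_pos: distances between matching ground/aerial descriptor pairs.
    d_neg: distances between non-matching pairs.
    alpha: weighting that sharpens the loss (an assumed value here).
    """
    # softplus(x) = log(1 + e^x), computed in a numerically stable way.
    return F.softplus(alpha * (d_pos - d_neg)).mean()
```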
Auto-hMDS: Automatic Construction of a Large Heterogeneous Multilingual Multi-Document Summarization Corpus
Title | Auto-hMDS: Automatic Construction of a Large Heterogeneous Multilingual Multi-Document Summarization Corpus |
Authors | Markus Zopf |
Abstract | |
Tasks | Abstractive Text Summarization, Document Summarization, Machine Translation, Multi-Document Summarization, Question Answering, Text Summarization |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1510/ |
PWC | https://paperswithcode.com/paper/auto-hmds-automatic-construction-of-a-large |
Repo | https://github.com/AIPHES/auto-hMDS |
Framework | none |
Cartman: The low-cost Cartesian Manipulator that won the Amazon Robotics Challenge
Title | Cartman: The low-cost Cartesian Manipulator that won the Amazon Robotics Challenge |
Authors | D. Morrison, A.W. Tow, M. McTaggart, R. Smith, N. Kelly-Boxall, S. Wade-McCue, J. Erskine, R. Grinover, A. Gurman, T. Hunn, D. Lee, A. Milan, T. Pham, G. Rallos, A. Razjigaev, T. Rowntree, K. Vijay, Z. Zhuang, C. Lehnert, I. Reid, P. Corke, J. Leitner |
Abstract | The Amazon Robotics Challenge enlisted sixteen teams to each design a pick-and-place robot for autonomous warehousing, addressing developments in robotic vision and manipulation. This paper presents the design of our custom-built, cost-effective Cartesian robot system, Cartman, which won first place in the competition finals by stowing 14 (out of 16) and picking all 9 items in 27 minutes, scoring a total of 272 points. We highlight our experience-centred design methodology and the key aspects of our system that contributed to our competitiveness. We believe these aspects are crucial to building robust and effective robotic systems. |
Tasks | Robotic Grasping |
Published | 2018-02-28 |
URL | https://arxiv.org/abs/1709.06283 |
PDF | https://arxiv.org/pdf/1709.06283 |
PWC | https://paperswithcode.com/paper/cartman-the-low-cost-cartesian-manipulator |
Repo | https://github.com/warehouse-picking-automation-challenges/team_acrv_2017 |
Framework | none |
Deep Neural Networks with Box Convolutions
Title | Deep Neural Networks with Box Convolutions |
Authors | Egor Burkov, Victor Lempitsky |
Abstract | Box filters computed using integral images have been part of the computer vision toolset for a long time. Here, we show that a convolutional layer that computes box filter responses in a sliding manner can be used within deep architectures, and that the dimensions and offsets of the sliding boxes in such a layer can be learned as part of an end-to-end loss minimization. Crucially, the training process can make the boxes in such a layer arbitrarily large without incurring extra computational cost and without increasing the number of learnable parameters. Due to its ability to integrate information over large boxes, the new layer facilitates long-range propagation of information and efficiently enlarges the receptive fields of downstream units in the network. By incorporating the new layer into existing architectures for semantic segmentation, we achieve both an increase in segmentation accuracy and a decrease in computational cost and the number of learnable parameters. |
Tasks | Semantic Segmentation |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7859-deep-neural-networks-with-box-convolutions |
PDF | http://papers.nips.cc/paper/7859-deep-neural-networks-with-box-convolutions.pdf |
PWC | https://paperswithcode.com/paper/deep-neural-networks-with-box-convolutions |
Repo | https://github.com/shrubb/box-convolutions |
Framework | pytorch |
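The constant-cost property the abstract relies on is the classic summed-area-table trick: once an integral image is built, the sum over any axis-aligned box takes four lookups, regardless of box size. A NumPy sketch of that primitive (the paper's learnable CUDA layer builds on it; this shows only the underlying idea):

```python
import numpy as np

def integral_image(img):
    """Summed-area table with an extra zero row/column so that
    ii[r, c] == img[:r, :c].sum()."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = img.cumsum(0).cumsum(1)
    return ii

def box_filter(img, top, left, bottom, right):
    """Mean of img over the box [y+top, y+bottom) x [x+left, x+right)
    at every pixel (y, x): four lookups per pixel, independent of box
    size. Out-of-range reads are handled by zero padding."""
    h, w = img.shape
    pad_t, pad_l = max(0, -top), max(0, -left)
    pad_b, pad_r = max(0, bottom), max(0, right)
    ii = integral_image(np.pad(img, ((pad_t, pad_b), (pad_l, pad_r))))
    ys = np.arange(h)[:, None] + pad_t
    xs = np.arange(w)[None, :] + pad_l
    area = (bottom - top) * (right - left)
    return (ii[ys + bottom, xs + right] - ii[ys + top, xs + right]
            - ii[ys + bottom, xs + left] + ii[ys + top, xs + left]) / area
```

For example, `box_filter(img, -8, -8, 9, 9)` averages over a 17×17 window at the same per-pixel cost as a 3×3 one, which is exactly what lets the learned boxes grow for free.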
Cross-lingual Lexical Sememe Prediction
Title | Cross-lingual Lexical Sememe Prediction |
Authors | Fanchao Qi, Yankai Lin, Maosong Sun, Hao Zhu, Ruobing Xie, Zhiyuan Liu |
Abstract | Sememes are defined as the minimum semantic units of human languages. As important knowledge sources, sememe-based linguistic knowledge bases have been widely used in many NLP tasks. However, most languages still do not have sememe-based linguistic knowledge bases. We therefore present the task of cross-lingual lexical sememe prediction, which aims to automatically predict sememes for words in other languages. We propose a novel framework that models correlations between sememes and multilingual words in a low-dimensional semantic space for sememe prediction. Experimental results on real-world datasets show that our proposed model achieves consistent and significant improvements over baseline methods in cross-lingual sememe prediction. The code and data of this paper are available at \url{https://github.com/thunlp/CL-SP}. |
Tasks | cross-lingual sememe prediction, Learning Word Embeddings, Multilingual Word Embeddings, Word Embeddings |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1033/ |
PWC | https://paperswithcode.com/paper/cross-lingual-lexical-sememe-prediction |
Repo | https://github.com/thunlp/CL-SP |
Framework | none |
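One simple way to make the abstract's idea concrete: once words from both languages live in a shared low-dimensional space, a new word's sememes can be scored by a similarity-weighted vote over its nearest source-language neighbours. This is a hedged stand-in for the general approach, not the paper's actual model:

```python
import numpy as np

def predict_sememes(word_vec, src_vecs, src_sememes, k=10):
    """Score candidate sememes for a target-language word by a
    similarity-weighted vote over its k nearest source-language
    neighbours in a shared embedding space (a toy baseline).

    word_vec: (d,) embedding of the target-language word.
    src_vecs: (n, d) embeddings of source words with known sememes.
    src_sememes: list of n sets of sememe ids.
    """
    sims = src_vecs @ word_vec / (
        np.linalg.norm(src_vecs, axis=1) * np.linalg.norm(word_vec) + 1e-9)
    top = np.argsort(-sims)[:k]
    scores = {}
    for i in top:
        for s in src_sememes[i]:
            scores[s] = scores.get(s, 0.0) + sims[i]
    return sorted(scores, key=scores.get, reverse=True)
```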
Unsupervised Mining of Analogical Frames by Constraint Satisfaction
Title | Unsupervised Mining of Analogical Frames by Constraint Satisfaction |
Authors | Lance De Vine, Shlomo Geva, Peter Bruza |
Abstract | It has been demonstrated that vector-based representations of words trained on large text corpora encode linguistic regularities that may be exploited via the use of vector space arithmetic. This capability has been extensively explored and is generally measured via tasks which involve the automated completion of linguistic proportional analogies. The question remains, however, as to what extent it is possible to induce relations from word embeddings in a principled and systematic way, without the provision of exemplars or seed terms. In this paper we propose an extensible and efficient framework for inducing relations via the use of constraint satisfaction. The method is efficient, unsupervised and can be customized in various ways. We provide both quantitative and qualitative analysis of the results. |
Tasks | Language Acquisition, Machine Translation, Word Embeddings |
Published | 2018-12-01 |
URL | https://www.aclweb.org/anthology/U18-1004/ |
PWC | https://paperswithcode.com/paper/unsupervised-mining-of-analogical-frames-by |
Repo | https://github.com/ldevine/AFM |
Framework | none |
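The linguistic regularities the abstract refers to are usually probed with vector-offset analogy completion (a : b :: c : ?). The paper's contribution is mining whole analogical frames without seed terms via constraint satisfaction; the sketch below shows only the standard completion primitive it builds on:

```python
import numpy as np

def complete_analogy(a: np.ndarray, b: np.ndarray, c: np.ndarray,
                     vocab_vecs: np.ndarray, exclude: list) -> int:
    """Return the vocabulary index best completing a : b :: c : ?.

    a, b, c: (d,) unit-normalized word vectors.
    vocab_vecs: (V, d) row-normalized vocabulary matrix.
    exclude: indices of a, b and c, skipped when ranking.
    """
    target = b - a + c
    scores = vocab_vecs @ (target / np.linalg.norm(target))
    scores[exclude] = -np.inf  # never return the query words themselves
    return int(np.argmax(scores))
```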
Gold Standard Annotations for Preposition and Verb Sense with Semantic Role Labels in Adult-Child Interactions
Title | Gold Standard Annotations for Preposition and Verb Sense with Semantic Role Labels in Adult-Child Interactions |
Authors | Lori Moon, Christos Christodoulopoulos, Cynthia Fisher, Sandra Franco, Dan Roth |
Abstract | This paper describes the augmentation of an existing corpus of child-directed speech. The resulting corpus is a gold-standard labeled corpus for supervised learning of semantic role labels in adult-child dialogues. Semantic role labeling (SRL) models assign semantic roles to sentence constituents, thus indicating who has done what to whom (and in what way). The current corpus is derived from the Adam files in the Brown corpus (Brown, 1973) of the CHILDES corpora, and augments the partial annotation described in Connor et al. (2010). It provides labels for both semantic arguments of verbs and semantic arguments of prepositions. The semantic role labels and senses of verbs follow PropBank guidelines (Kingsbury and Palmer, 2002; Gildea and Palmer, 2002; Palmer et al., 2005) and those for prepositions follow Srikumar and Roth (2011). The corpus was annotated by two annotators, and inter-annotator agreement is given separately for prepositions and verbs, and for adult speech and child speech. Overall, across child and adult samples and including verbs and prepositions, the kappa score for sense is 72.6; for the number of semantic-role-bearing arguments, 77.4; for identical semantic role labels on a given argument, 91.1; and for the span of semantic role labels, 93.9. The sense and number of arguments were often open to multiple interpretations in child speech, due to the rapidly changing discourse and the omission of constituents in production. Annotators used a discourse context window of ten sentences before and ten sentences after the target utterance to determine the annotation labels. The derived corpus is available in CHAT (MacWhinney, 2000) and XML formats. |
Tasks | Language Acquisition, Part-Of-Speech Tagging, Semantic Role Labeling |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1254/ |
PWC | https://paperswithcode.com/paper/gold-standard-annotations-for-preposition-and |
Repo | https://github.com/CogComp/child-discourse-SRL |
Framework | none |
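The agreement figures above are kappa statistics; for two annotators, Cohen's kappa corrects raw agreement for chance. A minimal reference implementation (the abstract does not specify the exact variant used, so take this as the standard two-rater form):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e), where p_o is observed
    agreement and p_e is the agreement expected by chance from each
    annotator's label distribution."""
    n = len(labels_a)
    p_o = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    ca, cb = Counter(labels_a), Counter(labels_b)
    p_e = sum(ca[label] * cb[label] for label in ca) / (n * n)
    return (p_o - p_e) / (1 - p_e)
```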
Trainable Calibration Measures for Neural Networks from Kernel Mean Embeddings
Title | Trainable Calibration Measures for Neural Networks from Kernel Mean Embeddings |
Authors | Aviral Kumar, Sunita Sarawagi, Ujjwal Jain |
Abstract | Modern neural networks have recently been found to be poorly calibrated, primarily in the direction of over-confidence. Methods like entropy penalties and temperature smoothing improve calibration by clamping confidence, but in doing so compromise the many legitimately confident predictions. We propose a more principled fix that minimizes an explicit calibration error during training. We present MMCE, an RKHS-kernel-based measure of calibration that is efficiently trainable alongside the negative log-likelihood loss without careful hyper-parameter tuning. Theoretically, too, MMCE is a sound measure of calibration that is minimized at perfect calibration, and whose finite-sample estimates are consistent and enjoy fast convergence rates. Extensive experiments on several network architectures demonstrate that MMCE is a fast, stable, and accurate method for minimizing calibration error while maximally preserving the number of high-confidence predictions. |
Tasks | Calibration |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2326 |
PDF | http://proceedings.mlr.press/v80/kumar18a/kumar18a.pdf |
PWC | https://paperswithcode.com/paper/trainable-calibration-measures-for-neural |
Repo | https://github.com/aviralkumar2907/MMCE |
Framework | tf |
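MMCE is the RKHS norm of the mean embedding of the signed calibration residuals (correctness minus confidence) under a kernel over confidences. A sketch of the batch statistic with a Laplacian kernel — the kernel choice and width below are assumptions of this illustration, not guaranteed to match the paper's exact settings:

```python
import torch

def mmce(confidences: torch.Tensor, correct: torch.Tensor,
         sigma: float = 0.4) -> torch.Tensor:
    """MMCE-style calibration penalty for one batch.

    confidences: (m,) top-label probabilities (differentiable).
    correct: (m,) 0/1 indicators of whether each prediction was right.
    sigma: Laplacian kernel width (an assumed value).
    """
    diff = correct.float() - confidences  # c_i - r_i
    # Laplacian kernel over confidences: k(r, r') = exp(-|r - r'| / sigma)
    k = torch.exp(-(confidences[:, None] - confidences[None, :]).abs() / sigma)
    m = confidences.shape[0]
    return ((diff[:, None] * diff[None, :] * k).sum() / (m * m)).sqrt()
```

Training would then minimize the usual NLL plus a weighted `mmce(...)` term.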
Where is Your Evidence: Improving Fact-checking by Justification Modeling
Title | Where is Your Evidence: Improving Fact-checking by Justification Modeling |
Authors | Tariq Alhindi, Savvas Petridis, Smaranda Muresan |
Abstract | Fact-checking is a journalistic practice that compares a claim made publicly against trusted sources of facts. Wang (2017) introduced a large dataset of validated claims from the POLITIFACT.com website (LIAR dataset), enabling the development of machine learning approaches for fact-checking. However, approaches based on this dataset have focused primarily on modeling the claim and speaker-related metadata, without considering the evidence used by humans in labeling the claims. We extend the LIAR dataset by automatically extracting the justification from the fact-checking article used by humans to label a given claim. We show that modeling the extracted justification in conjunction with the claim (and metadata) provides a significant improvement regardless of the machine learning model used (feature-based or deep learning) both in a binary classification task (true, false) and in a six-way classification task (pants on fire, false, mostly false, half true, mostly true, true). |
Tasks | Argument Mining, Emotion Recognition |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-5513/ |
PWC | https://paperswithcode.com/paper/where-is-your-evidence-improving-fact |
Repo | https://github.com/ekagra-ranjan/fake-news-detection-LIAR-pytorch |
Framework | pytorch |
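The paper's feature-based variant pairs each claim with its extracted justification before classification. A minimal scikit-learn baseline in that spirit — the separator token and field names are illustrative, not from the paper:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def claim_plus_justification_model():
    """A simple feature-based classifier over the claim text
    concatenated with its extracted justification."""
    return make_pipeline(TfidfVectorizer(ngram_range=(1, 2), min_df=2),
                         LogisticRegression(max_iter=1000))

# Usage sketch with hypothetical fields from the extended LIAR data:
# texts = [c + " [JUST] " + j for c, j in zip(claims, justifications)]
# model = claim_plus_justification_model().fit(texts, labels)
```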
High Performance Visual Tracking With Siamese Region Proposal Network
Title | High Performance Visual Tracking With Siamese Region Proposal Network |
Authors | Bo Li, Junjie Yan, Wei Wu, Zheng Zhu, Xiaolin Hu |
Abstract | Visual object tracking has been a fundamental topic in recent years, and many deep learning based trackers have achieved state-of-the-art performance on multiple benchmarks. However, most of these trackers can hardly attain top performance at real-time speed. In this paper, we propose the Siamese region proposal network (Siamese-RPN), which is trained end-to-end offline with large-scale image pairs. Specifically, it consists of a Siamese subnetwork for feature extraction and a region proposal subnetwork comprising a classification branch and a regression branch. In the inference phase, the proposed framework is formulated as a local one-shot detection task. We can pre-compute the template branch of the Siamese subnetwork and formulate the correlation layers as trivial convolution layers to perform online tracking. Benefiting from the proposal refinement, traditional multi-scale testing and online fine-tuning can be discarded. Siamese-RPN runs at 160 FPS while achieving leading performance in the VOT2015, VOT2016 and VOT2017 real-time challenges. |
Tasks | Object Tracking, Visual Object Tracking, Visual Tracking |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Li_High_Performance_Visual_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Li_High_Performance_Visual_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/high-performance-visual-tracking-with-siamese |
Repo | https://github.com/zkisthebest/Siamese-RPN |
Framework | pytorch |
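The trick the abstract mentions — pre-compute the template branch, then treat correlation as ordinary convolution — is easy to show in isolation: the template's feature map simply becomes the convolution kernel applied to the search-region features. A sketch of that step only, not the full Siamese-RPN head:

```python
import torch
import torch.nn.functional as F

def correlation_as_conv(template_feat: torch.Tensor,
                        search_feat: torch.Tensor) -> torch.Tensor:
    """Cross-correlate template features with search-region features.

    template_feat: (C, h, w) exemplar features, pre-computed once.
    search_feat: (1, C, H, W) features of the current search region.
    Returns a (1, 1, H-h+1, W-w+1) response map.
    """
    kernel = template_feat.unsqueeze(0)  # (1, C, h, w) conv kernel
    return F.conv2d(search_feat, kernel)
```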
Context Based Approach for Second Language Acquisition
Title | Context Based Approach for Second Language Acquisition |
Authors | Nihal V. Nayak, Arjun R. Rao |
Abstract | SLAM 2018 focuses on predicting a student's mistakes while using the Duolingo application. In this paper, we describe the system we developed for this shared task. Our system uses a logistic regression model to predict the likelihood of a student making a mistake while answering an exercise on Duolingo in all three language tracks - English/Spanish (en/es), Spanish/English (es/en) and French/English (fr/en). We conduct an ablation study with several features during the development of this system and discover that context-based features play a major role in language acquisition modeling. Our model beats Duolingo's baseline scores in all three language tracks (AUROC scores: en/es = 0.821, es/en = 0.790 and fr/en = 0.812). Our work makes a case for providing favourable textual context for students while they learn a second language. |
Tasks | Language Acquisition |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/W18-0524/ |
PWC | https://paperswithcode.com/paper/context-based-approach-for-second-language |
Repo | https://github.com/arjun-rao/slam18 |
Framework | none |
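The described setup — logistic regression over per-token features, with context features doing most of the work — maps directly onto a standard scikit-learn pipeline. A minimal sketch; the feature names below are hypothetical, not the authors' exact feature set:

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def mistake_model():
    """Logistic regression over per-token feature dicts, including
    context-based features like the surrounding tokens."""
    return make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))

# Hypothetical feature dicts for one exercise token:
# feats = [{"token": "gato", "prev_token": "el", "next_token": "<eos>",
#           "pos": "NOUN", "exercise_format": "reverse_translate"}]
# model = mistake_model().fit(feats, labels)  # labels: 1 = mistake
```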