Paper Group ANR 1270
Character 3-gram Mover’s Distance: An Effective Method for Detecting Near-duplicate Japanese-language Recipes
Title | Character 3-gram Mover’s Distance: An Effective Method for Detecting Near-duplicate Japanese-language Recipes |
Authors | Masaki Oguni, Yohei Seki, Yu Hirate |
Abstract | In user-generated recipe websites, users post their original recipes. Some recipes, however, are very similar to other recipes in major components such as the cooking instructions. We refer to such recipes as “near-duplicate recipes”. In this study, we propose a method that extends the “Word Mover’s Distance”, which calculates distances between texts based on word embedding, to character 3-gram embedding. Using a corpus of over 1.21 million recipes, we learned the word embedding and the character 3-gram embedding by using a Skip-Gram model with negative sampling and fastText to extract candidate pairs of near-duplicate recipes. We then annotated these candidates and evaluated the proposed method against a comparison method. Our results demonstrated that near-duplicate recipes that were not detected by the comparison method were successfully detected by the proposed method. |
Tasks | |
Published | 2019-12-11 |
URL | https://arxiv.org/abs/1912.05171v2 |
https://arxiv.org/pdf/1912.05171v2.pdf | |
PWC | https://paperswithcode.com/paper/character-3-gram-movers-distance-an-effective |
Repo | |
Framework | |
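The idea in the abstract above, replacing words with character 3-grams inside a mover's distance, can be sketched as follows. This is a minimal illustration, not the authors' implementation: it computes a one-sided relaxed mover's distance (each 3-gram sends all of its mass to its nearest counterpart, which gives a lower bound on the exact optimal-transport distance), and `toy_embed` is a hypothetical stand-in for the learned Skip-Gram or fastText embedding.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return the list of overlapping character n-grams in text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def normalized_bow(grams):
    """Map each n-gram to its relative frequency (weights sum to 1)."""
    counts = Counter(grams)
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()}


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def relaxed_mover_distance(text_a, text_b, embed, n=3):
    """One-sided relaxed mover's distance from text_a to text_b.

    Each n-gram of text_a moves all of its mass to the closest n-gram
    of text_b in embedding space. This relaxation is an assumption of
    the sketch; the exact distance solves a transport problem instead.
    """
    bow_a = normalized_bow(char_ngrams(text_a, n))
    grams_b = set(char_ngrams(text_b, n))
    return sum(
        weight * min(euclidean(embed(g), embed(h)) for h in grams_b)
        for g, weight in bow_a.items()
    )


def toy_embed(gram):
    """Hypothetical embedding: scaled code points, NOT a learned model."""
    return [ord(ch) / 1000.0 for ch in gram]
```

With a learned embedding in place of `toy_embed`, pairs of recipes whose distance falls below a chosen threshold would be the near-duplicate candidates; the threshold itself is not given in the abstract.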
Histographs: Graphs in Histopathology
Title | Histographs: Graphs in Histopathology |
Authors | Shrey Gadiya, Deepak Anand, Amit Sethi |
Abstract | Spatial arrangements of cells of various types, such as tumor infiltrating lymphocytes and the advancing edge of a tumor, are important features for detecting and characterizing cancers. However, convolutional neural networks (CNNs) do not explicitly extract intricate features of the spatial arrangements of the cells from histopathology images. In this work, we propose to classify cancers using graph convolutional networks (GCNs) by modeling a tissue section as a multi-attributed spatial graph of its constituent cells. Cells are detected using their nuclei in H&E-stained tissue images, and each cell’s appearance is captured as a multi-attributed high-dimensional vertex feature. The spatial relations between neighboring cells are captured as edge features based on their distances in a graph. We demonstrate the utility of this approach by obtaining classification accuracy that is competitive with CNNs, specifically Inception-v3, on two tasks (cancerous versus non-cancerous, and in situ versus invasive) on the BACH breast cancer dataset. |
Tasks | |
Published | 2019-08-14 |
URL | https://arxiv.org/abs/1908.05020v1 |
https://arxiv.org/pdf/1908.05020v1.pdf | |
PWC | https://paperswithcode.com/paper/histographs-graphs-in-histopathology |
Repo | |
Framework | |
Weakly Supervised Object Localization with Inter-Intra Regulated CAMs
Title | Weakly Supervised Object Localization with Inter-Intra Regulated CAMs |
Authors | Guofeng Cui, Ziyi Kou, Shaojie Wang, Wentian Zhao, Chenliang Xu |
Abstract | Weakly supervised object localization (WSOL) aims to locate objects in images by learning only from image-level labels. Current methods obtain localization results by relying on Class Activation Maps (CAMs). Usually, they propose additional CAMs or feature maps generated from internal layers of deep networks to encourage different CAMs to be either adversarial or cooperative with each other. In this work, instead of following either of these two main approaches, we analyze their internal relationship and propose a novel intra-sample strategy which regulates two CAMs of the same sample, generated from different classifiers, to dynamically adapt each of their pixels involved in the adversarial or cooperative process based on their own values. We mathematically demonstrate that our approach is a more general version of the current state-of-the-art method with fewer hyper-parameters. Besides, we further develop an inter-sample criterion module for our WSOL task, which was originally proposed for co-segmentation problems, to refine the generated CAMs of each sample. The module considers a subgroup of samples under the same category and regulates their object regions. With experiments on two widely used datasets, we show that our proposed method significantly outperforms the existing state of the art, setting a new record for weakly supervised object localization. |
Tasks | Object Localization, Weakly-Supervised Object Localization |
Published | 2019-11-17 |
URL | https://arxiv.org/abs/1911.07160v2 |
https://arxiv.org/pdf/1911.07160v2.pdf | |
PWC | https://paperswithcode.com/paper/weakly-supervised-object-localization-with-2 |
Repo | |
Framework | |
A Short Note on the Kinetics-700 Human Action Dataset
Title | A Short Note on the Kinetics-700 Human Action Dataset |
Authors | Joao Carreira, Eric Noland, Chloe Hillier, Andrew Zisserman |
Abstract | We describe an extension of the DeepMind Kinetics human action dataset from 600 classes to 700 classes, where for each class there are at least 600 video clips from different YouTube videos. This paper details the changes introduced for this new release of the dataset, and includes a comprehensive set of statistics as well as baseline results using the I3D neural network architecture. |
Tasks | |
Published | 2019-07-15 |
URL | https://arxiv.org/abs/1907.06987v1 |
https://arxiv.org/pdf/1907.06987v1.pdf | |
PWC | https://paperswithcode.com/paper/a-short-note-on-the-kinetics-700-human-action |
Repo | |
Framework | |
Benefit of Interpolation in Nearest Neighbor Algorithms
Title | Benefit of Interpolation in Nearest Neighbor Algorithms |
Authors | Yue Xing, Qifan Song, Guang Cheng |
Abstract | Over-parameterized models attract much attention in the era of data science and deep learning. It is empirically observed that although these models, e.g. deep neural networks, over-fit the training data, they can still achieve small testing error, and sometimes even outperform traditional algorithms which are designed to avoid over-fitting. The major goal of this work is to sharply quantify the benefit of data interpolation in the context of the nearest neighbors (NN) algorithm. Specifically, we consider a class of interpolated weighting schemes and then carefully characterize their asymptotic performances. Our analysis reveals a U-shaped performance curve with respect to the level of data interpolation, and proves that a mild degree of data interpolation strictly improves the prediction accuracy and statistical stability over those of the (un-interpolated) optimal $k$NN algorithm. This theoretically justifies (predicts) the existence of the second U-shaped curve in the recently discovered double descent phenomenon. Note that our goal in this study is not to promote the use of the interpolated-NN method, but to obtain theoretical insights on data interpolation inspired by the aforementioned phenomenon. |
Tasks | |
Published | 2019-09-25 |
URL | https://arxiv.org/abs/1909.11720v1 |
https://arxiv.org/pdf/1909.11720v1.pdf | |
PWC | https://paperswithcode.com/paper/benefit-of-interpolation-in-nearest-neighbor |
Repo | |
Framework | |
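To make "interpolated weighting scheme" concrete, here is a minimal sketch of a nearest-neighbor predictor whose weights grow without bound as a neighbor gets closer, so the fit passes through the training labels. The abstract does not state the scheme; the inverse-distance form `distance ** (-gamma)` used here is an assumption borrowed from common practice in the interpolation literature, and the function name and the 1-D setting are illustrative only.

```python
def interpolated_knn_predict(x, train_x, train_y, k=3, gamma=1.0):
    """Predict at x from 1-D data with weights proportional to distance**(-gamma).

    gamma = 0 gives ordinary (un-interpolated) kNN averaging; gamma > 0
    puts unbounded weight on a coinciding training point, so the
    predictor interpolates the training labels.
    """
    # Keep the k training points closest to x.
    neighbours = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - x))[:k]

    # With gamma > 0 an exact match has infinite weight: return its label
    # (the mean of the labels if several training points coincide with x).
    exact = [y for xi, y in neighbours if xi == x]
    if gamma > 0 and exact:
        return sum(exact) / len(exact)

    weights = [abs(xi - x) ** (-gamma) for xi, _ in neighbours]
    total = sum(weights)
    return sum(w * y for w, (_, y) in zip(weights, neighbours)) / total
```

Here `gamma` plays the role of the "level of data interpolation" that the abstract's U-shaped curve is plotted against; the mapping between the two is an assumption of this sketch.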
Document-level Neural Machine Translation with Inter-Sentence Attention
Title | Document-level Neural Machine Translation with Inter-Sentence Attention |
Authors | Shu Jiang, Rui Wang, Zuchao Li, Masao Utiyama, Kehai Chen, Eiichiro Sumita, Hai Zhao, Bao-liang Lu |
Abstract | Standard neural machine translation (NMT) assumes that sentences are independent of document-level context. Most existing document-level NMT methods focus only on briefly introducing document-level information and do not address selecting the most relevant part of the document context. The capacity of a memory network to detect the part of the memory most relevant to the current sentence provides a natural solution to the requirement of modeling document-level context in document-level NMT. In this work, we propose a Transformer NMT system with an associated memory network (AMN) to both capture the document-level context and select the most salient part related to the concerned translation from the memory. Experiments on several tasks show that the proposed method significantly improves the NMT performance over strong Transformer baselines and other related studies. |
Tasks | Machine Translation |
Published | 2019-10-31 |
URL | https://arxiv.org/abs/1910.14528v1 |
https://arxiv.org/pdf/1910.14528v1.pdf | |
PWC | https://paperswithcode.com/paper/document-level-neural-machine-translation-2 |
Repo | |
Framework | |
Pseudo-task Regularization for ConvNet Transfer Learning
Title | Pseudo-task Regularization for ConvNet Transfer Learning |
Authors | Yang Zhong, Atsuto Maki |
Abstract | This paper is about regularizing deep convolutional networks (ConvNets) based on an adaptive multi-objective framework for transfer learning with limited training data in the target domain. Recent advances in ConvNet regularization in this context are commonly due to the use of additional regularization objectives. They guide the training away from the target task using some concrete tasks. Unlike those related approaches, we report that an objective without a concrete goal can serve surprisingly well as a regularizer. In particular, we demonstrate Pseudo-task Regularization (PtR), which dynamically regularizes a network by simply attempting to regress image representations to a pseudo-target during fine-tuning. Through numerous experiments, the improvements in classification accuracy by PtR are shown to be greater than or on par with those of recent state-of-the-art methods. These results also indicate room for rethinking the requirements for a regularizer, i.e., whether a specifically designed task for regularization is indeed a key ingredient. The contributions of this paper are: a) PtR provides an effective and efficient alternative for regularization without dependence on concrete tasks or extra data; b) the desired strength of the regularization effect in PtR is dynamically adjusted and maintained based on the gradient norms of the target objective and the pseudo-task. |
Tasks | Transfer Learning |
Published | 2019-08-16 |
URL | https://arxiv.org/abs/1908.05997v1 |
https://arxiv.org/pdf/1908.05997v1.pdf | |
PWC | https://paperswithcode.com/paper/pseudo-task-regularization-for-convnet |
Repo | |
Framework | |
Combinational Class Activation Maps for Weakly Supervised Object Localization
Title | Combinational Class Activation Maps for Weakly Supervised Object Localization |
Authors | Seunghan Yang, Yoonhyung Kim, Youngeun Kim, Changick Kim |
Abstract | Weakly supervised object localization has recently attracted attention since it aims to identify both class labels and locations of objects by using image-level labels. Most previous methods utilize the activation map corresponding to the highest activation source. Exploiting only one activation map of the highest probability class is often biased toward limited regions or sometimes even highlights background regions. To resolve these limitations, we propose to use activation maps, named combinational class activation maps (CCAM), which are linear combinations of activation maps from the highest to the lowest probability class. By using CCAM for localization, we suppress background regions to help highlight foreground objects more accurately. In addition, we design the network architecture to consider spatial relationships for localizing relevant object regions. Specifically, we integrate non-local modules into an existing base network at both low- and high-level layers. Our final model, named non-local combinational class activation maps (NL-CCAM), obtains superior performance compared to previous methods on representative object localization benchmarks including ILSVRC 2016 and CUB-200-2011. Furthermore, we show that the proposed method has a great capability of generalization by visualizing other datasets. |
Tasks | Object Localization, Weakly-Supervised Object Localization |
Published | 2019-10-12 |
URL | https://arxiv.org/abs/1910.05518v2 |
https://arxiv.org/pdf/1910.05518v2.pdf | |
PWC | https://paperswithcode.com/paper/combinational-class-activation-maps-for |
Repo | |
Framework | |
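The "linear combination of activation maps from the highest to the lowest probability class" described above can be illustrated with a small sketch. It is not the authors' code: maps are plain nested lists, and `rank_weights` is a hypothetical, hand-set weight per probability rank (the abstract does not say how the combination weights are obtained). A negative weight on low-ranked classes stands in for the background suppression the abstract mentions.

```python
def combinational_cam(activation_maps, class_probs, rank_weights):
    """Linearly combine per-class activation maps ordered by class probability.

    activation_maps[c] is a 2-D list (height x width) for class c;
    rank_weights[r] is the weight applied to the class ranked r-th by
    probability (rank 0 = most probable). The weights are hypothetical.
    """
    # Class indices sorted from highest to lowest predicted probability.
    order = sorted(range(len(class_probs)),
                   key=lambda c: class_probs[c], reverse=True)
    height = len(activation_maps[0])
    width = len(activation_maps[0][0])
    combined = [[0.0] * width for _ in range(height)]
    for rank, cls in enumerate(order):
        weight = rank_weights[rank]
        for i in range(height):
            for j in range(width):
                combined[i][j] += weight * activation_maps[cls][i][j]
    return combined
```

Thresholding the combined map and taking a bounding box around the surviving region would then give a localization, as in standard CAM pipelines.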
n-MeRCI: A new Metric to Evaluate the Correlation Between Predictive Uncertainty and True Error
Title | n-MeRCI: A new Metric to Evaluate the Correlation Between Predictive Uncertainty and True Error |
Authors | Michel Moukari, Loïc Simon, Sylvaine Picard, Frédéric Jurie |
Abstract | As deep learning applications are becoming more and more pervasive in robotics, the question of evaluating the reliability of inferences becomes a central question in the robotics community. This domain, known as predictive uncertainty, has come under the scrutiny of research groups developing Bayesian approaches adapted to deep learning such as Monte Carlo Dropout. Unfortunately, for the time being, the real goal of predictive uncertainty has been swept under the rug. Indeed, these approaches are solely evaluated in terms of raw performance of the network prediction, while the quality of their estimated uncertainty is not assessed. Evaluating such uncertainty prediction quality is especially important in robotics, as actions shall depend on the confidence in perceived information. In this context, the main contribution of this article is to propose a novel metric that is adapted to the evaluation of relative uncertainty assessment and directly applicable to regression with deep neural networks. To experimentally validate this metric, we evaluate it on a toy dataset and then apply it to the task of monocular depth estimation. |
Tasks | Depth Estimation, Monocular Depth Estimation |
Published | 2019-08-20 |
URL | https://arxiv.org/abs/1908.07253v1 |
https://arxiv.org/pdf/1908.07253v1.pdf | |
PWC | https://paperswithcode.com/paper/n-merci-a-new-metric-to-evaluate-the |
Repo | |
Framework | |
Semi-Supervised Adversarial Monocular Depth Estimation
Title | Semi-Supervised Adversarial Monocular Depth Estimation |
Authors | Rongrong Ji, Ke Li, Yan Wang, Xiaoshuai Sun, Feng Guo, Xiaowei Guo, Yongjian Wu, Feiyue Huang, Jiebo Luo |
Abstract | In this paper, we address the problem of monocular depth estimation when only a limited number of training image-depth pairs are available. To achieve a high regression accuracy, the state-of-the-art estimation methods rely on CNNs trained with a large number of image-depth pairs, which are prohibitively costly or even infeasible to acquire. Aiming to break the curse of such expensive data collections, we propose a semi-supervised adversarial learning framework that only utilizes a small number of image-depth pairs in conjunction with a large number of easily available monocular images to achieve high performance. In particular, we use one generator to regress the depth and two discriminators to evaluate the predicted depth, i.e., one inspects the image-depth pair while the other inspects the depth channel alone. These two discriminators provide their feedback to the generator as the loss to generate more realistic and accurate depth predictions. Experiments show that the proposed approach can (1) improve most state-of-the-art models on the NYUD v2 dataset by effectively leveraging additional unlabeled data sources; (2) reach state-of-the-art accuracy when the training set is small, e.g., on the Make3D dataset; (3) adapt well to an unseen new dataset (Make3D in our case) after training on an annotated dataset (KITTI in our case). |
Tasks | Depth Estimation, Monocular Depth Estimation |
Published | 2019-08-06 |
URL | https://arxiv.org/abs/1908.02126v1 |
https://arxiv.org/pdf/1908.02126v1.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-adversarial-monocular-depth |
Repo | |
Framework | |
Adaptively Denoising Proposal Collection for Weakly Supervised Object Localization
Title | Adaptively Denoising Proposal Collection for Weakly Supervised Object Localization |
Authors | Wenju Xu, Yuanwei Wu, Wenchi Ma, Guanghui Wang |
Abstract | In this paper, we address the problem of weakly supervised object localization (WSL), which trains a detection network on the dataset with only image-level annotations. The proposed approach is built on the observation that the proposal set from the training dataset is a collection of background, object parts, and objects. Several strategies are taken to adaptively eliminate the noisy proposals and generate pseudo object-level annotations for the weakly labeled dataset. A multiple instance learning (MIL) algorithm enhanced by mask-out strategy is adopted to collect the class-specific object proposals, which are then utilized to adapt a pre-trained classification network to a detection network. In addition, the detection results from the detection network are re-weighted by jointly considering the detection scores and the overlap ratio of proposals in a proposal subset optimization framework. The optimal proposals work as object-level labels that enable a pseudo-strongly supervised dataset for training the detection network. Consequently, we establish a fully adaptive detection network. Extensive evaluations on the PASCAL VOC 2007 and 2012 datasets demonstrate a significant improvement compared with the state-of-the-art methods. |
Tasks | Denoising, Multiple Instance Learning, Object Localization, Weakly-Supervised Object Localization |
Published | 2019-10-04 |
URL | https://arxiv.org/abs/1910.02101v2 |
https://arxiv.org/pdf/1910.02101v2.pdf | |
PWC | https://paperswithcode.com/paper/adaptively-denoising-proposal-collection-for |
Repo | |
Framework | |
A Rate-Distortion Framework for Explaining Neural Network Decisions
Title | A Rate-Distortion Framework for Explaining Neural Network Decisions |
Authors | Jan Macdonald, Stephan Wäldchen, Sascha Hauch, Gitta Kutyniok |
Abstract | We formalise the widespread idea of interpreting neural network decisions as an explicit optimisation problem in a rate-distortion framework. A set of input features is deemed relevant for a classification decision if the expected classifier score remains nearly constant when randomising the remaining features. We discuss the computational complexity of finding small sets of relevant features and show that the problem is complete for $\mathsf{NP}^\mathsf{PP}$, an important class of computational problems frequently arising in AI tasks. Furthermore, we show that it even remains $\mathsf{NP}$-hard to only approximate the optimal solution to within any non-trivial approximation factor. Finally, we consider a continuous problem relaxation and develop a heuristic solution strategy based on assumed density filtering for deep ReLU neural networks. We present numerical experiments for two image classification data sets where we outperform established methods in particular for sparse explanations of neural network decisions. |
Tasks | Image Classification |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.11092v1 |
https://arxiv.org/pdf/1905.11092v1.pdf | |
PWC | https://paperswithcode.com/paper/a-rate-distortion-framework-for-explaining |
Repo | |
Framework | |
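The relevance criterion in the abstract above (a feature set is relevant if the classifier score barely changes when the remaining features are randomised) can be estimated by sampling. The sketch below is illustrative only: the squared deviation as the distortion measure, the function name, and the `sample_noise` callback are assumptions, not details taken from the abstract.

```python
import random


def expected_distortion(classifier, x, relevant, sample_noise,
                        n_samples=1000, seed=0):
    """Monte Carlo estimate of the score change when non-relevant features are randomised.

    Features whose index is in `relevant` keep their value from x; all
    others are redrawn with sample_noise(rng). Returns the mean squared
    deviation of the classifier score from its value on x.
    """
    rng = random.Random(seed)
    base_score = classifier(x)
    total = 0.0
    for _ in range(n_samples):
        perturbed = [xi if i in relevant else sample_noise(rng)
                     for i, xi in enumerate(x)]
        total += (classifier(perturbed) - base_score) ** 2
    return total / n_samples
```

A small set `relevant` with near-zero distortion is a sparse explanation in this sense; the abstract's hardness results concern finding the smallest such set, which this estimator does not attempt.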
Unique Sharp Local Minimum in $\ell_1$-minimization Complete Dictionary Learning
Title | Unique Sharp Local Minimum in $\ell_1$-minimization Complete Dictionary Learning |
Authors | Yu Wang, Siqi Wu, Bin Yu |
Abstract | We study the problem of globally recovering a dictionary from a set of signals via $\ell_1$-minimization. We assume that the signals are generated as i.i.d. random linear combinations of the $K$ atoms from a complete reference dictionary $D^*\in \mathbb R^{K\times K}$, where the linear combination coefficients are from either a Bernoulli type model or exact sparse model. First, we obtain a necessary and sufficient norm condition for the reference dictionary $D^*$ to be a sharp local minimum of the expected $\ell_1$ objective function. Our result substantially extends that of Wu and Yu (2015) and allows the combination coefficient to be non-negative. Secondly, we obtain an explicit bound on the region within which the objective value of the reference dictionary is minimal. Thirdly, we show that the reference dictionary is the unique sharp local minimum, thus establishing the first known global property of $\ell_1$-minimization dictionary learning. Motivated by the theoretical results, we introduce a perturbation-based test to determine whether a dictionary is a sharp local minimum of the objective function. In addition, we also propose a new dictionary learning algorithm based on Block Coordinate Descent, called DL-BCD, which is guaranteed to have monotonic convergence. Simulation studies show that DL-BCD has competitive performance in terms of recovery rate compared to many state-of-the-art dictionary learning algorithms. |
Tasks | Dictionary Learning |
Published | 2019-02-22 |
URL | http://arxiv.org/abs/1902.08380v1 |
http://arxiv.org/pdf/1902.08380v1.pdf | |
PWC | https://paperswithcode.com/paper/unique-sharp-local-minimum-in-ell_1 |
Repo | |
Framework | |
Learning to Track Any Object
Title | Learning to Track Any Object |
Authors | Achal Dave, Pavel Tokmakov, Cordelia Schmid, Deva Ramanan |
Abstract | Object tracking can be formulated as “finding the right object in a video”. We observe that recent approaches for class-agnostic tracking tend to focus on the “finding” part, but largely overlook the “object” part of the task, essentially doing a template matching over a frame in a sliding-window. In contrast, class-specific trackers heavily rely on object priors in the form of category-specific object detectors. In this work, we re-purpose category-specific appearance models into a generic objectness prior. Our approach converts a category-specific object detector into a category-agnostic, object-specific detector (i.e. a tracker) efficiently, on the fly. Moreover, at test time the same network can be applied to detection and tracking, resulting in a unified approach for the two tasks. We achieve state-of-the-art results on two recent large-scale tracking benchmarks (OxUvA and GOT, using external data). By simply adding a mask prediction branch, our approach is able to produce instance segmentation masks for the tracked object. Despite only using box-level information on the first frame, our method outputs high-quality masks, as evaluated on the DAVIS ‘17 video object segmentation benchmark. |
Tasks | Instance Segmentation, Object Tracking, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation |
Published | 2019-10-25 |
URL | https://arxiv.org/abs/1910.11844v1 |
https://arxiv.org/pdf/1910.11844v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-track-any-object |
Repo | |
Framework | |
Deep Learning of Subsurface Flow via Theory-guided Neural Network
Title | Deep Learning of Subsurface Flow via Theory-guided Neural Network |
Authors | Nanzhe Wang, Dongxiao Zhang, Haibin Chang, Heng Li |
Abstract | Active research is currently being conducted to incorporate the wealth of scientific knowledge into data-driven approaches (e.g., neural networks) in order to improve the latter’s effectiveness. In this study, the Theory-guided Neural Network (TgNN) is proposed for deep learning of subsurface flow. In the TgNN, as in supervised learning, the neural network is trained with available observations or simulation data while being simultaneously guided by theory (e.g., governing equations, other physical constraints, engineering controls, and expert knowledge) of the underlying problem. The TgNN can achieve higher accuracy than the ordinary Artificial Neural Network (ANN) because the former provides physically feasible predictions and can be more readily generalized beyond the regimes covered by the training data. Furthermore, the TgNN model is proposed for subsurface flow with heterogeneous model parameters. Several numerical cases of two-dimensional transient saturated flow are introduced to test the performance of the TgNN. In the learning process, the loss function contains data mismatch, as well as PDE constraint, engineering control, and expert knowledge. After obtaining the parameters of the neural network by minimizing the loss function, a TgNN model is built that not only fits the data, but also adheres to physical/engineering constraints. Predicting the future response can be easily realized by the TgNN model. In addition, the TgNN model is tested in more complicated scenarios, such as prediction with changed boundary conditions, learning from noisy data or outliers, transfer learning, and engineering controls. Numerical results demonstrate that the TgNN model achieves much better predictability, reliability, and generalizability than ANN models due to the physical/engineering constraints in the former. |
Tasks | Transfer Learning |
Published | 2019-10-24 |
URL | https://arxiv.org/abs/1911.00103v1 |
https://arxiv.org/pdf/1911.00103v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-of-subsurface-flow-via-theory |
Repo | |
Framework | |
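The loss described in the abstract above (data mismatch plus PDE constraint, engineering control, and expert knowledge) can be written schematically as below. This is a sketch under stated assumptions: the weights $\lambda$, the notation $\hat{h}(\cdot;\theta)$ for the network's predicted hydraulic head, and the specific residual $f$ (the standard governing equation for two-dimensional transient saturated flow, with specific storage $S_s$ and hydraulic conductivity $K$) are not given in the abstract.

```latex
\mathcal{L}(\theta)
  = \mathrm{MSE}_{\mathrm{data}}
  + \lambda_{\mathrm{PDE}}\,\mathrm{MSE}_{\mathrm{PDE}}
  + \lambda_{\mathrm{EC}}\,\mathrm{MSE}_{\mathrm{EC}}
  + \lambda_{\mathrm{EK}}\,\mathrm{MSE}_{\mathrm{EK}},
\qquad
\mathrm{MSE}_{\mathrm{data}}
  = \frac{1}{N}\sum_{i=1}^{N}\bigl(\hat{h}(t_i, x_i, y_i;\theta) - h_i\bigr)^2,

\mathrm{MSE}_{\mathrm{PDE}}
  = \frac{1}{N_c}\sum_{j=1}^{N_c} f(t_j, x_j, y_j;\theta)^2,
\qquad
f = S_s\,\frac{\partial \hat{h}}{\partial t}
  - \frac{\partial}{\partial x}\!\Bigl(K\,\frac{\partial \hat{h}}{\partial x}\Bigr)
  - \frac{\partial}{\partial y}\!\Bigl(K\,\frac{\partial \hat{h}}{\partial y}\Bigr).
```

Here the $N$ data points carry observed heads $h_i$, while the $N_c$ collocation points need no observations, which is how the theory term can constrain the network outside the regime covered by the training data. The engineering-control and expert-knowledge terms would penalise violations of their respective constraints in the same manner.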