February 2, 2020

Paper Group AWR 1

Neural Keyphrase Generation via Reinforcement Learning with Adaptive Rewards. Robustness Assessment for Adversarial Machine Learning: Problems, Solutions and a Survey of Current Neural Networks and Defenses. Generated Loss, Augmented Training, and Multiscale VAE. Stochastic Mirror Descent on Overparameterized Nonlinear Models: Convergence, Implicit …

Neural Keyphrase Generation via Reinforcement Learning with Adaptive Rewards


Title	Neural Keyphrase Generation via Reinforcement Learning with Adaptive Rewards
Authors	Hou Pong Chan, Wang Chen, Lu Wang, Irwin King
Abstract	Generating keyphrases that summarize the main points of a document is a fundamental task in natural language processing. Although existing generative models are capable of predicting multiple keyphrases for an input document as well as determining the number of keyphrases to generate, they still suffer from the problem of generating too few keyphrases. To address this problem, we propose a reinforcement learning (RL) approach for keyphrase generation, with an adaptive reward function that encourages a model to generate both sufficient and accurate keyphrases. Furthermore, we introduce a new evaluation method that incorporates name variations of the ground-truth keyphrases using the Wikipedia knowledge base. Thus, our evaluation method can more robustly evaluate the quality of predicted keyphrases. Extensive experiments on five real-world datasets of different scales demonstrate that our RL approach consistently and significantly improves the performance of the state-of-the-art generative models with both conventional and new evaluation methods.
Tasks
Published	2019-06-10
URL	https://arxiv.org/abs/1906.04106v1
PDF	https://arxiv.org/pdf/1906.04106v1.pdf
PWC	https://paperswithcode.com/paper/neural-keyphrase-generation-via-reinforcement
Repo	https://github.com/kenchan0226/keyphrase-generation-rl
Framework	pytorch
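
The abstract above does not spell out the adaptive reward, so the following is only a minimal sketch of one plausible shape for it. It assumes the reward is recall while the model has produced fewer keyphrases than the ground truth (encouraging it to generate enough) and F1 once it has produced enough (encouraging accuracy). The function name `adaptive_reward` and the exact switching rule are illustrative assumptions, not taken from the repository.

```python
def adaptive_reward(predicted, ground_truth):
    """Hypothetical adaptive reward: recall while too few phrases
    are predicted, F1 once enough phrases have been generated."""
    pred = set(predicted)
    gold = set(ground_truth)
    if not pred or not gold:
        return 0.0
    correct = len(pred & gold)
    recall = correct / len(gold)
    # Too few predictions: reward recall only, so extra phrases are not penalized.
    if len(pred) < len(gold):
        return recall
    precision = correct / len(pred)
    if precision + recall == 0:
        return 0.0
    # Enough predictions: reward F1, so wrong phrases now cost precision.
    return 2 * precision * recall / (precision + recall)
```

Under this assumed rule, a model that outputs a single correct phrase out of two is scored on recall alone, while a model that over-generates is pulled back by the precision term.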

Robustness Assessment for Adversarial Machine Learning: Problems, Solutions and a Survey of Current Neural Networks and Defenses


Title	Robustness Assessment for Adversarial Machine Learning: Problems, Solutions and a Survey of Current Neural Networks and Defenses
Authors	Danilo Vasconcellos Vargas, Shashank Kotyan
Abstract	In adversarial machine learning, there are a huge number of attacks of various types, which makes evaluating the robustness of new models and defences a daunting task. To make matters worse, there is an inherent bias in attacks and defences. Here, we organize the problems faced (model dependence, insufficient evaluation, false adversarial samples and perturbation-dependent results) and propose a model-agnostic dual ($L_0$ and $L_\infty$) quality assessment method together with the concept of robustness levels to tackle them. We validate the dual quality assessment on state-of-the-art models (WideResNet, ResNet, AllConv, DenseNet, NIN, LeNet and CapsNet) as well as the current hardest defences proposed at ICLR 2018 and the widely known adversarial training, showing that current models and defences are vulnerable at all levels of robustness. The robustness assessment shows that, depending on the metric used (i.e., $L_0$ or $L_\infty$), the robustness may change significantly, and therefore duality should be taken into account for a correct assessment. Moreover, a mathematical derivation, as well as a counterexample, suggests that $L_1$ and $L_2$ metrics alone are not enough to avoid false adversarial samples. Interestingly, a by-product of the proposed assessment is a novel $L_\infty$ black-box method which requires even less perturbation than the One-Pixel Attack (only 12% of One-Pixel Attack’s amount of perturbation) to achieve similar results. Thus, this paper elucidates the problems of robustness evaluation, proposes a dual quality assessment to tackle them, and surveys the robustness of current models and defences. Code available at http://bit.ly/DualQualityAssessment.
Tasks
Published	2019-06-14
URL	https://arxiv.org/abs/1906.06026v2
PDF	https://arxiv.org/pdf/1906.06026v2.pdf
PWC	https://paperswithcode.com/paper/model-agnostic-dual-quality-assessment-for
Repo	https://github.com/shashankkotyan/DualQualityAssessment
Framework	tf
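
To make the two metrics in the abstract above concrete, here is a small sketch of how the $L_0$ and $L_\infty$ sizes of a perturbation could be computed on flattened pixel values. The helper names `l0_norm` and `linf_norm` are illustrative and are not taken from the paper's code.

```python
def l0_norm(original, adversarial):
    """Number of positions (e.g. pixels) that were changed at all."""
    return sum(1 for o, a in zip(original, adversarial) if o != a)


def linf_norm(original, adversarial):
    """Largest absolute change applied to any single position."""
    return max((abs(o - a) for o, a in zip(original, adversarial)), default=0)


image = [10, 20, 30, 40]
one_pixel = [10, 20, 230, 40]    # one large change: small L0, large L_inf
small_noise = [12, 18, 32, 38]   # many tiny changes: large L0, small L_inf
```

The two perturbations rank in opposite order under the two metrics, which is one way to see why an assessment based on a single norm can be misleading.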

Generated Loss, Augmented Training, and Multiscale VAE


Title	Generated Loss, Augmented Training, and Multiscale VAE
Authors	Jason Chou, Gautam Hathi
Abstract	The variational autoencoder (VAE) framework remains a popular option for training unsupervised generative models, especially for discrete data where generative adversarial networks (GANs) require workarounds to create gradients for the generator. In our work modeling US postal addresses, we show that our discrete VAE with tree recursive architecture has limited capability to capture field correlations within structured data, even after overcoming the challenge of posterior collapse with scheduled sampling and tuning of the KL-divergence weight $\beta$. Worse, the VAE seems to have difficulty mapping its generated samples to the latent space, as their VAE loss lags behind or even increases during the training process. Motivated by this observation, we show that augmenting training data with generated variants (augmented training) and training a VAE with multiple values of $\beta$ simultaneously (multiscale VAE) both improve the generation quality of the VAE. Despite their differences in motivation and emphasis, we show that augmented training and multiscale VAE are actually connected and have similar effects on the model.
Tasks
Published	2019-04-23
URL	http://arxiv.org/abs/1904.10446v1
PDF	http://arxiv.org/pdf/1904.10446v1.pdf
PWC	https://paperswithcode.com/paper/generated-loss-augmented-training-and
Repo	https://github.com/EIFY/vermont_address
Framework	none

Stochastic Mirror Descent on Overparameterized Nonlinear Models: Convergence, Implicit Regularization, and Generalization


Title	Stochastic Mirror Descent on Overparameterized Nonlinear Models: Convergence, Implicit Regularization, and Generalization
Authors	Navid Azizan, Sahin Lale, Babak Hassibi
Abstract	Most modern learning problems are highly overparameterized, meaning that there are many more parameters than the number of training data points, and as a result, the training loss may have infinitely many global minima (parameter vectors that perfectly interpolate the training data). Therefore, it is important to understand which interpolating solutions we converge to, how they depend on the initialization point and the learning algorithm, and whether they lead to different generalization performances. In this paper, we study these questions for the family of stochastic mirror descent (SMD) algorithms, of which the popular stochastic gradient descent (SGD) is a special case. Our contributions are both theoretical and experimental. On the theory side, we show that in the overparameterized nonlinear setting, if the initialization is close enough to the manifold of global minima (something that comes for free in the highly overparameterized case), SMD with sufficiently small step size converges to a global minimum that is approximately the closest one in Bregman divergence. On the experimental side, our extensive experiments on standard datasets and models, using various initializations, various mirror descents, and various Bregman divergences, consistently confirm that this phenomenon happens in deep learning. Our experiments further indicate that there is a clear difference in the generalization performance of the solutions obtained by different SMD algorithms. Experimenting on a standard image dataset and network architecture with SMD with different kinds of implicit regularization, $\ell_1$ to encourage sparsity, $\ell_2$ yielding SGD, and $\ell_{10}$ to discourage large components in the parameter vector, consistently and definitively shows that $\ell_{10}$-SMD has better generalization performance than SGD, which in turn has better generalization performance than $\ell_1$-SMD.
Tasks
Published	2019-06-10
URL	https://arxiv.org/abs/1906.03830v1
PDF	https://arxiv.org/pdf/1906.03830v1.pdf
PWC	https://paperswithcode.com/paper/stochastic-mirror-descent-on
Repo	https://github.com/SahinLale/StochasticMirrorDescent
Framework	pytorch
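
As a rough illustration of the SMD family mentioned in the abstract above, the sketch below performs one mirror descent step with the potential $\psi(w) = \frac{1}{q}\sum_i |w_i|^q$. The update is applied in the "mirror" domain, $\nabla\psi(w_{t+1}) = \nabla\psi(w_t) - \eta\, g_t$, and then mapped back. The function name `smd_step`, the scaling of the potential, and the restriction to $q > 1$ (so the mirror map is invertible) are assumptions of this sketch rather than details from the paper.

```python
import math


def smd_step(w, grad, lr, q):
    """One stochastic mirror descent step with potential (1/q) * sum |w_i|^q.

    w and grad are lists of floats; grad is the stochastic gradient at w.
    """
    if q <= 1:
        raise ValueError("this sketch needs q > 1 for an invertible mirror map")
    new_w = []
    for w_i, g_i in zip(w, grad):
        # Mirror map: gradient of the potential, sign(w) * |w|^(q-1).
        z = math.copysign(abs(w_i) ** (q - 1), w_i) - lr * g_i
        # Inverse mirror map: sign(z) * |z|^(1/(q-1)).
        new_w.append(math.copysign(abs(z) ** (1.0 / (q - 1)), z))
    return new_w
```

With `q = 2` both maps are the identity and the step is the usual SGD update `w - lr * grad`, consistent with the abstract's remark that $\ell_2$ yields SGD.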

Quantifying and Alleviating the Language Prior Problem in Visual Question Answering


Title	Quantifying and Alleviating the Language Prior Problem in Visual Question Answering
Authors	Yangyang Guo, Zhiyong Cheng, Liqiang Nie, Yibing Liu, Yinglong Wang, Mohan Kankanhalli
Abstract	Benefiting from advances in computer vision, natural language processing and information retrieval techniques, visual question answering (VQA), which aims to answer questions about an image or a video, has received much attention over the past few years. Although some progress has been achieved so far, several studies have pointed out that current VQA models are heavily affected by the language prior problem, which means they tend to answer questions based on the co-occurrence patterns of question keywords (e.g., how many) and answers (e.g., 2) instead of understanding images and questions. Existing methods attempt to solve this problem by either balancing the biased datasets or forcing models to better understand images. However, only marginal effects and even performance deterioration are observed for the first and second solution, respectively. In addition, another important issue is the lack of a metric to quantitatively measure the extent of the language prior effect, which severely hinders the advancement of related techniques. In this paper, we make contributions to solve the above problems from two perspectives. Firstly, we design a metric to quantitatively measure the language prior effect of VQA models. The proposed metric has been demonstrated to be effective in our empirical studies. Secondly, we propose a regularization method (i.e., score regularization module) to enhance current VQA models by alleviating the language prior problem as well as boosting the backbone model performance. The proposed score regularization module adopts a pair-wise learning strategy, which makes the VQA models answer the question based on reasoning over the image (given this question) instead of relying on question-answer patterns observed in the biased training set. The score regularization module is flexible enough to be integrated into various VQA models.
Tasks	Information Retrieval, Question Answering, Visual Question Answering
Published	2019-05-13
URL	https://arxiv.org/abs/1905.04877v1
PDF	https://arxiv.org/pdf/1905.04877v1.pdf
PWC	https://paperswithcode.com/paper/quantifying-and-alleviating-the-language
Repo	https://github.com/guoyang9/vqa-prior
Framework	pytorch

Recovery Guarantees for Compressible Signals with Adversarial Noise


Title	Recovery Guarantees for Compressible Signals with Adversarial Noise
Authors	Jasjeet Dhaliwal, Kyle Hambrook
Abstract	We provide recovery guarantees for compressible signals that have been corrupted with noise and extend the framework introduced in Bafna et al. (2018) to defend neural networks against $\ell_0$-norm, $\ell_2$-norm, and $\ell_{\infty}$-norm attacks. Our results are general as they can be applied to most unitary transforms used in practice and hold for $\ell_0$-norm, $\ell_2$-norm, and $\ell_\infty$-norm bounded noise. In the case of $\ell_0$-norm noise, we prove recovery guarantees for Iterative Hard Thresholding (IHT) and Basis Pursuit (BP). For $\ell_2$-norm bounded noise, we provide recovery guarantees for BP and for the case of $\ell_\infty$-norm bounded noise, we provide recovery guarantees for Dantzig Selector (DS). These guarantees theoretically bolster the defense framework introduced in Bafna et al. (2018) for defending neural networks against adversarial inputs. Finally, we experimentally demonstrate the effectiveness of this defense framework against an array of $\ell_0$, $\ell_2$ and $\ell_\infty$ norm attacks.
Tasks
Published	2019-07-15
URL	https://arxiv.org/abs/1907.06565v3
PDF	https://arxiv.org/pdf/1907.06565v3.pdf
PWC	https://paperswithcode.com/paper/recovery-guarantees-for-compressible-signals
Repo	https://github.com/jasjeetIM/recovering_compressible_signals
Framework	tf
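
For readers unfamiliar with Iterative Hard Thresholding, which the abstract above names as one of the analyzed recovery algorithms, the following is a generic textbook-style sketch and not the authors' implementation. It iterates $x \leftarrow H_k\big(x + \mu A^\top (y - A x)\big)$, where $H_k$ keeps the $k$ largest-magnitude entries. The names `hard_threshold` and `iht`, the plain-list matrix representation, and the default step size are assumptions of this sketch.

```python
def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and zero out the rest."""
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]


def iht(A, y, k, steps=50, step_size=1.0):
    """Estimate a k-sparse x such that A x is close to y.

    A is a list of rows; y is a list of measurements.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(steps):
        # Residual y - A x of the current estimate.
        residual = [y[r] - sum(A[r][c] * x[c] for c in range(n)) for r in range(m)]
        # Gradient direction A^T (y - A x).
        direction = [sum(A[r][c] * residual[r] for r in range(m)) for c in range(n)]
        x = hard_threshold([x[c] + step_size * direction[c] for c in range(n)], k)
    return x


# With an identity (trivially unitary) transform, IHT keeps the k dominant
# coefficients and discards the small ones as noise.
identity = [[1 if r == c else 0 for c in range(4)] for r in range(4)]
recovered = iht(identity, [3.0, 0.1, -2.0, 0.05], k=2)
```

The step size of 1.0 is only safe when the transform does not amplify signals (as with unitary transforms); other matrices may need a smaller value.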

Repurposing Entailment for Multi-Hop Question Answering Tasks


Title	Repurposing Entailment for Multi-Hop Question Answering Tasks
Authors	Harsh Trivedi, Heeyoung Kwon, Tushar Khot, Ashish Sabharwal, Niranjan Balasubramanian
Abstract	Question Answering (QA) naturally reduces to an entailment problem, namely, verifying whether some text entails the answer to a question. However, for multi-hop QA tasks, which require reasoning with multiple sentences, it remains unclear how best to utilize entailment models pre-trained on large-scale datasets such as SNLI, which are based on sentence pairs. We introduce Multee, a general architecture that can effectively use entailment models for multi-hop QA tasks. Multee uses (i) a local module that helps locate important sentences, thereby avoiding distracting information, and (ii) a global module that aggregates information by effectively incorporating importance weights. Importantly, we show that both modules can use entailment functions pre-trained on large-scale NLI datasets. We evaluate performance on MultiRC and OpenBookQA, two multi-hop QA datasets. When using an entailment function pre-trained on NLI datasets, Multee outperforms QA models trained only on the target QA datasets and the OpenAI transformer models. The code is available at https://github.com/StonyBrookNLP/multee.
Tasks	Question Answering
Published	2019-04-20
URL	http://arxiv.org/abs/1904.09380v1
PDF	http://arxiv.org/pdf/1904.09380v1.pdf
PWC	https://paperswithcode.com/paper/190409380
Repo	https://github.com/StonyBrookNLP/multee
Framework	pytorch

Learning Temporal Pose Estimation from Sparsely-Labeled Videos


Title	Learning Temporal Pose Estimation from Sparsely-Labeled Videos
Authors	Gedas Bertasius, Christoph Feichtenhofer, Du Tran, Jianbo Shi, Lorenzo Torresani
Abstract	Modern approaches for multi-person pose estimation in video require large amounts of dense annotations. However, labeling every frame in a video is costly and labor intensive. To reduce the need for dense annotations, we propose a PoseWarper network that leverages training videos with sparse annotations (every k frames) to learn to perform dense temporal pose propagation and estimation. Given a pair of video frames—a labeled Frame A and an unlabeled Frame B—we train our model to predict human pose in Frame A using the features from Frame B by means of deformable convolutions to implicitly learn the pose warping between A and B. We demonstrate that we can leverage our trained PoseWarper for several applications. First, at inference time we can reverse the application direction of our network in order to propagate pose information from manually annotated frames to unlabeled frames. This makes it possible to generate pose annotations for the entire video given only a few manually-labeled frames. Compared to modern label propagation methods based on optical flow, our warping mechanism is much more compact (6M vs 39M parameters), and also more accurate (88.7% mAP vs 83.8% mAP). We also show that we can improve the accuracy of a pose estimator by training it on an augmented dataset obtained by adding our propagated poses to the original manual labels. Lastly, we can use our PoseWarper to aggregate temporal pose information from neighboring frames during inference. This allows our system to achieve state-of-the-art pose detection results on the PoseTrack2017 and PoseTrack2018 datasets. Code has been made available at: https://github.com/facebookresearch/PoseWarper.
Tasks	Multi-Person Pose Estimation, Optical Flow Estimation, Pose Estimation
Published	2019-06-06
URL	https://arxiv.org/abs/1906.04016v3
PDF	https://arxiv.org/pdf/1906.04016v3.pdf
PWC	https://paperswithcode.com/paper/learning-temporal-pose-estimation-from
Repo	https://github.com/facebookresearch/PoseWarper
Framework	pytorch

False Data Injection Attacks in Internet of Things and Deep Learning enabled Predictive Analytics


Title	False Data Injection Attacks in Internet of Things and Deep Learning enabled Predictive Analytics
Authors	Gautam Raj Mode, Prasad Calyam, Khaza Anuarul Hoque
Abstract	Industry 4.0 is the latest industrial revolution, primarily merging automation with advanced manufacturing to reduce direct human effort and resources. Predictive maintenance (PdM) is an Industry 4.0 solution that facilitates predicting faults in a component or a system, powered by state-of-the-art machine learning (ML) algorithms and Internet-of-Things (IoT) sensors. However, both IoT sensors and deep learning (DL) algorithms are known for their vulnerabilities to cyber-attacks. In the context of PdM systems, such attacks can have catastrophic consequences as they are hard to detect due to the nature of the attack. To date, the majority of the published literature focuses on the accuracy of DL-enabled PdM systems and often ignores the effect of such attacks. In this paper, we demonstrate the effect of IoT sensor attacks on a PdM system. First, we use three state-of-the-art DL algorithms, specifically, Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and Convolutional Neural Network (CNN), for predicting the Remaining Useful Life (RUL) of a turbofan engine using NASA’s C-MAPSS dataset. The obtained results show that the GRU-based PdM model outperforms some of the recent literature on RUL prediction using the C-MAPSS dataset. Afterward, we model two different types of false data injection attacks (FDIA) on turbofan engine sensor data and evaluate their impact on CNN, LSTM, and GRU-based PdM systems. The obtained results demonstrate that FDI attacks on even a few IoT sensors can strongly degrade the RUL prediction. However, the GRU-based PdM model performs better in terms of accuracy and resiliency. Lastly, we perform a study on the GRU-based PdM model using four different GRU networks with different sequence lengths. Our experiments reveal an interesting relationship between the accuracy, resiliency and sequence length for the GRU-based PdM models.
Tasks
Published	2019-10-03
URL	https://arxiv.org/abs/1910.01716v4
PDF	https://arxiv.org/pdf/1910.01716v4.pdf
PWC	https://paperswithcode.com/paper/false-data-injection-attacks-in-internet-of
Repo	https://github.com/dependable-cps/FDIA-PdM
Framework	none

Latent Variable Sentiment Grammar


Title	Latent Variable Sentiment Grammar
Authors	Liwen Zhang, Kewei Tu, Yue Zhang
Abstract	Neural models have been investigated for sentiment classification over constituent trees. They learn phrase composition automatically by encoding tree structures but do not explicitly model sentiment composition, which requires encoding sentiment class labels. To this end, we investigate two formalisms with deep sentiment representations that capture sentiment subtype expressions by latent variables and Gaussian mixture vectors, respectively. Experiments on Stanford Sentiment Treebank (SST) show the effectiveness of sentiment grammar over vanilla neural encoders. Using ELMo embeddings, our method gives the best results on this benchmark.
Tasks	Sentiment Analysis
Published	2019-06-29
URL	https://arxiv.org/abs/1907.00218v2
PDF	https://arxiv.org/pdf/1907.00218v2.pdf
PWC	https://paperswithcode.com/paper/latent-variable-sentiment-grammar
Repo	https://github.com/Ehaschia/bi-tree-lstm-crf
Framework	pytorch

Faking and Discriminating the Navigation Data of a Micro Aerial Vehicle Using Quantum Generative Adversarial Networks


Title	Faking and Discriminating the Navigation Data of a Micro Aerial Vehicle Using Quantum Generative Adversarial Networks
Authors	Michel Barbeau, Joaquin Garcia-Alfaro
Abstract	We show that the Quantum Generative Adversarial Network (QGAN) paradigm can be employed by an adversary to learn to generate data that deceives the monitoring of a Cyber-Physical System (CPS) and to perpetrate a covert attack. As a test case, the ideas are elaborated considering the navigation data of a Micro Aerial Vehicle (MAV). A concrete QGAN design is proposed to generate fake MAV navigation data. Initially, the adversary is entirely ignorant about the dynamics of the CPS, which is the strength of the approach from the adversary's point of view. A design is also proposed to discriminate between genuine and fake MAV navigation data. The designs combine classical optimization, qubit quantum computing and photonic quantum computing. Using the PennyLane software simulation, they are evaluated over a classical computing platform. We assess the learning time and accuracy of the navigation data generator and discriminator versus space complexity, i.e., the amount of quantum memory needed to solve the problem.
Tasks
Published	2019-07-05
URL	https://arxiv.org/abs/1907.03038v3
PDF	https://arxiv.org/pdf/1907.03038v3.pdf
PWC	https://paperswithcode.com/paper/faking-and-discriminating-the-navigation-data
Repo	https://github.com/jgalfaro/mirrored-QGANMAV
Framework	none

Please Stop Permuting Features: An Explanation and Alternatives


Title	Please Stop Permuting Features: An Explanation and Alternatives
Authors	Giles Hooker, Lucas Mentch
Abstract	This paper advocates against permute-and-predict (PaP) methods for interpreting black box functions. Methods such as the variable importance measures proposed for random forests, partial dependence plots, and individual conditional expectation plots remain popular because of their ability to provide model-agnostic measures that depend only on the pre-trained model output. However, numerous studies have found that these tools can produce diagnostics that are highly misleading, particularly when there is strong dependence among features. Rather than simply add to this growing literature by further demonstrating such issues, here we seek to provide an explanation for the observed behavior. In particular, we argue that breaking dependencies between features in hold-out data places undue emphasis on sparse regions of the feature space by forcing the original model to extrapolate to regions where there is little to no data. We explore these effects through various settings where a ground-truth is understood and find support for previous claims in the literature that PaP metrics tend to over-emphasize correlated features both in variable importance and partial dependence plots, even though applying permutation methods to the ground-truth models does not. As an alternative, we recommend more direct approaches that have proven successful in other settings: explicitly removing features, conditional permutations, or model distillation methods.
Tasks
Published	2019-05-01
URL	http://arxiv.org/abs/1905.03151v1
PDF	http://arxiv.org/pdf/1905.03151v1.pdf
PWC	https://paperswithcode.com/paper/190503151
Repo	https://github.com/antonFJohansson/Please-Stop-Permuting-Features
Framework	none
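
For context on what the abstract above argues against, here is a minimal sketch of a permute-and-predict importance score. It shuffles one feature column in held-out data, leaves the model untouched, and reports the increase in mean squared error. The function name `permutation_importance`, the choice of MSE, and the fixed seed are assumptions of this sketch. Note that the shuffle pairs each value of the chosen feature with unrelated values of the other features, which is the dependence-breaking step the paper criticizes.

```python
import random


def permutation_importance(model, X, y, feature, seed=0):
    """Increase in MSE after shuffling one feature column of held-out data.

    model maps a row (list of floats) to a prediction; X is a list of rows.
    """
    def mse(rows):
        return sum((model(row) - target) ** 2 for row, target in zip(rows, y)) / len(y)

    column = [row[feature] for row in X]
    random.Random(seed).shuffle(column)
    # Rebuild the rows with the shuffled column; other features stay in place.
    permuted = [row[:feature] + [value] + row[feature + 1:]
                for row, value in zip(X, column)]
    return mse(permuted) - mse(X)


# Toy model that uses only the first feature.
toy_model = lambda row: 2.0 * row[0]
X_holdout = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0]]
y_holdout = [2.0, 4.0, 6.0, 8.0]
```

A feature the model ignores always scores zero here; the misleading cases the paper describes arise when features are strongly dependent and the shuffled rows fall where the model never saw data.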

Machine Learning Estimation of Heterogeneous Treatment Effects with Instruments


Title	Machine Learning Estimation of Heterogeneous Treatment Effects with Instruments
Authors	Vasilis Syrgkanis, Victor Lei, Miruna Oprescu, Maggie Hei, Keith Battocchi, Greg Lewis
Abstract	We consider the estimation of heterogeneous treatment effects with arbitrary machine learning methods in the presence of unobserved confounders with the aid of a valid instrument. Such settings arise in A/B tests with an intent-to-treat structure, where the experimenter randomizes over which user will receive a recommendation to take an action, and we are interested in the effect of the downstream action. We develop a statistical learning approach to the estimation of heterogeneous effects, reducing the problem to the minimization of an appropriate loss function that depends on a set of auxiliary models (each corresponding to a separate prediction task). The reduction enables the use of all recent algorithmic advances (e.g. neural nets, forests). We show that the estimated effect model is robust to estimation errors in the auxiliary models, by showing that the loss satisfies a Neyman orthogonality criterion. Our approach can be used to estimate projections of the true effect model on simpler hypothesis spaces. When these spaces are parametric, then the parameter estimates are asymptotically normal, which enables construction of confidence sets. We applied our method to estimate the effect of membership on downstream webpage engagement on TripAdvisor, using as an instrument an intent-to-treat A/B test among 4 million TripAdvisor users, where some users received an easier membership sign-up process. We also validate our method on synthetic data and on public datasets for the effects of schooling on income.
Tasks
Published	2019-05-24
URL	https://arxiv.org/abs/1905.10176v3
PDF	https://arxiv.org/pdf/1905.10176v3.pdf
PWC	https://paperswithcode.com/paper/machine-learning-estimation-of-heterogeneous-1
Repo	https://github.com/Microsoft/EconML
Framework	none

Learning Nonsymmetric Determinantal Point Processes


Title	Learning Nonsymmetric Determinantal Point Processes
Authors	Mike Gartrell, Victor-Emmanuel Brunel, Elvis Dohmatob, Syrine Krichene
Abstract	Determinantal point processes (DPPs) have attracted substantial attention as an elegant probabilistic model that captures the balance between quality and diversity within sets. DPPs are conventionally parameterized by a positive semi-definite kernel matrix, and this symmetric kernel encodes only repulsive interactions between items. These so-called symmetric DPPs have significant expressive power, and have been successfully applied to a variety of machine learning tasks, including recommendation systems, information retrieval, and automatic summarization, among many others. Efficient algorithms for learning symmetric DPPs and sampling from these models have been reasonably well studied. However, relatively little attention has been given to nonsymmetric DPPs, which relax the symmetric constraint on the kernel. Nonsymmetric DPPs allow for both repulsive and attractive item interactions, which can significantly improve modeling power, resulting in a model that may be a better fit for some applications. We present a method that enables a tractable algorithm, based on maximum likelihood estimation, for learning nonsymmetric DPPs from data composed of observed subsets. Our method imposes a particular decomposition of the nonsymmetric kernel that enables such tractable learning algorithms, which we analyze both theoretically and experimentally. We evaluate our model on synthetic and real-world datasets, demonstrating improved predictive performance compared to symmetric DPPs, which have previously shown strong performance on modeling tasks associated with these datasets.
Tasks	Information Retrieval, Point Processes, Recommendation Systems
Published	2019-05-30
URL	https://arxiv.org/abs/1905.12962v2
PDF	https://arxiv.org/pdf/1905.12962v2.pdf
PWC	https://paperswithcode.com/paper/learning-nonsymmetric-determinantal-point
Repo	https://github.com/cgartrel/nonsymmetric-DPP-learning
Framework	pytorch
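
The likelihood that underlies the maximum likelihood learning mentioned in the abstract above can be sketched with the standard L-ensemble formula $P(Y) = \det(L_Y) / \det(L + I)$, where $L_Y$ is the kernel restricted to the items in $Y$. The code below is a tiny illustration for small kernels only; the helper names `det` and `dpp_probability` and the two example kernels are assumptions of this sketch, and the paper's kernel decomposition is not reproduced.

```python
def det(M):
    """Determinant by Laplace expansion; fine for the tiny matrices used here."""
    if not M:
        return 1  # determinant of the empty matrix, used for the empty subset
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def dpp_probability(L, subset):
    """P(Y = subset) = det(L_Y) / det(L + I) for an L-ensemble DPP."""
    idx = sorted(subset)
    L_sub = [[L[i][j] for j in idx] for i in idx]
    n = len(L)
    L_plus_I = [[L[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    return det(L_sub) / det(L_plus_I)


symmetric = [[2, 1], [1, 2]]       # symmetric kernel
nonsymmetric = [[1, 1], [-1, 1]]   # off-diagonal entries differ in sign
```

In this toy pair, the symmetric kernel makes items 0 and 1 co-occur less often than independence would predict, while the nonsymmetric one makes them co-occur more often, matching the repulsive versus attractive distinction drawn in the abstract.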

Tightness-aware Evaluation Protocol for Scene Text Detection


Title	Tightness-aware Evaluation Protocol for Scene Text Detection
Authors	Yuliang Liu, Lianwen Jin, Zecheng Xie, Canjie Luo, Shuaitao Zhang, Lele Xie
Abstract	Evaluation protocols play a key role in the developmental progress of text detection methods. There are strict requirements to ensure that the evaluation methods are fair, objective and reasonable. However, existing metrics exhibit some obvious drawbacks: 1) they are not goal-oriented; 2) they cannot recognize the tightness of detection methods; 3) existing one-to-many and many-to-one solutions involve inherent loopholes and deficiencies. Therefore, this paper proposes a novel evaluation protocol called the Tightness-aware Intersect-over-Union (TIoU) metric, which can quantify completeness of ground truth, compactness of detection, and tightness of matching degree. Specifically, instead of merely using the IoU value, two common detection behaviors are properly considered, and the TIoU score is used directly to recognize the tightness. In addition, we further propose a straightforward method to address the annotation granularity issue, which can fairly evaluate word and text-line detections simultaneously. By adopting the detection results from published methods and general object detection frameworks, comprehensive experiments on ICDAR 2013 and ICDAR 2015 datasets are conducted to compare recent metrics and the proposed TIoU metric. The comparison demonstrated some promising new prospects, e.g., determining the methods and frameworks for which the detection is tighter and more beneficial to recognition. Our method is extremely simple; its novelty is that the proposed metric uses simple but reasonable improvements to yield many interesting and insightful prospects and to solve most of the issues of the previous metrics. The code is publicly available at https://github.com/Yuliang-Liu/TIoU-metric.
Tasks	Object Detection, Scene Text Detection
Published	2019-03-27
URL	http://arxiv.org/abs/1904.00813v1
PDF	http://arxiv.org/pdf/1904.00813v1.pdf
PWC	https://paperswithcode.com/paper/tightness-aware-evaluation-protocol-for-scene
Repo	https://github.com/Yuliang-Liu/TIoU-metric
Framework	none
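
The abstract above builds on the plain IoU value without defining it, so the sketch below shows only that baseline for axis-aligned boxes; the TIoU tightness terms are not specified in the abstract and are not reproduced here. The function name `iou`, the `(x_min, y_min, x_max, y_max)` box convention, and the example boxes are assumptions of this sketch.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Overlap extents, clamped at zero when the boxes do not intersect.
    inter_w = max(0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


ground_truth = (0, 0, 10, 2)      # a text line
cut_off = (0, 0, 8, 2)            # detection that misses part of the text
too_loose = (0, 0, 12.5, 2)       # detection that includes extra background
```

In this constructed example both detections get the same IoU against the ground truth even though one truncates the text and the other over-covers it, which illustrates why plain IoU alone cannot distinguish such detection behaviors.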