January 30, 2020

Paper Group ANR 303

RPC: A Large-Scale Retail Product Checkout Dataset

Title RPC: A Large-Scale Retail Product Checkout Dataset
Authors Xiu-Shen Wei, Quan Cui, Lei Yang, Peng Wang, Lingqiao Liu
Abstract Over recent years, there has been growing interest in integrating computer vision technology into the retail industry. Automatic checkout (ACO) is one of the critical problems in this area; it aims to automatically generate the shopping list from images of the products to purchase. The main challenge of this problem comes from the large scale and the fine-grained nature of the product categories, as well as the difficulty of collecting training images that reflect realistic checkout scenarios, given the continuous update of the products. Despite its significant practical and research value, this problem is not extensively studied in the computer vision community, largely due to the lack of a high-quality dataset. To fill this gap, in this work we propose a new dataset to facilitate relevant research. Our dataset has the following characteristics: (1) It is by far the largest dataset in terms of both product image quantity and product categories. (2) It includes single-product images taken in a controlled environment and multi-product images taken by the checkout system. (3) It provides different levels of annotations for the checkout images. Compared with the existing datasets, ours is closer to the realistic setting and can give rise to a variety of research problems. Besides the dataset, we also benchmark the performance on this dataset with various approaches. The dataset and related resources can be found at https://rpc-dataset.github.io/.
Tasks
Published 2019-01-22
URL http://arxiv.org/abs/1901.07249v1
PDF http://arxiv.org/pdf/1901.07249v1.pdf
PWC https://paperswithcode.com/paper/rpc-a-large-scale-retail-product-checkout
Repo
Framework

A machine learning method correlating pulse pressure wave data with pregnancy

Title A machine learning method correlating pulse pressure wave data with pregnancy
Authors Jianhong Chen, Huang Huang, Wenrui Hao, Jinchao Xu
Abstract Pulse feeling, representing the tactile arterial palpation of the heartbeat, has been widely used in traditional Chinese medicine (TCM) to diagnose various diseases. The quantitative relationship between the pulse wave and health conditions, however, has not been investigated in modern medicine. In this paper, we explore the correlation between the pulse pressure wave (PPW), rather than the pulse key features in TCM, and pregnancy by using deep learning technology. This computational approach shows that the accuracy of pregnancy detection by the PPW is 84% with an AUC of 91%. Our study is a proof of concept of pulse diagnosis and will also motivate further sophisticated investigations on pulse waves.
Tasks
Published 2019-10-03
URL https://arxiv.org/abs/1910.01726v1
PDF https://arxiv.org/pdf/1910.01726v1.pdf
PWC https://paperswithcode.com/paper/a-machine-learning-method-correlating-pulse
Repo
Framework

Representation Learning on Unit Ball with 3D Roto-Translational Equivariance

Title Representation Learning on Unit Ball with 3D Roto-Translational Equivariance
Authors Sameera Ramasinghe, Salman Khan, Nick Barnes, Stephen Gould
Abstract Convolution is an integral operation that defines how the shape of one function is modified by another function. This powerful concept forms the basis of hierarchical feature learning in deep neural networks. Although performing convolution in Euclidean geometries is fairly straightforward, its extension to other topological spaces, such as a sphere ($\mathbb{S}^2$) or a unit ball ($\mathbb{B}^3$), entails unique challenges. In this work, we propose a novel ‘volumetric convolution’ operation that can effectively model and convolve arbitrary functions in $\mathbb{B}^3$. We develop a theoretical framework for volumetric convolution based on Zernike polynomials and efficiently implement it as a differentiable and easily pluggable layer in deep networks. By construction, our formulation leads to the derivation of a novel formula to measure the symmetry of a function in $\mathbb{B}^3$ around an arbitrary axis, which is useful in function analysis tasks. We demonstrate the efficacy of the proposed volumetric convolution operation on one viable use case, i.e., 3D object recognition.
Tasks 3D Object Recognition, Object Recognition, Representation Learning
Published 2019-11-30
URL https://arxiv.org/abs/1912.01454v1
PDF https://arxiv.org/pdf/1912.01454v1.pdf
PWC https://paperswithcode.com/paper/representation-learning-on-unit-ball-with-3d
Repo
Framework

Towards a General Model of Knowledge for Facial Analysis by Multi-Source Transfer Learning

Title Towards a General Model of Knowledge for Facial Analysis by Multi-Source Transfer Learning
Authors Valentin Vielzeuf, Alexis Lechervy, Stéphane Pateux, Frédéric Jurie
Abstract This paper proposes a step toward obtaining general models of knowledge for facial analysis, by addressing the question of multi-source transfer learning. More precisely, the proposed approach consists of two successive training steps: the first applies a combination operator to define a common embedding for the multiple sources, materialized by different existing trained models. The proposed operator relies on an auto-encoder, trained on a large dataset, that is efficient both in terms of compression ratio and transfer learning performance. In a second step we exploit a distillation approach to obtain a lightweight student model mimicking the collection of the fused existing models. This model outperforms its teacher on novel tasks, achieving results on par with state-of-the-art methods on 15 facial analysis tasks (and domains), at an affordable training cost. Moreover, this student has 75 times fewer parameters than the original teacher and can be applied to a variety of novel face-related tasks.
Tasks Transfer Learning
Published 2019-11-08
URL https://arxiv.org/abs/1911.03222v1
PDF https://arxiv.org/pdf/1911.03222v1.pdf
PWC https://paperswithcode.com/paper/towards-a-general-model-of-knowledge-for
Repo
Framework

Automatic Hierarchical Classification of Kelps using Deep Residual Features

Title Automatic Hierarchical Classification of Kelps using Deep Residual Features
Authors Ammar Mahmood, Ana Giraldo Ospina, Mohammed Bennamoun, Senjian An, Ferdous Sohel, Farid Boussaid, Renae Hovey, Robert B. Fisher, Gary Kendrick
Abstract Across the globe, remote image data is rapidly being collected for the assessment of benthic communities from shallow to extremely deep waters on continental slopes to the abyssal seas. Exploiting this data is presently limited by the time it takes for experts to identify organisms found in these images. With this limitation in mind, a large effort has been made globally to introduce automation and machine learning algorithms to accelerate both classification and assessment of marine benthic biota. One major issue lies with organisms that move with swell and currents, like kelps. This paper presents an automatic hierarchical classification method (local binary classification as opposed to the conventional flat classification) to classify kelps in images collected by autonomous underwater vehicles. The proposed kelp classification approach exploits learned feature representations extracted from deep residual networks. We show that these generic features outperform the traditional off-the-shelf CNN features and the conventional hand-crafted features. Experiments also demonstrate that the hierarchical classification method outperforms the traditional parallel multi-class classifications by a significant margin (90.0% vs 57.6% and 77.2% vs 59.0%) on Benthoz15 and Rottnest datasets respectively. Furthermore, we compare different hierarchical classification approaches and experimentally show that the sibling hierarchical training approach outperforms the inclusive hierarchical approach by a significant margin. We also report an application of our proposed method to study the change in kelp cover over time for annually repeated AUV surveys.
Tasks
Published 2019-06-26
URL https://arxiv.org/abs/1906.10881v2
PDF https://arxiv.org/pdf/1906.10881v2.pdf
PWC https://paperswithcode.com/paper/hierarchical-classification-of-kelps
Repo
Framework

FuseSeg: LiDAR Point Cloud Segmentation Fusing Multi-Modal Data

Title FuseSeg: LiDAR Point Cloud Segmentation Fusing Multi-Modal Data
Authors Georg Krispel, Michael Opitz, Georg Waltner, Horst Possegger, Horst Bischof
Abstract We introduce a simple yet effective fusion method of LiDAR and RGB data to segment LiDAR point clouds. Utilizing the dense native range representation of a LiDAR sensor and the setup calibration, we establish point correspondences between the two input modalities. Subsequently, we are able to warp and fuse the features from one domain into the other. Therefore, we can jointly exploit information from both data sources within one single network. To show the merit of our method, we extend SqueezeSeg, a point cloud segmentation network, with an RGB feature branch and fuse it into the original structure. Our extension called FuseSeg leads to an improvement of up to 18% IoU on the KITTI benchmark. In addition to the improved accuracy, we also achieve real-time performance at 50 fps, five times as fast as the KITTI LiDAR data recording speed.
Tasks Calibration
Published 2019-12-18
URL https://arxiv.org/abs/1912.08487v2
PDF https://arxiv.org/pdf/1912.08487v2.pdf
PWC https://paperswithcode.com/paper/fuseseg-lidar-point-cloud-segmentation-fusing
Repo
Framework
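
The FuseSeg abstract above relies on the setup calibration to establish point correspondences between LiDAR points and RGB pixels. As a rough illustration of that step only, the sketch below projects 3D points into an image with a generic pinhole model. The function name `project_points`, the 3 x 4 `projection` matrix, and the assumption that the points are already expressed in the camera frame are all hypothetical; the paper's actual procedure works on the sensor's native range representation and may differ.

```python
import numpy as np


def project_points(points_xyz, projection):
    """Project N x 3 points (assumed to be in the camera frame) to pixels.

    `projection` is a hypothetical 3 x 4 camera projection matrix taken from
    the sensor calibration. Returns (pixels, valid): an N x 2 array of pixel
    coordinates (NaN where invalid) and a boolean mask that marks the points
    lying in front of the camera.
    """
    points_xyz = np.asarray(points_xyz, dtype=float)
    projection = np.asarray(projection, dtype=float)

    # Homogeneous coordinates: append a column of ones -> N x 4.
    ones = np.ones((points_xyz.shape[0], 1))
    homogeneous = np.hstack([points_xyz, ones])

    # Apply the projection -> N x 3 rows of (u * depth, v * depth, depth).
    projected = homogeneous @ projection.T
    depth = projected[:, 2]

    # Only points with positive depth can be seen by the camera.
    valid = depth > 0
    pixels = np.full((points_xyz.shape[0], 2), np.nan)
    pixels[valid] = projected[valid, :2] / depth[valid, None]
    return pixels, valid
```

With pixel coordinates in hand, image features at those locations could be looked up and attached to the corresponding LiDAR points, which is the kind of cross-modal fusion the abstract describes.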

Deep Learning Based Robot for Automatically Picking up Garbage on the Grass

Title Deep Learning Based Robot for Automatically Picking up Garbage on the Grass
Authors Jinqiang Bai, Shiguo Lian, Zhaoxiang Liu, Kai Wang, Dijun Liu
Abstract This paper presents a novel garbage pickup robot which operates on grass. The robot is able to detect garbage accurately and autonomously by using a deep neural network for garbage recognition. In addition, with ground segmentation using a deep neural network, a novel navigation strategy is proposed to guide the robot as it moves around. With the garbage recognition and automatic navigation functions, the robot can clean garbage on the ground in places like parks or schools efficiently and autonomously. Experimental results show that the garbage recognition accuracy can reach as high as 95%, and even without path planning, the navigation strategy can reach almost the same cleaning efficiency as traditional methods. Thus, the proposed robot can help relieve the dustman’s physical labor in garbage cleaning tasks.
Tasks
Published 2019-04-30
URL http://arxiv.org/abs/1904.13034v1
PDF http://arxiv.org/pdf/1904.13034v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-based-robot-for-automatically
Repo
Framework

A simple and effective hybrid genetic search for the job sequencing and tool switching problem

Title A simple and effective hybrid genetic search for the job sequencing and tool switching problem
Authors Jordana Mecler, Anand Subramanian, Thibaut Vidal
Abstract The job sequencing and tool switching problem (SSP) has been extensively studied in the field of operations research, due to its practical relevance and methodological interest. Given a machine that can load a limited number of tools simultaneously and a number of jobs that each require a subset of the available tools, the SSP seeks a job sequence that minimizes the number of tool switches in the machine. To solve this problem, we propose a simple and efficient hybrid genetic search based on a generic solution representation, a tailored decoding operator, efficient local searches and diversity management techniques. To guide the search, we introduce a secondary objective designed to break ties. These techniques allow the search to explore structurally different solutions and escape local optima. As shown in our computational experiments on classical benchmark instances, our algorithm significantly outperforms all previous approaches while remaining simple to understand and easy to implement. We finally report results on a new set of larger instances to stimulate future research and comparative analyses.
Tasks
Published 2019-10-10
URL https://arxiv.org/abs/1910.10021v1
PDF https://arxiv.org/pdf/1910.10021v1.pdf
PWC https://paperswithcode.com/paper/a-simple-and-effective-hybrid-genetic-search
Repo
Framework
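
To make the SSP objective in the abstract above concrete, here is a minimal sketch that counts tool switches for one fixed job sequence. It assumes the classical "keep tool needed soonest" (KTNS) eviction rule, which the abstract does not name, so it should not be read as the paper's decoding operator. The function name, argument names, and the convention that loading a tool into an empty slot costs nothing are illustrative assumptions.

```python
def count_tool_switches(sequence, tools_needed, capacity):
    """Count tool switches for a fixed job sequence under the KTNS rule.

    sequence:     list of job identifiers in processing order.
    tools_needed: dict mapping each job to the set of tools it requires.
    capacity:     number of tools the machine can hold at once.

    Assumption: inserting a tool into an empty slot is free; a switch is
    counted only when a loaded tool must be removed to make room.
    """
    magazine = set()
    switches = 0

    def next_use(tool, position):
        # Index of the next job after `position` that needs `tool`;
        # len(sequence) if the tool is never needed again.
        for later in range(position + 1, len(sequence)):
            if tool in tools_needed[sequence[later]]:
                return later
        return len(sequence)

    for position, job in enumerate(sequence):
        required = set(tools_needed[job])
        if len(required) > capacity:
            raise ValueError(f"job {job!r} needs more tools than capacity")

        for tool in sorted(required - magazine):
            if len(magazine) >= capacity:
                # KTNS: evict the tool not needed now whose next use is
                # furthest in the future (sorted() makes ties deterministic).
                candidates = sorted(magazine - required)
                victim = max(candidates, key=lambda t: next_use(t, position))
                magazine.remove(victim)
                switches += 1
            magazine.add(tool)

    return switches
```

A search procedure such as the one described would evaluate many candidate sequences with a routine of this kind and keep those with fewer switches.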

An agglomerative hierarchical clustering method by optimizing the average silhouette width

Title An agglomerative hierarchical clustering method by optimizing the average silhouette width
Authors Fatima Batool
Abstract An agglomerative hierarchical clustering (AHC) framework and algorithm named HOSil, based on a new linkage metric optimized by the average silhouette width (ASW) index, is proposed. A conscientious investigation of various clustering methods and estimation indices is conducted across a diverse variety of data structures for three aims: a) clustering quality, b) clustering recovery, and c) estimation of the number of clusters. HOSil has shown better clustering quality for a range of artificial and real-world data structures as compared to k-means, PAM, single, complete, average, Ward, McQuitty, spectral, model-based, and several estimation methods. It can identify clusters of various shapes including spherical, elongated, and relatively small-sized clusters, as well as clusters coming from different distributions including uniform, t, gamma and others. HOSil has shown good recovery for correct determination of the number of clusters. For some data structures only HOSil was able to identify the correct number of clusters.
Tasks
Published 2019-09-26
URL https://arxiv.org/abs/1909.12356v1
PDF https://arxiv.org/pdf/1909.12356v1.pdf
PWC https://paperswithcode.com/paper/an-agglomerative-hierarchical-clustering-1
Repo
Framework
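
The average silhouette width (ASW) that the abstract above optimizes has a standard definition: for each point, compare its mean distance to its own cluster (a) with its mean distance to the nearest other cluster (b), and average (b - a) / max(a, b) over all points. The sketch below computes that index with NumPy and Euclidean distances. It illustrates the index only, not the HOSil linkage itself; the function name and the choice to assign singleton clusters a silhouette of 0 are assumptions.

```python
import numpy as np


def average_silhouette_width(X, labels):
    """Average silhouette width of a clustering, using Euclidean distance.

    X:      array-like of shape (n_samples, n_features).
    labels: array-like of n_samples cluster labels (at least two clusters).
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    if len(clusters) < 2:
        raise ValueError("ASW needs at least two clusters")

    # Full pairwise distance matrix, shape (n, n).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    silhouettes = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue  # singleton cluster: silhouette left at 0 by convention

        # a: mean distance to the other members of the point's own cluster.
        a = dist[i, own].sum() / (n_own - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])

        denom = max(a, b)
        silhouettes[i] = (b - a) / denom if denom > 0 else 0.0

    return silhouettes.mean()
```

Values close to 1 indicate compact, well-separated clusters, while negative values suggest points assigned to the wrong cluster.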

Fault Diagnosis Method Based on Scaling Law for On-line Refrigerant Leak Detection

Title Fault Diagnosis Method Based on Scaling Law for On-line Refrigerant Leak Detection
Authors Shun Takeuchi, Takahiro Saito
Abstract Early fault detection using instrumented sensor data is one of the promising application areas of machine learning in industrial facilities. However, it is difficult to improve the generalization performance of the trained fault-detection model because of the complex system configuration in the target diagnostic system and insufficient fault data. It is not trivial to apply the trained model to other systems. Here we propose a fault diagnosis method for refrigerant leak detection considering the physical modeling and control mechanism of an air-conditioning system. We derive a useful scaling law related to refrigerant leak. If the control mechanism is the same, the model can be applied to other air-conditioning systems irrespective of the system configuration. Small-scale off-line fault test data obtained in a laboratory are applied to estimate the scaling exponent. We evaluate the proposed scaling law by using real-world data. Based on a statistical hypothesis test of the interaction between two groups, we show that the scaling exponents of different air-conditioning systems are equivalent. In addition, we estimated the time series of the degree of leakage of real process data based on the scaling law and confirmed that the proposed method is promising for early leak detection through comparison with assessment by experts.
Tasks Fault Detection, Time Series
Published 2019-02-22
URL http://arxiv.org/abs/1902.09427v1
PDF http://arxiv.org/pdf/1902.09427v1.pdf
PWC https://paperswithcode.com/paper/fault-diagnosis-method-based-on-scaling-law
Repo
Framework
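
The abstract above mentions estimating a scaling exponent from small-scale laboratory data. It does not state which quantities enter the scaling law, so the following is only a generic sketch of how a power-law exponent is commonly estimated: a least-squares line fit in log-log space. The variables `x` and `y`, the form y ≈ c · x^alpha, and the function name are all hypothetical.

```python
import numpy as np


def fit_scaling_exponent(x, y):
    """Fit y ≈ c * x**alpha by least squares on log-transformed data.

    Returns (alpha, c). Both inputs must be strictly positive, because the
    fit is done on log(x) and log(y).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if np.any(x <= 0) or np.any(y <= 0):
        raise ValueError("x and y must be strictly positive")

    # log(y) = alpha * log(x) + log(c): slope is the exponent.
    alpha, log_c = np.polyfit(np.log(x), np.log(y), 1)
    return alpha, np.exp(log_c)
```

Comparing exponents fitted on different systems, as the abstract describes, would then require a statistical test on the fitted slopes rather than a direct comparison of point estimates.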

Weakly-Supervised Opinion Summarization by Leveraging External Information

Title Weakly-Supervised Opinion Summarization by Leveraging External Information
Authors Chao Zhao, Snigdha Chaturvedi
Abstract Opinion summarization from online product reviews is a challenging task, which involves identifying opinions related to various aspects of the product being reviewed. While previous works require additional human effort to identify relevant aspects, we instead apply domain knowledge from external sources to automatically achieve the same goal. This work proposes AspMem, a generative method that contains an array of memory cells to store aspect-related knowledge. This explicit memory can help obtain a better opinion representation and infer the aspect information more precisely. We evaluate this method on both aspect identification and opinion summarization tasks. Our experiments show that AspMem outperforms the state-of-the-art methods even though, unlike the baselines, it does not rely on human supervision which is carefully handcrafted for the given tasks.
Tasks
Published 2019-11-22
URL https://arxiv.org/abs/1911.09844v1
PDF https://arxiv.org/pdf/1911.09844v1.pdf
PWC https://paperswithcode.com/paper/weakly-supervised-opinion-summarization-by
Repo
Framework

Unbiased CVR Prediction from Biased Conversions in Display Advertising

Title Unbiased CVR Prediction from Biased Conversions in Display Advertising
Authors Yuta Saito, Gota Morishita, Shota Yasui
Abstract In display advertising, predicting the conversion rate is critical to deciding the optimal bid price for an advertisement. There are two troublesome difficulties in the conversion rate prediction task in the display advertising domain. First, some positive conversions are falsely observed as negative labels in training data, because they do not occur right after clicking the ads. Moreover, some positive feedback is much more frequently observed than the others, which creates a non-uniform missing mechanism of conversions. It is widely acknowledged that these problems cause a severe bias in the naive empirical average loss function for the conversion rate prediction. To overcome the challenges, we formulate the conversion rate prediction task in display advertising from the statistical estimation perspective and propose an interactive learning algorithm where a conversion rate predictor and a bias estimator are learned alternately. Lastly, we conducted a simulation experiment to demonstrate that the proposed method outperforms the existing baseline models.
Tasks
Published 2019-10-04
URL https://arxiv.org/abs/1910.01847v2
PDF https://arxiv.org/pdf/1910.01847v2.pdf
PWC https://paperswithcode.com/paper/dual-learning-algorithm-for-delayed-feedback
Repo
Framework

Learning Bayesian networks from demographic and health survey data

Title Learning Bayesian networks from demographic and health survey data
Authors Neville Kenneth Kitson, Anthony C. Constantinou
Abstract Child mortality from preventable diseases such as pneumonia and diarrhoea in low and middle-income countries remains a serious global challenge. We combine knowledge with available Demographic and Health Survey (DHS) data from India to construct Bayesian Networks (BNs) and investigate the factors associated with childhood diarrhoea. We make use of freeware tools to learn the graphical structure of the DHS data with score-based, constraint-based, and hybrid structure learning algorithms. We investigate the effect of missing values, sample size, and knowledge-based constraints on each of the structure learning algorithms and assess their accuracy with multiple scoring functions. Weaknesses in the survey methodology and data available, as well as the variability in the BNs generated, mean that it is not possible to learn a definitive causal BN from data. However, knowledge-based constraints are found to be useful in reducing the variation in the graphs produced by the different algorithms, and produce graphs which are more reflective of the likely influential relationships in the data. Furthermore, valuable insights are gained into the performance and characteristics of the structure learning algorithms. Two score-based algorithms in particular, TABU and FGES, demonstrate many desirable qualities: a) with sufficient data, they produce a graph which is similar to the reference graph, b) they are relatively insensitive to missing values, and c) they behave well with knowledge-based constraints. The results provide a basis for further investigation of the DHS data and for a deeper understanding of the behaviour of the structure learning algorithms when applied to real-world settings.
Tasks
Published 2019-12-02
URL https://arxiv.org/abs/1912.00715v1
PDF https://arxiv.org/pdf/1912.00715v1.pdf
PWC https://paperswithcode.com/paper/learning-bayesian-networks-from-demographic
Repo
Framework

2D-CTC for Scene Text Recognition

Title 2D-CTC for Scene Text Recognition
Authors Zhaoyi Wan, Fengming Xie, Yibo Liu, Xiang Bai, Cong Yao
Abstract Scene text recognition has been an important, active research topic in computer vision for years. Previous approaches mainly consider text as 1D signals and cast scene text recognition as a sequence prediction problem, by means of CTC or an attention-based encoder-decoder framework, both originally designed for speech recognition. However, unlike speech, which is a 1D signal, text instances are essentially distributed in 2D image spaces. To adhere to and make use of the 2D nature of text for higher recognition accuracy, we extend the vanilla CTC model to a second dimension, thus creating 2D-CTC. 2D-CTC can adaptively concentrate on the most relevant features while excluding the impact of clutter and noise in the background; it can also naturally handle text instances with various forms (horizontal, oriented and curved) while giving more interpretable intermediate predictions. The experiments on standard benchmarks for scene text recognition, such as IIIT-5K, ICDAR 2015, SVT-Perspective, and CUTE80, demonstrate that the proposed 2D-CTC model outperforms state-of-the-art methods on text of both regular and irregular shapes. Moreover, 2D-CTC exhibits its superiority over prior art in training and testing speed. Our implementation and models of 2D-CTC will be made publicly available soon.
Tasks Scene Text Recognition, Speech Recognition
Published 2019-07-23
URL https://arxiv.org/abs/1907.09705v1
PDF https://arxiv.org/pdf/1907.09705v1.pdf
PWC https://paperswithcode.com/paper/2d-ctc-for-scene-text-recognition
Repo
Framework
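
For readers unfamiliar with the vanilla (1D) CTC model that the abstract above extends, the sketch below shows its standard greedy decoding rule: take the best label per frame, merge consecutive repeats, then drop the blank symbol. This illustrates only the 1D baseline, not the 2D-CTC extension; the function name and the choice of 0 as the blank index are assumptions.

```python
def ctc_greedy_collapse(frame_labels, blank=0):
    """Collapse a per-frame label sequence using the standard CTC rule.

    frame_labels: iterable of the most likely label index for each frame.
    blank:        index of the CTC blank symbol (assumed to be 0 here).

    Consecutive repeats are merged first, then blanks are removed, so a
    blank between two identical labels keeps both of them.
    """
    decoded = []
    previous = None
    for label in frame_labels:
        # Keep a label only when it starts a new run and is not the blank.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded
```

The blank symbol is what lets CTC represent genuinely doubled characters: the frames `1, 0, 1` decode to two labels, whereas `1, 1` decodes to one.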

MeetUp! A Corpus of Joint Activity Dialogues in a Visual Environment

Title MeetUp! A Corpus of Joint Activity Dialogues in a Visual Environment
Authors Nikolai Ilinykh, Sina Zarrieß, David Schlangen
Abstract Building computer systems that can converse about their visual environment is one of the oldest concerns of research in Artificial Intelligence and Computational Linguistics (see, for example, Winograd’s 1972 SHRDLU system). Only recently, however, have methods from computer vision and natural language processing become powerful enough to make this vision seem more attainable. Pushed especially by developments in computer vision, many data sets and collection environments have recently been published that bring together verbal interaction and visual processing. Here, we argue that these datasets tend to oversimplify the dialogue part, and we propose a task—MeetUp!—that requires both visual and conversational grounding, and that makes stronger demands on representations of the discourse. MeetUp! is a two-player coordination game where players move in a visual environment, with the objective of finding each other. To do so, they must talk about what they see, and achieve mutual understanding. We describe a data collection and show that the resulting dialogues indeed exhibit the dialogue phenomena of interest, while also challenging the language & vision aspect.
Tasks
Published 2019-07-11
URL https://arxiv.org/abs/1907.05084v1
PDF https://arxiv.org/pdf/1907.05084v1.pdf
PWC https://paperswithcode.com/paper/meetup-a-corpus-of-joint-activity-dialogues
Repo
Framework