Paper Group ANR 303
RPC: A Large-Scale Retail Product Checkout Dataset
Title | RPC: A Large-Scale Retail Product Checkout Dataset |
Authors | Xiu-Shen Wei, Quan Cui, Lei Yang, Peng Wang, Lingqiao Liu |
Abstract | Over recent years, interest has grown in integrating computer vision technology into the retail industry. Automatic checkout (ACO) is one of the critical problems in this area; it aims to automatically generate the shopping list from images of the products to purchase. The main challenge comes from the large scale and fine-grained nature of the product categories, as well as the difficulty of collecting training images that reflect realistic checkout scenarios, given that products are continuously updated. Despite its significant practical and research value, this problem has not been extensively studied in the computer vision community, largely due to the lack of a high-quality dataset. To fill this gap, in this work we propose a new dataset to facilitate relevant research. Our dataset has the following characteristics: (1) It is by far the largest dataset in terms of both product image quantity and product categories. (2) It includes single-product images taken in a controlled environment and multi-product images taken by the checkout system. (3) It provides different levels of annotations for the checkout images. Compared with existing datasets, ours is closer to the realistic setting and can give rise to a variety of research problems. Besides the dataset, we also benchmark the performance of various approaches on it. The dataset and related resources can be found at https://rpc-dataset.github.io/. |
Tasks | |
Published | 2019-01-22 |
URL | http://arxiv.org/abs/1901.07249v1 |
http://arxiv.org/pdf/1901.07249v1.pdf | |
PWC | https://paperswithcode.com/paper/rpc-a-large-scale-retail-product-checkout |
Repo | |
Framework | |
A machine learning method correlating pulse pressure wave data with pregnancy
Title | A machine learning method correlating pulse pressure wave data with pregnancy |
Authors | Jianhong Chen, Huang Huang, Wenrui Hao, Jinchao Xu |
Abstract | Pulse feeling, the tactile arterial palpation of the heartbeat, has been widely used in traditional Chinese medicine (TCM) to diagnose various diseases. The quantitative relationship between the pulse wave and health conditions, however, has not been investigated in modern medicine. In this paper, we explore the correlation between the pulse pressure wave (PPW), rather than the pulse key features used in TCM, and pregnancy by using deep learning technology. This computational approach shows that the accuracy of pregnancy detection by the PPW is 84% with an AUC of 91%. Our study is a proof of concept of pulse diagnosis and will also motivate further sophisticated investigations of pulse waves. |
Tasks | |
Published | 2019-10-03 |
URL | https://arxiv.org/abs/1910.01726v1 |
https://arxiv.org/pdf/1910.01726v1.pdf | |
PWC | https://paperswithcode.com/paper/a-machine-learning-method-correlating-pulse |
Repo | |
Framework | |
Representation Learning on Unit Ball with 3D Roto-Translational Equivariance
Title | Representation Learning on Unit Ball with 3D Roto-Translational Equivariance |
Authors | Sameera Ramasinghe, Salman Khan, Nick Barnes, Stephen Gould |
Abstract | Convolution is an integral operation that defines how the shape of one function is modified by another function. This powerful concept forms the basis of hierarchical feature learning in deep neural networks. Although performing convolution in Euclidean geometries is fairly straightforward, its extension to other topological spaces, such as a sphere ($\mathbb{S}^2$) or a unit ball ($\mathbb{B}^3$), entails unique challenges. In this work, we propose a novel 'volumetric convolution' operation that can effectively model and convolve arbitrary functions in $\mathbb{B}^3$. We develop a theoretical framework for volumetric convolution based on Zernike polynomials and efficiently implement it as a differentiable and easily pluggable layer in deep networks. By construction, our formulation leads to the derivation of a novel formula to measure the symmetry of a function in $\mathbb{B}^3$ around an arbitrary axis, which is useful in function analysis tasks. We demonstrate the efficacy of the proposed volumetric convolution operation on one viable use case, namely 3D object recognition. |
Tasks | 3D Object Recognition, Object Recognition, Representation Learning |
Published | 2019-11-30 |
URL | https://arxiv.org/abs/1912.01454v1 |
https://arxiv.org/pdf/1912.01454v1.pdf | |
PWC | https://paperswithcode.com/paper/representation-learning-on-unit-ball-with-3d |
Repo | |
Framework | |
Towards a General Model of Knowledge for Facial Analysis by Multi-Source Transfer Learning
Title | Towards a General Model of Knowledge for Facial Analysis by Multi-Source Transfer Learning |
Authors | Valentin Vielzeuf, Alexis Lechervy, Stéphane Pateux, Frédéric Jurie |
Abstract | This paper proposes a step toward obtaining general models of knowledge for facial analysis, by addressing the question of multi-source transfer learning. More precisely, the proposed approach consists of two successive training steps. The first applies a combination operator to define a common embedding for the multiple sources, which are materialized by different existing trained models. The proposed operator relies on an auto-encoder, trained on a large dataset, that is efficient both in terms of compression ratio and transfer learning performance. In the second step, we exploit a distillation approach to obtain a lightweight student model mimicking the collection of fused existing models. This model outperforms its teacher on novel tasks, achieving results on par with state-of-the-art methods on 15 facial analysis tasks (and domains), at an affordable training cost. Moreover, this student has 75 times fewer parameters than the original teacher and can be applied to a variety of novel face-related tasks. |
Tasks | Transfer Learning |
Published | 2019-11-08 |
URL | https://arxiv.org/abs/1911.03222v1 |
https://arxiv.org/pdf/1911.03222v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-a-general-model-of-knowledge-for |
Repo | |
Framework | |
Automatic Hierarchical Classification of Kelps using Deep Residual Features
Title | Automatic Hierarchical Classification of Kelps using Deep Residual Features |
Authors | Ammar Mahmood, Ana Giraldo Ospina, Mohammed Bennamoun, Senjian An, Ferdous Sohel, Farid Boussaid, Renae Hovey, Robert B. Fisher, Gary Kendrick |
Abstract | Across the globe, remote image data is rapidly being collected for the assessment of benthic communities from shallow to extremely deep waters on continental slopes to the abyssal seas. Exploiting this data is presently limited by the time it takes for experts to identify organisms found in these images. With this limitation in mind, a large effort has been made globally to introduce automation and machine learning algorithms to accelerate both classification and assessment of marine benthic biota. One major issue lies with organisms that move with swell and currents, like kelps. This paper presents an automatic hierarchical classification method (local binary classification as opposed to the conventional flat classification) to classify kelps in images collected by autonomous underwater vehicles. The proposed kelp classification approach exploits learned feature representations extracted from deep residual networks. We show that these generic features outperform the traditional off-the-shelf CNN features and the conventional hand-crafted features. Experiments also demonstrate that the hierarchical classification method outperforms the traditional parallel multi-class classifications by a significant margin (90.0% vs 57.6% and 77.2% vs 59.0%) on Benthoz15 and Rottnest datasets respectively. Furthermore, we compare different hierarchical classification approaches and experimentally show that the sibling hierarchical training approach outperforms the inclusive hierarchical approach by a significant margin. We also report an application of our proposed method to study the change in kelp cover over time for annually repeated AUV surveys. |
Tasks | |
Published | 2019-06-26 |
URL | https://arxiv.org/abs/1906.10881v2 |
https://arxiv.org/pdf/1906.10881v2.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-classification-of-kelps |
Repo | |
Framework | |
FuseSeg: LiDAR Point Cloud Segmentation Fusing Multi-Modal Data
Title | FuseSeg: LiDAR Point Cloud Segmentation Fusing Multi-Modal Data |
Authors | Georg Krispel, Michael Opitz, Georg Waltner, Horst Possegger, Horst Bischof |
Abstract | We introduce a simple yet effective fusion method of LiDAR and RGB data to segment LiDAR point clouds. Utilizing the dense native range representation of a LiDAR sensor and the setup calibration, we establish point correspondences between the two input modalities. Subsequently, we are able to warp and fuse the features from one domain into the other. Therefore, we can jointly exploit information from both data sources within one single network. To show the merit of our method, we extend SqueezeSeg, a point cloud segmentation network, with an RGB feature branch and fuse it into the original structure. Our extension called FuseSeg leads to an improvement of up to 18% IoU on the KITTI benchmark. In addition to the improved accuracy, we also achieve real-time performance at 50 fps, five times as fast as the KITTI LiDAR data recording speed. |
Tasks | Calibration |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.08487v2 |
https://arxiv.org/pdf/1912.08487v2.pdf | |
PWC | https://paperswithcode.com/paper/fuseseg-lidar-point-cloud-segmentation-fusing |
Repo | |
Framework | |
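The FuseSeg abstract reports its gain in IoU (intersection over union). As a minimal sketch of how a per-class IoU over point labels is typically computed, the following uses a hypothetical helper `class_iou` and toy label lists; it is not taken from the paper or from the KITTI evaluation code.

```python
def class_iou(pred, target, cls):
    """Intersection over union for one class over per-point labels.

    pred, target: equal-length sequences of integer class labels
    (illustrative inputs, not the KITTI format).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    # Points that both prediction and ground truth assign to `cls`.
    intersection = sum(p == cls and t == cls for p, t in zip(pred, target))
    # Points that either prediction or ground truth assigns to `cls`.
    union = sum(p == cls or t == cls for p, t in zip(pred, target))
    # The class is absent from both sequences: IoU is undefined.
    return intersection / union if union else float("nan")


pred = [1, 1, 0, 2]
target = [1, 0, 0, 2]
print(class_iou(pred, target, 1))  # 0.5
```

A mean IoU would average this value over the classes present in the ground truth.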
Deep Learning Based Robot for Automatically Picking up Garbage on the Grass
Title | Deep Learning Based Robot for Automatically Picking up Garbage on the Grass |
Authors | Jinqiang Bai, Shiguo Lian, Zhaoxiang Liu, Kai Wang, Dijun Liu |
Abstract | This paper presents a novel garbage pickup robot that operates on grass. The robot detects garbage accurately and autonomously by using a deep neural network for garbage recognition. In addition, using ground segmentation based on a deep neural network, a novel navigation strategy is proposed to guide the robot as it moves around. With the garbage recognition and automatic navigation functions, the robot can clean garbage on the ground in places like parks or schools efficiently and autonomously. Experimental results show that the garbage recognition accuracy can reach as high as 95%, and that even without path planning, the navigation strategy can reach almost the same cleaning efficiency as traditional methods. Thus, the proposed robot can serve as a useful aid that relieves the physical labor of garbage collectors. |
Tasks | |
Published | 2019-04-30 |
URL | http://arxiv.org/abs/1904.13034v1 |
http://arxiv.org/pdf/1904.13034v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-based-robot-for-automatically |
Repo | |
Framework | |
A simple and effective hybrid genetic search for the job sequencing and tool switching problem
Title | A simple and effective hybrid genetic search for the job sequencing and tool switching problem |
Authors | Jordana Mecler, Anand Subramanian, Thibaut Vidal |
Abstract | The job sequencing and tool switching problem (SSP) has been extensively studied in the field of operations research, due to its practical relevance and methodological interest. Given a machine that can load a limited number of tools simultaneously and a number of jobs that each require a subset of the available tools, the SSP seeks a job sequence that minimizes the number of tool switches in the machine. To solve this problem, we propose a simple and efficient hybrid genetic search based on a generic solution representation, a tailored decoding operator, efficient local searches and diversity management techniques. To guide the search, we introduce a secondary objective designed to break ties. These techniques allow the search to explore structurally different solutions and escape local optima. As shown in our computational experiments on classical benchmark instances, our algorithm significantly outperforms all previous approaches while remaining simple to understand and easy to implement. We finally report results on a new set of larger instances to stimulate future research and comparative analyses. |
Tasks | |
Published | 2019-10-10 |
URL | https://arxiv.org/abs/1910.10021v1 |
https://arxiv.org/pdf/1910.10021v1.pdf | |
PWC | https://paperswithcode.com/paper/a-simple-and-effective-hybrid-genetic-search |
Repo | |
Framework | |
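To make the SSP objective concrete, here is a minimal sketch that counts tool switches for one fixed job sequence. It uses the classic Keep Tool Needed Soonest (KTNS) eviction policy, which is a standard choice for evaluating a given sequence and is not necessarily the tailored decoding operator the abstract refers to. The function name `count_tool_switches` and the counting convention (loading a tool into an empty slot is free; only evictions from a full magazine count) are assumptions made for illustration.

```python
def count_tool_switches(sequence, capacity):
    """Count tool switches for a fixed job sequence under the KTNS policy.

    sequence: list of sets, the tools each job requires, in processing order.
    capacity: number of tools the magazine can hold at once.
    """
    if any(len(job) > capacity for job in sequence):
        raise ValueError("a job needs more tools than the magazine can hold")

    def next_use(tool, after):
        # Index of the next job after position `after` that needs `tool`;
        # len(sequence) means the tool is never needed again.
        for k in range(after + 1, len(sequence)):
            if tool in sequence[k]:
                return k
        return len(sequence)

    magazine = set()
    switches = 0
    for pos, job in enumerate(sequence):
        for tool in job - magazine:
            if len(magazine) >= capacity:
                # Evict the loaded tool, not needed by the current job,
                # whose next use lies furthest in the future.
                candidates = [t for t in magazine if t not in job]
                victim = max(candidates, key=lambda t: next_use(t, pos))
                magazine.remove(victim)
                switches += 1
            magazine.add(tool)
    return switches


jobs = [{1, 2}, {3, 4}, {1, 2}]
print(count_tool_switches(jobs, capacity=3))  # 2
```

A search method for the SSP would call an evaluation like this on each candidate job sequence and keep the sequences with the fewest switches.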
An agglomerative hierarchical clustering method by optimizing the average silhouette width
Title | An agglomerative hierarchical clustering method by optimizing the average silhouette width |
Authors | Fatima Batool |
Abstract | An agglomerative hierarchical clustering (AHC) framework and algorithm named HOSil, based on a new linkage metric optimized by the average silhouette width (ASW) index, is proposed. A thorough investigation of various clustering methods and estimation indices is conducted across a diverse variety of data structures for three aims: a) clustering quality, b) clustering recovery, and c) estimation of the number of clusters. HOSil has shown better clustering quality for a range of artificial and real-world data structures as compared to k-means, PAM, single, complete, average, Ward, McQuitty, spectral, model-based, and several estimation methods. It can identify clusters of various shapes, including spherical, elongated, and relatively small-sized clusters, as well as clusters coming from different distributions, including uniform, t, gamma and others. HOSil has shown good recovery for correct determination of the number of clusters. For some data structures only HOSil was able to identify the correct number of clusters. |
Tasks | |
Published | 2019-09-26 |
URL | https://arxiv.org/abs/1909.12356v1 |
https://arxiv.org/pdf/1909.12356v1.pdf | |
PWC | https://paperswithcode.com/paper/an-agglomerative-hierarchical-clustering-1 |
Repo | |
Framework | |
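The average silhouette width (ASW) that HOSil optimizes can be illustrated with a short, self-contained sketch. This computes the standard ASW index for a given labelling using Euclidean distance; it does not implement the HOSil linkage itself, and the function name and toy data are assumptions for illustration.

```python
import math


def average_silhouette_width(points, labels):
    """Mean silhouette value over all points.

    For point i, a(i) is the mean distance to the other members of its own
    cluster and b(i) is the smallest mean distance to any other cluster;
    s(i) = (b - a) / max(a, b). Singleton clusters score 0 by convention.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("ASW needs at least two clusters")

    total = 0.0
    for i, point in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # s(i) = 0 for a singleton cluster
        a = sum(math.dist(point, points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(
            sum(math.dist(point, points[j]) for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != labels[i]
        )
        denom = max(a, b)
        if denom > 0:  # guard against coincident points
            total += (b - a) / denom
    return total / len(points)


points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = average_silhouette_width(points, [0, 0, 1, 1])
bad = average_silhouette_width(points, [0, 1, 0, 1])
print(good > bad)  # True
```

Values close to 1 indicate compact, well-separated clusters, while negative values suggest points assigned to the wrong cluster.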
Fault Diagnosis Method Based on Scaling Law for On-line Refrigerant Leak Detection
Title | Fault Diagnosis Method Based on Scaling Law for On-line Refrigerant Leak Detection |
Authors | Shun Takeuchi, Takahiro Saito |
Abstract | Early fault detection using instrumented sensor data is one of the promising application areas of machine learning in industrial facilities. However, it is difficult to improve the generalization performance of the trained fault-detection model because of the complex system configuration in the target diagnostic system and insufficient fault data. It is not trivial to apply the trained model to other systems. Here we propose a fault diagnosis method for refrigerant leak detection considering the physical modeling and control mechanism of an air-conditioning system. We derive a useful scaling law related to refrigerant leak. If the control mechanism is the same, the model can be applied to other air-conditioning systems irrespective of the system configuration. Small-scale off-line fault test data obtained in a laboratory are applied to estimate the scaling exponent. We evaluate the proposed scaling law by using real-world data. Based on a statistical hypothesis test of the interaction between two groups, we show that the scaling exponents of different air-conditioning systems are equivalent. In addition, we estimated the time series of the degree of leakage of real process data based on the scaling law and confirmed that the proposed method is promising for early leak detection through comparison with assessment by experts. |
Tasks | Fault Detection, Time Series |
Published | 2019-02-22 |
URL | http://arxiv.org/abs/1902.09427v1 |
http://arxiv.org/pdf/1902.09427v1.pdf | |
PWC | https://paperswithcode.com/paper/fault-diagnosis-method-based-on-scaling-law |
Repo | |
Framework | |
Weakly-Supervised Opinion Summarization by Leveraging External Information
Title | Weakly-Supervised Opinion Summarization by Leveraging External Information |
Authors | Chao Zhao, Snigdha Chaturvedi |
Abstract | Opinion summarization from online product reviews is a challenging task, which involves identifying opinions related to various aspects of the product being reviewed. While previous works require additional human effort to identify relevant aspects, we instead apply domain knowledge from external sources to automatically achieve the same goal. This work proposes AspMem, a generative method that contains an array of memory cells to store aspect-related knowledge. This explicit memory can help obtain a better opinion representation and infer the aspect information more precisely. We evaluate this method on both aspect identification and opinion summarization tasks. Our experiments show that AspMem outperforms the state-of-the-art methods even though, unlike the baselines, it does not rely on human supervision which is carefully handcrafted for the given tasks. |
Tasks | |
Published | 2019-11-22 |
URL | https://arxiv.org/abs/1911.09844v1 |
https://arxiv.org/pdf/1911.09844v1.pdf | |
PWC | https://paperswithcode.com/paper/weakly-supervised-opinion-summarization-by |
Repo | |
Framework | |
Unbiased CVR Prediction from Biased Conversions in Display Advertising
Title | Unbiased CVR Prediction from Biased Conversions in Display Advertising |
Authors | Yuta Saito, Gota Morishita, Shota Yasui |
Abstract | In display advertising, predicting the conversion rate is critical to deciding the optimal bid price for an advertisement. There are two main difficulties in the conversion rate prediction task in the display advertising domain. First, some positive conversions are falsely observed as negative labels in the training data, because they do not occur right after the ads are clicked. Second, some positive feedback is observed much more frequently than other feedback, which creates a non-uniform missing mechanism for conversions. It is widely acknowledged that these problems cause a severe bias in the naive empirical average loss function for conversion rate prediction. To overcome these challenges, we formulate the conversion rate prediction task in display advertising from the statistical estimation perspective and propose an interactive learning algorithm in which a conversion rate predictor and a bias estimator are learned alternately. Lastly, we conducted a simulation experiment to demonstrate that the proposed method outperforms the existing baseline models. |
Tasks | |
Published | 2019-10-04 |
URL | https://arxiv.org/abs/1910.01847v2 |
https://arxiv.org/pdf/1910.01847v2.pdf | |
PWC | https://paperswithcode.com/paper/dual-learning-algorithm-for-delayed-feedback |
Repo | |
Framework | |
Learning Bayesian networks from demographic and health survey data
Title | Learning Bayesian networks from demographic and health survey data |
Authors | Neville Kenneth Kitson, Anthony C. Constantinou |
Abstract | Child mortality from preventable diseases such as pneumonia and diarrhoea in low and middle-income countries remains a serious global challenge. We combine knowledge with available Demographic and Health Survey (DHS) data from India to construct Bayesian Networks (BNs) and investigate the factors associated with childhood diarrhoea. We make use of freeware tools to learn the graphical structure of the DHS data with score-based, constraint-based, and hybrid structure learning algorithms. We investigate the effect of missing values, sample size, and knowledge-based constraints on each of the structure learning algorithms and assess their accuracy with multiple scoring functions. Weaknesses in the survey methodology and the data available, as well as the variability in the BNs generated, mean that it is not possible to learn a definitive causal BN from data. However, knowledge-based constraints are found to be useful in reducing the variation in the graphs produced by the different algorithms, and they produce graphs which are more reflective of the likely influential relationships in the data. Furthermore, valuable insights are gained into the performance and characteristics of the structure learning algorithms. Two score-based algorithms in particular, TABU and FGES, demonstrate many desirable qualities: a) with sufficient data, they produce a graph which is similar to the reference graph, b) they are relatively insensitive to missing values, and c) they behave well with knowledge-based constraints. The results provide a basis for further investigation of the DHS data and for a deeper understanding of the behaviour of the structure learning algorithms when applied to real-world settings. |
Tasks | |
Published | 2019-12-02 |
URL | https://arxiv.org/abs/1912.00715v1 |
https://arxiv.org/pdf/1912.00715v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-bayesian-networks-from-demographic |
Repo | |
Framework | |
2D-CTC for Scene Text Recognition
Title | 2D-CTC for Scene Text Recognition |
Authors | Zhaoyi Wan, Fengming Xie, Yibo Liu, Xiang Bai, Cong Yao |
Abstract | Scene text recognition has been an important, active research topic in computer vision for years. Previous approaches mainly consider text as 1D signals and cast scene text recognition as a sequence prediction problem, by means of CTC or attention-based encoder-decoder frameworks, which were originally designed for speech recognition. However, unlike speech, which is a 1D signal, text instances are essentially distributed in 2D image space. To adhere to and make use of the 2D nature of text for higher recognition accuracy, we extend the vanilla CTC model to a second dimension, thus creating 2D-CTC. 2D-CTC can adaptively concentrate on the most relevant features while excluding the impact of clutter and noise in the background; it can also naturally handle text instances of various forms (horizontal, oriented and curved) while giving more interpretable intermediate predictions. Experiments on standard benchmarks for scene text recognition, such as IIIT-5K, ICDAR 2015, SVT-Perspective, and CUTE80, demonstrate that the proposed 2D-CTC model outperforms state-of-the-art methods on text of both regular and irregular shapes. Moreover, 2D-CTC is superior to prior art in training and testing speed. Our implementation and models of 2D-CTC will be made publicly available soon. |
Tasks | Scene Text Recognition, Speech Recognition |
Published | 2019-07-23 |
URL | https://arxiv.org/abs/1907.09705v1 |
https://arxiv.org/pdf/1907.09705v1.pdf | |
PWC | https://paperswithcode.com/paper/2d-ctc-for-scene-text-recognition |
Repo | |
Framework | |
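For readers unfamiliar with the vanilla (1D) CTC that the 2D-CTC abstract extends, the sketch below shows the standard CTC collapse rule applied to a per-frame label path: merge consecutive repeats, then drop blanks. It illustrates only the 1D baseline, not the 2D-CTC formulation, and the function name and the choice of 0 as the blank index are assumptions.

```python
from itertools import groupby


def ctc_collapse(path, blank=0):
    """Map a per-frame label path to an output sequence (vanilla 1D CTC).

    Consecutive repeated labels are merged first, then blank labels are
    removed, so a blank between two equal labels keeps both of them.
    """
    return [label for label, _ in groupby(path) if label != blank]


# Frames: blank, 1, 1, blank, 1, 2, 2, blank
print(ctc_collapse([0, 1, 1, 0, 1, 2, 2, 0]))  # [1, 1, 2]
```

Greedy (best-path) decoding takes the highest-scoring label at each frame and applies this collapse to the resulting path.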
MeetUp! A Corpus of Joint Activity Dialogues in a Visual Environment
Title | MeetUp! A Corpus of Joint Activity Dialogues in a Visual Environment |
Authors | Nikolai Ilinykh, Sina Zarrieß, David Schlangen |
Abstract | Building computer systems that can converse about their visual environment is one of the oldest concerns of research in Artificial Intelligence and Computational Linguistics (see, for example, Winograd’s 1972 SHRDLU system). Only recently, however, have methods from computer vision and natural language processing become powerful enough to make this vision seem more attainable. Pushed especially by developments in computer vision, many data sets and collection environments have recently been published that bring together verbal interaction and visual processing. Here, we argue that these datasets tend to oversimplify the dialogue part, and we propose a task—MeetUp!—that requires both visual and conversational grounding, and that makes stronger demands on representations of the discourse. MeetUp! is a two-player coordination game where players move in a visual environment, with the objective of finding each other. To do so, they must talk about what they see, and achieve mutual understanding. We describe a data collection and show that the resulting dialogues indeed exhibit the dialogue phenomena of interest, while also challenging the language & vision aspect. |
Tasks | |
Published | 2019-07-11 |
URL | https://arxiv.org/abs/1907.05084v1 |
https://arxiv.org/pdf/1907.05084v1.pdf | |
PWC | https://paperswithcode.com/paper/meetup-a-corpus-of-joint-activity-dialogues |
Repo | |
Framework | |