Paper Group ANR 752
A Robust Iris Authentication System on GPU-Based Edge Devices using Multi-Modalities Learning Model. Alignment Free and Distortion Robust Iris Recognition. Optimized data exploration applied to the simulation of a chemical process. Only Time Can Tell: Discovering Temporal Data for Temporal Modeling. Merging variables: one technique of search in pseudo-Boolean optimization …
A Robust Iris Authentication System on GPU-Based Edge Devices using Multi-Modalities Learning Model
Title | A Robust Iris Authentication System on GPU-Based Edge Devices using Multi-Modalities Learning Model |
Authors | Siming Zheng, Rahmita Wirza O. K. Rahmat, Fatimah Khalid, Nurul Amelina Nasharuddin |
Abstract | In recent years, the mobile Internet has accelerated the proliferation of smart mobile devices. Mobile payment, mobile security, and privacy protection have become the focus of widespread attention. Iris recognition has become a high-security authentication technology in these areas and is widely used in biometric authentication. The Convolutional Neural Network (CNN) is one of the mainstream deep learning approaches for image recognition, but its noise robustness is weak and it requires a substantial amount of memory to train on image classification tasks. Under these conditions, we put forward a fine-tuned neural network model based on Mask R-CNN and Inception V4, which integrates iris detection, extraction, and recognition into a single iris recognition system. The proposed framework is scalable and highly available; it can not only learn part-whole relationships of the iris image but also enhance the robustness of the whole framework. Importantly, the proposed model can be trained on samples from different spectra, such as Visible Wavelength (VW) and Near Infrared (NIR) iris biometric databases. An average recognition accuracy of 99.10% is achieved while executing on a Jetson Nano mobile edge computing device. |
Tasks | Image Classification, Iris Recognition, Mobile Security |
Published | 2019-12-02 |
URL | https://arxiv.org/abs/1912.00756v1 |
https://arxiv.org/pdf/1912.00756v1.pdf | |
PWC | https://paperswithcode.com/paper/a-robust-iris-authentication-system-on-gpu |
Repo | |
Framework | |
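The abstract describes a two-stage pipeline: Mask R-CNN localizes the iris, and an Inception V4 classifier identifies it. A minimal PyTorch sketch of that flow follows; the checkpoint paths, the one-iris detector head, and the class count are assumptions for illustration, not the authors' released code.

```python
import torch
import torch.nn.functional as F
import torchvision
import timm

detector = torchvision.models.detection.maskrcnn_resnet50_fpn(num_classes=2)  # background + iris
detector.load_state_dict(torch.load("maskrcnn_iris.pth"))          # hypothetical fine-tuned weights
classifier = timm.create_model("inception_v4", num_classes=100)    # one class per enrolled identity
classifier.load_state_dict(torch.load("inception_v4_iris.pth"))    # hypothetical fine-tuned weights
detector.eval()
classifier.eval()

@torch.no_grad()
def identify(image):                      # image: float tensor (3, H, W) in [0, 1]
    det = detector([image])[0]            # Mask R-CNN returns boxes, masks, scores
    if len(det["scores"]) == 0:
        return None                       # no iris detected
    best = det["scores"].argmax()
    x0, y0, x1, y1 = det["boxes"][best].int().tolist()
    crop = image[:, y0:y1, x0:x1]         # extract the detected iris region
    crop = F.interpolate(crop[None], size=(299, 299))  # Inception V4 input size
    return classifier(crop).argmax(1).item()           # predicted identity index
```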
Alignment Free and Distortion Robust Iris Recognition
Title | Alignment Free and Distortion Robust Iris Recognition |
Authors | Min Ren, Caiyong Wang, Yunlong Wang, Zhenan Sun, Tieniu Tan |
Abstract | Iris recognition is a reliable personal identification method, but there is still much room to improve its accuracy, especially in less-constrained situations. For example, free head movement may cause large rotation differences between iris images, and illumination variations may cause irregular distortion of iris texture. To match intra-class iris images robustly under head rotation, existing solutions usually need a precise alignment operation, either by exhaustive search within a determined range during iris image preprocessing or by brute-force search for the minimum Hamming distance during iris feature matching. In the wild, iris rotation is of much greater uncertainty than in constrained situations, and exhaustive search within a determined range is impracticable. This paper presents a unified feature-level solution to both alignment-free and distortion-robust iris recognition in the wild. A new deep learning based method named Alignment Free Iris Network (AFINet) is proposed, which uses a trainable VLAD (Vector of Locally Aggregated Descriptors) encoder called NetVLAD to decouple the correlations between local representations and their spatial positions. Deformable convolution is used to overcome iris texture distortion by dense adaptive sampling. The results of extensive experiments on three public iris image databases and simulated degradation databases show that AFINet significantly outperforms state-of-the-art iris recognition methods. |
Tasks | Iris Recognition |
Published | 2019-12-01 |
URL | https://arxiv.org/abs/1912.00382v1 |
https://arxiv.org/pdf/1912.00382v1.pdf | |
PWC | https://paperswithcode.com/paper/alignment-free-and-distortion-robust-iris |
Repo | |
Framework | |
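NetVLAD, the trainable VLAD encoder the abstract builds on, has a well-known generic form; a minimal PyTorch sketch is below. The cluster count and feature dimension are placeholders, and AFINet's backbone and deformable-convolution branch are omitted (torchvision.ops.DeformConv2d is the usual building block for the latter).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NetVLAD(nn.Module):
    """Aggregates local descriptors into a global vector that discards their
    spatial positions (generic sketch, not the paper's full AFINet)."""
    def __init__(self, num_clusters=16, dim=128):
        super().__init__()
        self.assign = nn.Conv2d(dim, num_clusters, kernel_size=1)  # soft assignment
        self.centroids = nn.Parameter(torch.randn(num_clusters, dim))

    def forward(self, x):                                 # x: (B, C, H, W) local features
        a = F.softmax(self.assign(x), dim=1).flatten(2)   # (B, K, N)
        f = x.flatten(2)                                  # (B, C, N)
        # VLAD_k = sum_i a_ik * (f_i - c_k), computed as two batched terms
        vlad = a @ f.transpose(1, 2)                      # sum_i a_ik f_i   -> (B, K, C)
        vlad = vlad - a.sum(-1, keepdim=True) * self.centroids  # minus (sum_i a_ik) c_k
        vlad = F.normalize(vlad, dim=2).flatten(1)        # intra-normalize, flatten
        return F.normalize(vlad, dim=1)                   # (B, K*C) global descriptor
```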
Optimized data exploration applied to the simulation of a chemical process
Title | Optimized data exploration applied to the simulation of a chemical process |
Authors | Raoul Heese, Michal Walczak, Tobias Seidel, Norbert Asprion, Michael Bortz |
Abstract | In complex simulation environments, certain parameter space regions may result in non-convergent or unphysical outcomes. All parameters can therefore be labeled with a binary class describing whether or not they lead to valid results. In general, it can be very difficult to determine feasible parameter regions, especially without previous knowledge. We propose a novel algorithm to explore such an unknown parameter space and improve its feasibility classification in an iterative way. Moreover, we include an additional optimization target in the algorithm to guide the exploration towards regions of interest and to improve the classification therein. In our method we make use of well-established concepts from the field of machine learning, such as kernel support vector machines and kernel ridge regression. A comparison with a recently published Kriging-based exploration approach shows the advantages of our algorithm in a binary feasibility classification scenario with a discrete feasibility constraint violation. In this context, we also propose an improvement of the Kriging-based exploration approach. We apply our novel method to a fully realistic, industrially relevant chemical process simulation to demonstrate its practical usability, and find a comparably good approximation of the data space topology from relatively few data points. |
Tasks | |
Published | 2019-02-18 |
URL | http://arxiv.org/abs/1902.06453v1 |
http://arxiv.org/pdf/1902.06453v1.pdf | |
PWC | https://paperswithcode.com/paper/optimized-data-exploration-applied-to-the |
Repo | |
Framework | |
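The iterative exploration loop the abstract outlines can be caricatured in a few lines of scikit-learn: a kernel SVM learns the feasibility boundary, kernel ridge regression models the optimization target, and each new simulation is placed where the classifier is least certain and the predicted target looks promising. The simulator, domain, and acquisition weighting below are stand-ins, not the authors' algorithm.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)

def simulate(x):                        # stand-in for the chemical process simulator
    feasible = np.linalg.norm(x) < 2.0  # hypothetical feasibility region
    return feasible, (x.sum() if feasible else np.nan)

X = rng.uniform(-2, 2, size=(20, 3))    # initial design
ok, y = map(np.array, zip(*(simulate(x) for x in X)))

for _ in range(30):                     # iterative exploration
    clf = SVC(kernel="rbf", probability=True).fit(X, ok)
    reg = KernelRidge(kernel="rbf").fit(X[ok], y[ok])   # target model on feasible points
    cand = rng.uniform(-2, 2, size=(500, 3))            # candidate pool
    p = clf.predict_proba(cand)[:, 1]
    score = -np.abs(p - 0.5) + 0.1 * reg.predict(cand)  # boundary uncertainty + interest
    x_new = cand[score.argmax()]
    ok_new, y_new = simulate(x_new)
    X = np.vstack([X, x_new])
    ok, y = np.append(ok, ok_new), np.append(y, y_new)

print(f"feasible fraction after exploration: {ok.mean():.2f}")
```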
Only Time Can Tell: Discovering Temporal Data for Temporal Modeling
Title | Only Time Can Tell: Discovering Temporal Data for Temporal Modeling |
Authors | Laura Sevilla-Lara, Shengxin Zha, Zhicheng Yan, Vedanuj Goswami, Matt Feiszli, Lorenzo Torresani |
Abstract | Understanding temporal information and how the visual world changes over time is a fundamental ability of intelligent systems. In video understanding, temporal information is at the core of many current challenges, including compression, efficient inference, motion estimation and summarization. However, in current video datasets it has been observed that action classes can often be recognized without any temporal information, from a single frame of video. As a result, both benchmarking and training on these datasets may give an unintentional advantage to models with strong image understanding capabilities, as opposed to those with strong temporal understanding. In this paper we address this problem head on by identifying action classes where temporal information is actually necessary for recognition; we call these “temporal classes”. Selecting temporal classes using a computational method would bias the process, so instead we propose a methodology based on a simple and effective human annotation experiment: we remove just the temporal information by shuffling frames in time and measure whether the action can still be recognized. Classes that cannot be recognized when frames are out of order are included in the temporal dataset. We observe that this set is statistically different from the static classes, and that performance on it correlates with a network’s ability to capture temporal information. Thus we use it as a benchmark for current popular networks, which reveals a series of interesting facts. We also explore the effect of training on the temporal dataset, and observe that this leads to better generalization to unseen classes, demonstrating the need for more temporal data. We hope that the proposed dataset of temporal categories will help guide future research in temporal modeling for better video understanding. |
Tasks | Motion Estimation, Video Understanding |
Published | 2019-07-19 |
URL | https://arxiv.org/abs/1907.08340v2 |
https://arxiv.org/pdf/1907.08340v2.pdf | |
PWC | https://paperswithcode.com/paper/only-time-can-tell-discovering-temporal-data |
Repo | |
Framework | |
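The shuffling probe at the heart of the methodology is easy to state in code: destroy only the temporal order, keep every frame, and check whether recognition survives. The sketch below applies the probe to a model rather than to human annotators (the paper deliberately uses humans to avoid model bias); `model` (a video classifier taking a (1, T, C, H, W) batch) and `clip` are assumptions.

```python
import torch

def shuffle_time(clip, generator=None):
    """clip: (T, C, H, W) video tensor -> same frames, random order."""
    perm = torch.randperm(clip.shape[0], generator=generator)
    return clip[perm]

@torch.no_grad()
def temporal_gap(model, clip):
    """Compare predictions on ordered vs. shuffled frames; a large gap
    suggests the class needs temporal information."""
    orig = model(clip.unsqueeze(0)).softmax(-1)
    shuf = model(shuffle_time(clip).unsqueeze(0)).softmax(-1)
    return (orig - shuf).abs().max().item()
```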
Merging variables: one technique of search in pseudo-Boolean optimization
Title | Merging variables: one technique of search in pseudo-Boolean optimization |
Authors | Alexander A. Semenov |
Abstract | In the present paper we describe a new heuristic technique which can be applied to the optimization of pseudo-Boolean functions, including black-box functions. The technique is based on a simple procedure: the optimization problem over the Boolean hypercube is transformed into the optimization of an auxiliary function in a specially constructed metric space. It is shown that there is a natural connection between the points of the original Boolean hypercube and points in the new metric space. For a Boolean hypercube of fixed dimension it is possible to construct a number of such metric spaces. The proposed technique can be considered a special case of Variable Neighborhood Search focused on pseudo-Boolean optimization. Preliminary computational results show high efficiency of the proposed technique on some reasonably hard problems. It is also shown that the described technique, in combination with the well-known (1+1)-Evolutionary Algorithm, decreases the upper bound on the runtime of this algorithm for arbitrary pseudo-Boolean functions. |
Tasks | |
Published | 2019-08-02 |
URL | https://arxiv.org/abs/1908.00751v1 |
https://arxiv.org/pdf/1908.00751v1.pdf | |
PWC | https://paperswithcode.com/paper/merging-variables-one-technique-of-search-in |
Repo | |
Framework | |
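A toy rendering of the merged-variable idea: blocks of Boolean variables are treated as single variables over {0, ..., 2^k - 1}, and one search move re-optimizes an entire block by enumeration. The objective below is a stand-in black-box function; the paper's actual neighborhood structure and metric-space construction are not reproduced.

```python
import random

def f(x):                                # stand-in black-box objective
    target = [i % 2 for i in range(len(x))]
    return sum(a == b for a, b in zip(x, target))

def merged_variable_search(n, k=3, iters=200, seed=0):
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    blocks = [list(range(i, min(i + k, n))) for i in range(0, n, k)]
    best = f(x)
    for _ in range(iters):
        block = rng.choice(blocks)       # pick one merged variable
        for v in range(2 ** len(block)): # enumerate all its 2^k settings
            trial = x[:]
            for j, idx in enumerate(block):
                trial[idx] = (v >> j) & 1
            if (val := f(trial)) > best:
                x, best = trial, val
    return x, best

print(merged_variable_search(24))        # typically reaches the optimum of 24
```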
Dual-reference Age Synthesis
Title | Dual-reference Age Synthesis |
Authors | Yuan Zhou, Bingzhang Hu, Jun He, Yu Guan, Ling Shao |
Abstract | Age synthesis methods typically take a single image as input and use a specific number to control the age of the generated image. In this paper, we propose a novel framework taking two images as inputs, named dual-reference age synthesis (DRAS), which approaches the task differently; instead of using “hard” age information, i.e. a fixed number, our model determines the target age in a “soft” way, by employing a second reference image. Specifically, the proposed framework consists of an identity agent, an age agent and a generative adversarial network. It takes two images as input - an identity reference and an age reference - and outputs a new image that shares corresponding features with each. Experimental results on two benchmark datasets (UTKFace and CACD) demonstrate the appealing performance and flexibility of the proposed framework. |
Tasks | |
Published | 2019-08-07 |
URL | https://arxiv.org/abs/1908.02671v2 |
https://arxiv.org/pdf/1908.02671v2.pdf | |
PWC | https://paperswithcode.com/paper/dual-reference-age-synthesis |
Repo | |
Framework | |
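Structurally, the framework boils down to two encoders feeding one decoder. The skeleton below shows only that data flow; layer sizes are arbitrary, and the identity/age agents' losses and the adversarial training loop from the paper are omitted.

```python
import torch
import torch.nn as nn

def conv_encoder(out_dim):
    return nn.Sequential(
        nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),
        nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, out_dim))

class DualReferenceGenerator(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.identity_enc = conv_encoder(dim)   # reads the identity reference
        self.age_enc = conv_encoder(dim)        # reads the age reference
        self.decode = nn.Sequential(            # toy decoder to a 64x64 face
            nn.Linear(2 * dim, 64 * 8 * 8), nn.ReLU(),
            nn.Unflatten(1, (64, 8, 8)),
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, 2, 1), nn.Tanh())

    def forward(self, identity_img, age_img):
        z = torch.cat([self.identity_enc(identity_img), self.age_enc(age_img)], dim=1)
        return self.decode(z)

out = DualReferenceGenerator()(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64))
```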
Experiments with a PCCoder extension
Title | Experiments with a PCCoder extension |
Authors | Mircea-Dan Hernest |
Abstract | Recent research in the synthesis of programs written in a Domain Specific Language (DSL) by means of neural networks from a limited set of input-output correspondences, such as DeepCoder and its PCCoder reimplementation/optimization, has proved the efficiency of this approach to automatic program generation in a DSL that, although limited in scope, is universal in the sense that its programs can be translated to essentially any programming language. We experiment with extending the DSL of DeepCoder/PCCoder with the symbols IFI and IFL, which denote functional expressions of the If ramification (test) instruction for the types Int and List. We observe a doubling of the size of the training set, of the number of parameters of the trained neural network, and of the time spent searching for a program synthesized from limited sets of input-output correspondences. The result is positive in the sense that the accuracy of synthesis on randomly generated test sets is preserved. |
Tasks | |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1912.00781v1 |
https://arxiv.org/pdf/1912.00781v1.pdf | |
PWC | https://paperswithcode.com/paper/experiments-with-a-pccoder-extension |
Repo | |
Framework | |
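A toy interpreter makes the extension concrete: IFI and IFL are ordinary DSL symbols whose value is one of two branch registers, selected by a test register. The mini-DSL below is an illustration, not PCCoder's actual implementation.

```python
DSL = {
    "HEAD": lambda xs: xs[0],
    "SUM":  lambda xs: sum(xs),
    "GT0":  lambda x: x > 0,
    # the two new ramification symbols: choose between branches by a test
    "IFI":  lambda cond, a, b: a if cond else b,   # Int-valued branches
    "IFL":  lambda cond, a, b: a if cond else b,   # List-valued branches
}

def run(program, inputs):
    """program: list of (opcode, argument register indices);
    registers start out holding the inputs."""
    regs = list(inputs)
    for op, args in program:
        regs.append(DSL[op](*(regs[i] for i in args)))
    return regs[-1]

# "if sum(xs) > 0 then head(xs) else sum(xs)":
prog = [("SUM", (0,)), ("GT0", (1,)), ("HEAD", (0,)), ("IFI", (2, 3, 1))]
print(run(prog, [[3, -1, 2]]))   # -> 3
```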
Personalized Ranking in eCommerce Search
Title | Personalized Ranking in eCommerce Search |
Authors | Grigor Aslanyan, Aritra Mandal, Prathyusha Senthil Kumar, Amit Jaiswal, Manojkumar Rangasamy Kannadasan |
Abstract | We address the problem of personalization in the context of eCommerce search. Specifically, we develop personalization ranking features that use in-session context to augment a generic ranker optimized for conversion and relevance. We use a combination of latent features learned from item co-clicks in historic sessions and content-based features that use item title and price. Personalization in search has been discussed extensively in the existing literature. The novelty of our work is combining and comparing content-based and content-agnostic features and showing that they complement each other to result in a significant improvement of the ranker. Moreover, our technique does not require an explicit re-ranking step, does not rely on learning user profiles from long term search behavior, and does not involve complex modeling of query-item-user features. Our approach captures item co-click propensity using lightweight item embeddings. We experimentally show that our technique significantly outperforms a generic ranker in terms of Mean Reciprocal Rank (MRR). We also provide anecdotal evidence for the semantic similarity captured by the item embeddings on the eBay search engine. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2019-04-30 |
URL | http://arxiv.org/abs/1905.00052v1 |
http://arxiv.org/pdf/1905.00052v1.pdf | |
PWC | https://paperswithcode.com/paper/personalized-ranking-in-ecommerce-search |
Repo | |
Framework | |
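The content-agnostic part of the feature set can be sketched as follows: treat each session's co-clicked item ids as a "sentence", learn lightweight item embeddings from those sequences, and use the cosine similarity between a candidate item and the session's clicked items as a ranking feature. The session data and hyperparameters are made up; the paper's full feature set is richer.

```python
import numpy as np
from gensim.models import Word2Vec

sessions = [                        # item ids co-clicked within a session
    ["item_1", "item_2", "item_3"],
    ["item_2", "item_3", "item_4"],
    ["item_1", "item_4"],
]
model = Word2Vec(sessions, vector_size=32, window=5, min_count=1, sg=1, seed=0)

def personalization_feature(candidate, clicked_in_session):
    """Mean cosine similarity between a candidate item and the items the
    user already clicked this session."""
    sims = [model.wv.similarity(candidate, c) for c in clicked_in_session
            if c in model.wv and candidate in model.wv]
    return float(np.mean(sims)) if sims else 0.0

print(personalization_feature("item_4", ["item_2", "item_3"]))
```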
Diagonal Acceleration for Covariance Matrix Adaptation Evolution Strategies
Title | Diagonal Acceleration for Covariance Matrix Adaptation Evolution Strategies |
Authors | Youhei Akimoto, Nikolaus Hansen |
Abstract | We introduce an acceleration for covariance matrix adaptation evolution strategies (CMA-ES) by means of adaptive diagonal decoding (dd-CMA). This diagonal acceleration endows the default CMA-ES with the advantages of separable CMA-ES without inheriting its drawbacks. Technically, we introduce a diagonal matrix D that expresses coordinate-wise variances of the sampling distribution in DCD form. The diagonal matrix can learn a rescaling of the problem in the coordinates within a linear number of function evaluations. Diagonal decoding can also exploit separability of the problem, but, crucially, does not compromise the performance on non-separable problems. The latter is accomplished by modulating the learning rate for the diagonal matrix based on the condition number of the underlying correlation matrix. dd-CMA-ES not only combines the advantages of default and separable CMA-ES, but may achieve overadditive speedup: it improves the performance, and even the scaling, of the better of default and separable CMA-ES on classes of non-separable test functions that reflect, arguably, a landscape feature commonly observed in practice. The paper makes two further secondary contributions: we introduce two different approaches to guarantee positive definiteness of the covariance matrix with active CMA, which is valuable in particular with large population size; we revise the default parameter setting in CMA-ES, proposing accelerated settings in particular for large dimension. All our contributions can be viewed as independent improvements of CMA-ES, yet they are also complementary and can be seamlessly combined. In numerical experiments with dd-CMA-ES up to dimension 5120, we observe remarkable improvements over the original covariance matrix adaptation on functions with coordinate-wise ill-conditioning. The improvement is also observed for large population sizes, up to about the dimension squared. |
Tasks | |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.05885v1 |
https://arxiv.org/pdf/1905.05885v1.pdf | |
PWC | https://paperswithcode.com/paper/diagonal-acceleration-for-covariance-matrix |
Repo | |
Framework | |
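The sampling step with diagonal decoding can be written down directly: candidates are drawn from N(m, sigma^2 · D C' D), with D the adaptive diagonal matrix and C' the matrix adapted by CMA. The sketch below shows only this step, with fixed illustrative values; the update rules and learning-rate modulation are the substance of the paper and are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
m = np.zeros(n)                       # distribution mean
sigma = 0.5                           # step size
D = np.diag([1.0, 2.0, 0.5, 1.0, 4.0])  # adaptive coordinate-wise scaling
C = np.eye(n)                         # C' adapted by CMA; identity here for brevity
A = np.linalg.cholesky(C)             # C' = A A^T

def sample(pop_size=10):
    z = rng.standard_normal((pop_size, n))
    return m + sigma * (z @ A.T) @ D.T   # x = m + sigma * D A z

X = sample(1000)
print(X.std(axis=0))                  # spread roughly follows sigma * diag(D)
```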
Micro-expression detection in long videos using optical flow and recurrent neural networks
Title | Micro-expression detection in long videos using optical flow and recurrent neural networks |
Authors | Michiel Verburg, Vlado Menkovski |
Abstract | Facial micro-expressions are subtle and involuntary expressions that can reveal concealed emotions. Micro-expressions are an invaluable source of information in application domains such as lie detection, mental health, sentiment analysis and more. One of the biggest challenges in this field of research is the small amount of available spontaneous micro-expression data. Moreover, spontaneous data collection is burdened by time-consuming and expensive annotation. Hence, methods are needed which can reduce the amount of data that annotators have to review. This paper presents a novel micro-expression spotting method using a recurrent neural network (RNN) on optical flow features. We extract Histogram of Oriented Optical Flow (HOOF) features to encode the temporal changes in selected face regions. Finally, the RNN spots short intervals which are likely to contain occurrences of relevant facial micro-movements. The proposed method is evaluated on the SAMM database. Any chance of subject bias is eliminated by training the RNN using Leave-One-Subject-Out cross-validation. Comparing the spotted intervals with the labeled data shows that the method produced 1569 false positives while obtaining a recall of 0.4654. The initial results show that the proposed method would reduce the video length by a factor of 3.5, while still retaining almost half of the relevant micro-movements. Lastly, as the model gets more data, it becomes better at detecting intervals, which makes the proposed method suitable for supporting the annotation process. |
Tasks | Optical Flow Estimation, Sentiment Analysis |
Published | 2019-03-26 |
URL | http://arxiv.org/abs/1903.10765v1 |
http://arxiv.org/pdf/1903.10765v1.pdf | |
PWC | https://paperswithcode.com/paper/micro-expression-detection-in-long-videos |
Repo | |
Framework | |
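The two stages named in the abstract, HOOF features followed by an RNN, look roughly like this in OpenCV and PyTorch. Face-region selection, training, and interval post-processing are omitted; the parameters and the stand-in video are illustrative assumptions.

```python
import cv2
import numpy as np
import torch
import torch.nn as nn

def hoof(prev_gray, next_gray, bins=8):
    """Histogram of Oriented Optical Flow: bin flow angles weighted by
    magnitude, then normalize."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    hist, _ = np.histogram(ang, bins=bins, range=(0, 2 * np.pi), weights=mag)
    return hist / (hist.sum() + 1e-8)

class Spotter(nn.Module):
    def __init__(self, bins=8, hidden=32):
        super().__init__()
        self.rnn = nn.LSTM(bins, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, feats):                  # feats: (B, T, bins)
        out, _ = self.rnn(feats)
        return torch.sigmoid(self.head(out))   # per-step micro-movement score

frames = [np.random.randint(0, 255, (64, 64), np.uint8) for _ in range(10)]  # stand-in video
feats = np.stack([hoof(a, b) for a, b in zip(frames, frames[1:])])
scores = Spotter()(torch.tensor(feats, dtype=torch.float32)[None])
```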
Controllable Face Aging
Title | Controllable Face Aging |
Authors | Haien Zeng, Hanjiang Lai, Jian Yin |
Abstract | Motivated by two observations, namely that 1) people age differently under different conditions for changeable facial attributes, e.g., skin color may become darker when working outside, and that 2) some facial attributes, e.g., race and gender, must remain unchanged during the aging process, we propose a controllable face aging method via an attribute disentanglement generative adversarial network. To offer fine control over the synthesized face images, first, an individual embedding of the face is learned directly from an image that contains the desired facial attribute. Second, since the image may contain other unwanted attributes, an attribute disentanglement network is used to separate the individual embedding and learn a common embedding that contains information about the facial attribute (e.g., race). With the common embedding, we can manipulate the generated face image toward the desired attribute in an explicit manner. Experimental results on two common benchmarks demonstrate that our proposed generator achieves aging effects comparable to state-of-the-art baselines while gaining more flexibility for attribute control. Code is available in the supplementary material. |
Tasks | |
Published | 2019-12-20 |
URL | https://arxiv.org/abs/1912.09694v1 |
https://arxiv.org/pdf/1912.09694v1.pdf | |
PWC | https://paperswithcode.com/paper/controllable-face-aging |
Repo | |
Framework | |
Event-based Star Tracking via Multiresolution Progressive Hough Transforms
Title | Event-based Star Tracking via Multiresolution Progressive Hough Transforms |
Authors | Samya Bagchi, Tat-Jun Chin |
Abstract | Star trackers are state-of-the-art attitude estimation devices which function by recognising and tracking star patterns. Most commercial star trackers use conventional optical sensors. A recent alternative is to use event sensors, which could enable more energy efficient and faster star trackers. However, this demands new algorithms that can efficiently cope with high-speed asynchronous data, and are feasible on resource-constrained computing platforms. To this end, we propose an event-based processing approach for star tracking. Our technique operates on the event stream from a star field, by using multiresolution Hough Transforms to time-progressively integrate event data and produce accurate relative rotations. Optimisation via rotation averaging is then used to fuse the relative rotations and jointly refine the absolute orientations. Our technique is designed to be feasible for asynchronous operation on standard hardware. Moreover, compared to state-of-the-art event-based motion estimation schemes, our technique is much more efficient and accurate. |
Tasks | Motion Estimation |
Published | 2019-06-19 |
URL | https://arxiv.org/abs/1906.07866v2 |
https://arxiv.org/pdf/1906.07866v2.pdf | |
PWC | https://paperswithcode.com/paper/event-based-star-tracking-via-multiresolution |
Repo | |
Framework | |
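A heavily simplified illustration of Hough voting for a relative rotation: every pairing of an earlier and a later star position votes for the in-plane angle that maps one onto the other, and the accumulator peak wins. The multiresolution, time-progressive, event-stream machinery and full 3-DoF rotation handling of the paper are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)
stars_t0 = rng.uniform(-1, 1, size=(20, 2))           # star positions, centered coords
theta_true = np.deg2rad(12.0)
R = np.array([[np.cos(theta_true), -np.sin(theta_true)],
              [np.sin(theta_true),  np.cos(theta_true)]])
stars_t1 = stars_t0 @ R.T + rng.normal(0, 0.005, (20, 2))

bins = np.zeros(360)                                  # 1-degree accumulator
for p in stars_t0:
    for q in stars_t1:
        if abs(np.linalg.norm(p) - np.linalg.norm(q)) > 0.02:
            continue                                  # rotation preserves radius
        dtheta = np.arctan2(q[1], q[0]) - np.arctan2(p[1], p[0])
        bins[int(np.degrees(dtheta) % 360)] += 1      # vote for this angle

print("estimated rotation:", bins.argmax(), "degrees")   # ~12
```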
A Multi-Task Approach for Disentangling Syntax and Semantics in Sentence Representations
Title | A Multi-Task Approach for Disentangling Syntax and Semantics in Sentence Representations |
Authors | Mingda Chen, Qingming Tang, Sam Wiseman, Kevin Gimpel |
Abstract | We propose a generative model for a sentence that uses two latent variables, with one intended to represent the syntax of the sentence and the other to represent its semantics. We show we can achieve better disentanglement between semantic and syntactic representations by training with multiple losses, including losses that exploit aligned paraphrastic sentences and word-order information. We also investigate the effect of moving from bag-of-words to recurrent neural network modules. We evaluate our models as well as several popular pretrained embeddings on standard semantic similarity tasks and novel syntactic similarity tasks. Empirically, we find that the model with the best performing syntactic and semantic representations also gives rise to the most disentangled representations. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2019-04-02 |
URL | http://arxiv.org/abs/1904.01173v1 |
http://arxiv.org/pdf/1904.01173v1.pdf | |
PWC | https://paperswithcode.com/paper/a-multi-task-approach-for-disentangling |
Repo | |
Framework | |
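A simplified multi-task sketch of the idea: one encoder emits a semantic vector (from order-invariant pooling) and a syntactic vector (from an order-aware RNN); a paraphrase loss ties the semantic vectors of aligned paraphrases together, with a toy separation term standing in for the paper's word-position loss. Architecture and losses are illustrative, not the paper's VAE formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DisentangledEncoder(nn.Module):
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.sem = nn.Linear(dim, dim)                 # semantic head: order-invariant pooling
        self.syn = nn.GRU(dim, dim, batch_first=True)  # syntactic head: order-aware

    def forward(self, tokens):                         # tokens: (B, T) word ids
        e = self.emb(tokens)
        z_sem = self.sem(e.mean(1))                    # bag-of-words -> semantics
        _, h = self.syn(e)
        return z_sem, h.squeeze(0)                     # (B, dim) each

def multitask_loss(enc, sent_a, sent_b):
    """sent_a, sent_b: token-id tensors for aligned paraphrase pairs."""
    sem_a, syn_a = enc(sent_a)
    sem_b, syn_b = enc(sent_b)
    para = 1 - F.cosine_similarity(sem_a, sem_b).mean()  # paraphrases share meaning
    # toy stand-in for the word-order loss: differing surface forms may
    # keep their syntactic vectors apart
    sep = F.cosine_similarity(syn_a, syn_b).mean()
    return para + 0.1 * sep

enc = DisentangledEncoder()
a = torch.randint(0, 1000, (4, 7))      # stand-in token ids for paraphrase pairs
b = torch.randint(0, 1000, (4, 9))
print(multitask_loss(enc, a, b))
```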
Cross-modal Subspace Learning via Kernel Correlation Maximization and Discriminative Structure Preserving
Title | Cross-modal Subspace Learning via Kernel Correlation Maximization and Discriminative Structure Preserving |
Authors | Jun Yu, Xiao-Jun Wu |
Abstract | Measuring the similarity between heterogeneous data remains an open problem. Many research works have been developed to learn a common subspace where the similarity between different modalities can be calculated directly. However, most existing works focus on learning a latent subspace without preserving the semantic structural information well, and thus cannot achieve the desired results. In this paper, we propose a novel framework, termed Cross-modal subspace learning via Kernel correlation maximization and Discriminative structure preserving (CKD), to address this problem in two respects. Firstly, we construct a shared semantic graph so that each modality preserves the semantic neighbor relationship. Secondly, we introduce the Hilbert-Schmidt Independence Criterion (HSIC) to ensure consistency between the feature similarity and the semantic similarity of samples. Our model not only captures the inter-modality correlation by maximizing the kernel correlation but also preserves the semantic structural information within each modality. Extensive experiments are performed to evaluate the proposed framework on three public datasets. The experimental results demonstrate that CKD is competitive with classic subspace learning methods. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2019-03-26 |
URL | https://arxiv.org/abs/1904.00776v3 |
https://arxiv.org/pdf/1904.00776v3.pdf | |
PWC | https://paperswithcode.com/paper/cross-modal-subspace-learning-with-kernel |
Repo | |
Framework | |
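The HSIC term that CKD uses to tie feature similarity to semantic similarity has a compact standard estimator, HSIC(K, L) = trace(KHLH) / (n - 1)^2 with centering matrix H. The sketch below uses linear kernels for brevity; CKD's kernels and full objective are not reproduced.

```python
import numpy as np

def hsic(X, Y):
    """Biased HSIC estimate between samples X (n, d1) and Y (n, d2)."""
    n = X.shape[0]
    K = X @ X.T                           # linear kernel on modality 1
    L = Y @ Y.T                           # linear kernel on modality 2 (or labels)
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))
print(hsic(X, X @ rng.standard_normal((10, 4))))   # dependent: large
print(hsic(X, rng.standard_normal((50, 4))))       # independent: near zero
```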
Semantic Comparison of State-of-the-Art Deep Learning APIs for Image Multi-Label Classification
Title | Semantic Comparison of State-of-the-Art Deep Learning APIs for Image Multi-Label Classification |
Authors | Adam Kubany, Shimon Ben Ishay, Ruben-sacha Ohayon, Armin Shmilovici, Lior Rokach, Tomer Doitshman |
Abstract | Image understanding relies heavily on accurate multi-label classification. In recent years, deep learning (DL) algorithms have become very successful tools for multi-label classification of image objects, and various implementations of DL algorithms have been released for public use in the form of application programming interfaces (APIs). In this study, we evaluate and compare 10 of the most prominent publicly available APIs in a best-of-breed challenge. The evaluation is performed on the Visual Genome labeling benchmark dataset using 12 well-recognized similarity metrics. In addition, for the first time in this kind of comparison, we use a semantic similarity metric to evaluate the semantic similarity performance of these APIs. In this evaluation, Microsoft’s Computer Vision, TensorFlow, Imagga, and IBM’s Visual Recognition performed better than the other APIs. Furthermore, the new semantic similarity metric provided deeper insights for comparison. |
Tasks | Multi-Label Classification, Semantic Similarity, Semantic Textual Similarity |
Published | 2019-03-21 |
URL | https://arxiv.org/abs/1903.09190v3 |
https://arxiv.org/pdf/1903.09190v3.pdf | |
PWC | https://paperswithcode.com/paper/semantic-comparison-of-state-of-the-art-deep |
Repo | |
Framework | |
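A semantic comparison between an API's predicted labels and the ground truth can be sketched as: score each predicted label by its best embedding cosine against the truth set, then average. The random embedding table below is a stand-in (real use would load pretrained vectors, e.g. GloVe); the paper's exact metric may differ in detail.

```python
import numpy as np

# stand-in word vectors; deterministic per word within a run
emb = {w: np.random.default_rng(abs(hash(w)) % 2**32).standard_normal(50)
       for w in ["dog", "puppy", "cat", "car", "animal"]}

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def semantic_score(predicted, truth):
    """Best-match cosine for each predicted label, averaged."""
    scores = [max(cos(emb[p], emb[t]) for t in truth)
              for p in predicted if p in emb]
    return float(np.mean(scores)) if scores else 0.0

print(semantic_score(["puppy", "car"], ["dog", "animal"]))
```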