Paper Group ANR 1608
Image Alignment in Unseen Domains via Domain Deep Generalization
Title | Image Alignment in Unseen Domains via Domain Deep Generalization |
Authors | Thanh-Dat Truong, Khoa Luu, Chi Nhan Duong, Ngan Le, Minh-Triet Tran |
Abstract | Image alignment across domains has recently become a practical and popular research topic. In this problem, a deep learning-based image alignment method is usually trained on an available large-scale database. At test time, the trained model is deployed on unseen images collected under different camera conditions and modalities. The delivered deep network models cannot be updated, adapted, or fine-tuned in these scenarios. Thus, recent deep learning techniques such as domain adaptation, feature transfer, and fine-tuning cannot be applied. This paper presents a novel deep learning-based approach to the problem of generalizing across unseen modalities. The proposed network is then applied to image alignment as an illustration. The approach is designed as an end-to-end deep convolutional neural network that optimizes the deep models to improve performance. The proposed network is evaluated on digit recognition, with the model trained on MNIST and tested on the unseen domain MNIST-M. Finally, the method is benchmarked on image alignment, training on RGB images and testing on depth and X-ray images. |
Tasks | Domain Adaptation |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.12028v2 |
https://arxiv.org/pdf/1905.12028v2.pdf | |
PWC | https://paperswithcode.com/paper/image-alignment-in-unseen-domains-via-domain |
Repo | |
Framework | |
An Introduction to Symbolic Artificial Intelligence Applied to Multimedia
Title | An Introduction to Symbolic Artificial Intelligence Applied to Multimedia |
Authors | Guilherme Lima, Rodrigo Costa, Marcio Ferreira Moreno |
Abstract | In this chapter, we give an introduction to symbolic artificial intelligence (AI) and discuss its relation and application to multimedia. We begin by defining what symbolic AI is, what distinguishes it from non-symbolic approaches such as machine learning, and how it can be used in the construction of advanced multimedia applications. We then introduce description logic (DL) and use it to discuss symbolic representation and reasoning. DL is the logical underpinning of OWL, the most successful family of ontology languages. After discussing DL, we present OWL and related Semantic Web technologies, such as RDF and SPARQL. We conclude the chapter by discussing a hybrid model for multimedia representation, called Hyperknowledge. Throughout the text, we make references to technologies and extensions specifically designed to solve the kinds of problems that arise in multimedia representation. |
Tasks | |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09606v2 |
https://arxiv.org/pdf/1911.09606v2.pdf | |
PWC | https://paperswithcode.com/paper/an-introduction-to-artificial-intelligence-1 |
Repo | |
Framework | |
Robust Federated Learning with Noisy Communication
Title | Robust Federated Learning with Noisy Communication |
Authors | Fan Ang, Li Chen, Nan Zhao, Yunfei Chen, Weidong Wang, F. Richard Yu |
Abstract | Federated learning is a communication-efficient training process that alternates between local training at the edge devices and averaging the updated local models at the central server. However, perfect acquisition of the local models is impractical in wireless communication because of noise, which also seriously degrades federated learning. To tackle this challenge, we propose a robust design for federated learning that alleviates the effects of noise. Considering noise in the two aforementioned steps, we first formulate the training problem as a parallel optimization for each node under the expectation-based model and the worst-case model. Due to the non-convexity of the problem, a regularization for the loss function approximation method is proposed to make it tractable. Regarding the worst-case model, we develop a feasible training scheme which utilizes the sampling-based successive convex approximation algorithm to tackle the unavailable maxima or minima noise condition and the non-convex issue of the objective function. Furthermore, the convergence rates of both new designs are analyzed from a theoretical point of view. Finally, the improvement of prediction accuracy and the reduction of loss function are demonstrated via simulations for the proposed designs. |
Tasks | |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00251v1 |
https://arxiv.org/pdf/1911.00251v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-federated-learning-with-noisy |
Repo | |
Framework | |
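As a rough illustration of the setting described in the abstract above, the following sketch runs federated averaging on a one-parameter least-squares model while adding Gaussian noise to each uploaded local model. It is not the authors' robust design: the model, the data, and the names `local_step`, `noisy_fedavg`, and `noise_std` are assumptions made for this example, and only uplink noise is modelled.

```python
import random

def local_step(w, data, lr=0.1):
    """One gradient step on the local least-squares loss mean((w*x - y)^2)."""
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad

def noisy_fedavg(client_data, rounds=50, noise_std=0.05, seed=0):
    """Alternate local training and server averaging over a noisy uplink."""
    rng = random.Random(seed)
    w_global = 0.0
    for _ in range(rounds):
        received = []
        for data in client_data:
            w_local = local_step(w_global, data)
            # Assumption: the channel adds zero-mean Gaussian noise to each upload.
            received.append(w_local + rng.gauss(0.0, noise_std))
        # The server averages whatever it received, noise included.
        w_global = sum(received) / len(received)
    return w_global

# Three hypothetical clients whose data follow y = 2x.
clients = [[(x, 2.0 * x) for x in (0.5, 1.0, 1.5)] for _ in range(3)]
w_clean = noisy_fedavg(clients, noise_std=0.0)
w_noisy = noisy_fedavg(clients, noise_std=0.05)
print(w_clean, w_noisy)
```

Without noise the global parameter converges to the true slope; with noise it fluctuates around it, which is the degradation a robust design would aim to reduce.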
Transfer Learning in Spatial-Temporal Forecasting of the Solar Magnetic Field
Title | Transfer Learning in Spatial-Temporal Forecasting of the Solar Magnetic Field |
Authors | Eurico Covas |
Abstract | Machine learning techniques have been widely used in attempts to forecast several solar datasets. Most of these approaches employ supervised machine learning algorithms which are, in general, very data hungry. This hampers attempts to forecast some of these data series, particularly the ones that depend on (relatively) recent space observations. Here we focus on an attempt to forecast the solar surface longitudinally averaged radial magnetic field distribution using a form of spatial-temporal neural network. Given that the recording of these spatial-temporal datasets only started in 1975 and the datasets are therefore quite short, the forecasts are predictably quite modest. However, given that there is a potential physical relationship between sunspots and the magnetic field, we employ another machine learning technique called transfer learning, which has recently received considerable attention in the literature. Here, this approach consists of first training the source spatial-temporal neural network on the much longer time/latitude sunspot area dataset, which starts in 1874, then transferring the trained set of layers to a target network, and continuing to train the latter on the magnetic field dataset. The employment of transfer learning in the field of computer vision is known to obtain a generalized set of feature filters that can be reused for other datasets and tasks. Here we obtain a similar result: after we train the network on the spatial-temporal sunspot area data, the first few layers of the neural network are able to identify the two main features of the solar cycle, i.e. the amplitude variation and the migration to the equator, and can therefore be used to train on the magnetic field dataset and forecast better than a prediction based only on the historical magnetic field data. |
Tasks | Transfer Learning |
Published | 2019-11-08 |
URL | https://arxiv.org/abs/1911.03193v1 |
https://arxiv.org/pdf/1911.03193v1.pdf | |
PWC | https://paperswithcode.com/paper/transfer-learning-in-spatial-temporal |
Repo | |
Framework | |
Corn leaf detection using Region based convolutional neural network
Title | Corn leaf detection using Region based convolutional neural network |
Authors | Mohammad Ibrahim Sarker, Heechan Yang, Hyongsuk Kim |
Abstract | Machine learning has become an increasingly active area of research as more efficient methods are needed to handle more complex image detection challenges. Solving agricultural problems is increasingly important because food is fundamental to life. However, the detection accuracy of recent corn field systems is still far from practical demands because of the many different weeds present. This paper presents a model for detecting corn leaves in digital images collected from farm fields. Based on the results of experiments conducted with several state-of-the-art CNN models, a region-based method is proposed as a faster and more accurate method of corn leaf detection. Motivated by the unique attributes of ResNet, we combine it with a region-based network (such as Faster R-CNN), which is able to automatically detect corn leaves under heavy weed occlusion. The method is evaluated on a dataset collected from a farm, which we annotated ourselves. Our proposed method achieves significantly better performance in corn detection. |
Tasks | |
Published | 2019-06-05 |
URL | https://arxiv.org/abs/1906.01900v1 |
https://arxiv.org/pdf/1906.01900v1.pdf | |
PWC | https://paperswithcode.com/paper/corn-leaf-detection-using-region-based |
Repo | |
Framework | |
Towards the Automation of Deep Image Prior
Title | Towards the Automation of Deep Image Prior |
Authors | Qianwei Zhou, Chen Zhou, Haigen Hu, Yuhang Chen, Shengyong Chen, Xiaoxin Li |
Abstract | The single-image inverse problem is a notoriously challenging ill-posed problem that aims to restore the original image from one of its corrupted versions. Recently, this field has been immensely influenced by the emergence of deep-learning techniques. Deep Image Prior (DIP) offers a new approach that forces the recovered image to be synthesized from a given deep architecture. While DIP is quite an effective unsupervised approach, it is disfavored in real-world applications because it requires human assistance. In this work, we aim to find the best-recovered image without the assistance of humans by adding a stopping criterion, which reaches its maximum when further iterations no longer improve the image quality. More specifically, we propose to add a pseudo noise to the corrupted image and measure the pseudo-noise component in the recovered image by the orthogonality between signal and noise. The accuracy of the orthogonal stopping criterion has been demonstrated for several tested problems such as denoising, super-resolution, and inpainting, with accuracy above 95% in 38 out of 40 experiments. |
Tasks | Denoising, Super-Resolution |
Published | 2019-11-17 |
URL | https://arxiv.org/abs/1911.07185v1 |
https://arxiv.org/pdf/1911.07185v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-the-automation-of-deep-image-prior |
Repo | |
Framework | |
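The abstract above does not give the exact form of the stopping criterion, so the following is only a toy illustration of the underlying idea: if the clean signal is orthogonal to an injected pseudo noise, projecting the recovered image onto that noise estimates how much of the noise has been reproduced. The function `pseudo_noise_leakage`, the four-pixel "images", and the mixing fractions are all assumptions made for this sketch, not the paper's method.

```python
def dot(u, v):
    """Inner product of two equally sized flattened images."""
    return sum(a * b for a, b in zip(u, v))

def pseudo_noise_leakage(recovered, pseudo_noise):
    """Projection coefficient of the recovered image onto the pseudo noise.

    Close to 0: almost none of the pseudo noise was reproduced.
    Close to 1: the network has fitted the pseudo noise as well.
    """
    return dot(recovered, pseudo_noise) / dot(pseudo_noise, pseudo_noise)

# Toy data: the signal is exactly orthogonal to the pseudo noise.
signal = [1.0, 1.0, 1.0, 1.0]
pseudo_noise = [1.0, -1.0, 1.0, -1.0]

# Hypothetical reconstructions that reproduce a growing share of the noise,
# mimicking a network that overfits as iterations continue.
leakages = []
for fraction in (0.0, 0.3, 0.9):
    recovered = [s + fraction * n for s, n in zip(signal, pseudo_noise)]
    leakages.append(pseudo_noise_leakage(recovered, pseudo_noise))
print(leakages)
```

In this toy setting the coefficient recovers the mixing fraction, so a rising value could serve as a signal that further iterations are fitting noise instead of image content.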
A Crowdsourcing Framework for On-Device Federated Learning
Title | A Crowdsourcing Framework for On-Device Federated Learning |
Authors | Shashi Raj Pandey, Nguyen H. Tran, Mehdi Bennis, Yan Kyaw Tun, Aunas Manzoor, Choong Seon Hong |
Abstract | Federated learning (FL) rests on the notion of training a global model in a decentralized manner. Under this setting, mobile devices perform computations on their local data before uploading the required updates to improve the global model. However, when the participating clients implement an uncoordinated computation strategy, the difficulty is to handle the communication efficiency (i.e., the number of communications per iteration) while exchanging the model parameters during aggregation. Therefore, a key challenge in FL is how users participate to build a high-quality global model with communication efficiency. We tackle this issue by formulating a utility maximization problem, and propose a novel crowdsourcing framework to leverage FL that considers the communication efficiency during parameter exchange. First, we show an incentive-based interaction between the crowdsourcing platform and the participating clients' independent strategies for training a global learning model, where each side maximizes its own benefit. We formulate a two-stage Stackelberg game to analyze this scenario and find the game's equilibria. Second, we formalize an admission control scheme for participating clients to ensure a level of local accuracy. Simulated results demonstrate the efficacy of our proposed solution with up to 22% gain in the offered reward. |
Tasks | |
Published | 2019-11-04 |
URL | https://arxiv.org/abs/1911.01046v2 |
https://arxiv.org/pdf/1911.01046v2.pdf | |
PWC | https://paperswithcode.com/paper/a-crowdsourcing-framework-for-on-device |
Repo | |
Framework | |
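The two-stage Stackelberg structure mentioned in the abstract above can be illustrated by backward induction on a toy game. The utilities below are assumptions chosen for this sketch and are not the paper's: the platform (leader) offers a reward rate, the client (follower) chooses an effort level with quadratic cost, and the platform earns a fixed `value` per unit of effort.

```python
def follower_best_response(reward, cost=1.0):
    """Client maximizes reward*effort - cost*effort**2, so effort = reward / (2*cost)."""
    return reward / (2.0 * cost)

def leader_utility(reward, value=4.0, cost=1.0):
    """Platform utility once the client's best response is anticipated."""
    effort = follower_best_response(reward, cost)
    return (value - reward) * effort

def solve_stackelberg(value=4.0, cost=1.0, steps=4000):
    """Stage 1: grid-search the leader's reward; stage 2: plug in the follower's reply."""
    grid = [value * i / steps for i in range(steps + 1)]
    best_reward = max(grid, key=lambda r: leader_utility(r, value, cost))
    return best_reward, follower_best_response(best_reward, cost)

reward_star, effort_star = solve_stackelberg()
print(reward_star, effort_star)
```

For these assumed utilities the leader's objective is `(value - reward) * reward / (2 * cost)`, which peaks at `reward = value / 2`; the grid search is included only to show the solve-the-follower-first pattern that carries over to utilities without a closed form.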
ViSTRA2: Video Coding using Spatial Resolution and Effective Bit Depth Adaptation
Title | ViSTRA2: Video Coding using Spatial Resolution and Effective Bit Depth Adaptation |
Authors | Fan Zhang, Mariana Afonso, David R. Bull |
Abstract | We present a new video compression framework (ViSTRA2) which exploits adaptation of spatial resolution and effective bit depth, down-sampling these parameters at the encoder based on perceptual criteria, and up-sampling at the decoder using a deep convolutional neural network. ViSTRA2 has been integrated with the reference software of both HEVC (HM 16.20) and VVC (VTM 4.01), and evaluated under the Joint Video Exploration Team Common Test Conditions using the Random Access configuration. Our results show consistent and significant compression gains against HM and VTM based on Bjøntegaard Delta measurements, with average BD-rate savings of 12.6% (PSNR) and 19.5% (VMAF) over HM and 5.5% (PSNR) and 8.6% (VMAF) over VTM. |
Tasks | Video Compression |
Published | 2019-11-07 |
URL | https://arxiv.org/abs/1911.02833v1 |
https://arxiv.org/pdf/1911.02833v1.pdf | |
PWC | https://paperswithcode.com/paper/vistra2-video-coding-using-spatial-resolution |
Repo | |
Framework | |
MemeFaceGenerator: Adversarial Synthesis of Chinese Meme-face from Natural Sentences
Title | MemeFaceGenerator: Adversarial Synthesis of Chinese Meme-face from Natural Sentences |
Authors | Yifu Chen, Zongsheng Wang, Bowen Wu, Mengyuan Li, Huan Zhang, Lin Ma, Feng Liu, Qihang Feng, Baoxun Wang |
Abstract | Chinese meme-face is a special kind of internet subculture widely spread in Chinese Social Community Networks. It usually consists of a template image modified by some amusing details and a text caption. In this paper, we present MemeFaceGenerator, a Generative Adversarial Network with an attention module and template information as supplementary signals, to automatically generate meme-faces from text inputs. We also develop a web service as a system demonstration of meme-face synthesis. MemeFaceGenerator has been shown to be capable of generating high-quality meme-faces from random text inputs. |
Tasks | Face Generation |
Published | 2019-08-14 |
URL | https://arxiv.org/abs/1908.05138v1 |
https://arxiv.org/pdf/1908.05138v1.pdf | |
PWC | https://paperswithcode.com/paper/memefacegenerator-adversarial-synthesis-of |
Repo | |
Framework | |
Unsupervised Learning for Real-World Super-Resolution
Title | Unsupervised Learning for Real-World Super-Resolution |
Authors | Andreas Lugmayr, Martin Danelljan, Radu Timofte |
Abstract | Most current super-resolution methods rely on low and high resolution image pairs to train a network in a fully supervised manner. However, such image pairs are not available in real-world applications. Instead of directly addressing this problem, most works employ the popular bicubic downsampling strategy to artificially generate a corresponding low resolution image. Unfortunately, this strategy introduces significant artifacts, removing natural sensor noise and other real-world characteristics. Super-resolution networks trained on such bicubic images therefore struggle to generalize to natural images. In this work, we propose an unsupervised approach for image super-resolution. Given only unpaired data, we learn to invert the effects of bicubic downsampling in order to restore the natural image characteristics present in the data. This allows us to generate realistic image pairs, faithfully reflecting the distribution of real-world images. Our super-resolution network can therefore be trained with direct pixel-wise supervision in the high resolution domain, while robustly generalizing to real input. We demonstrate the effectiveness of our approach in quantitative and qualitative experiments. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2019-09-20 |
URL | https://arxiv.org/abs/1909.09629v1 |
https://arxiv.org/pdf/1909.09629v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-learning-for-real-world-super |
Repo | |
Framework | |
Learning a manifold from a teacher’s demonstrations
Title | Learning a manifold from a teacher’s demonstrations |
Authors | Pei Wang, Arash Givchi, Patrick Shafto |
Abstract | We consider the problem of learning a manifold. Existing approaches learn by approximating the manifold directly or the topology, but require large amounts of data to overcome challenges posed by manifolds with small reach and non-uniform sampling. We consider contexts where some data could be marked by a teacher. We consider the problem of learning from a perfectly knowledgeable teacher, providing bounds on the sample complexity for learning the manifold exactly and contrast with learning only up to topology. We then consider learning from a teacher with partial knowledge, in which a Topological Data Analysis learner’s inference integrates observations with demonstrations provided by the teacher. Examples on simulated and real data illustrate how teaching can facilitate learning the topology and geometry of the manifold. |
Tasks | Topological Data Analysis |
Published | 2019-10-10 |
URL | https://arxiv.org/abs/1910.04615v2 |
https://arxiv.org/pdf/1910.04615v2.pdf | |
PWC | https://paperswithcode.com/paper/manifold-learning-from-a-teachers |
Repo | |
Framework | |
Non-Cooperative Inverse Reinforcement Learning
Title | Non-Cooperative Inverse Reinforcement Learning |
Authors | Xiangyuan Zhang, Kaiqing Zhang, Erik Miehling, Tamer Başar |
Abstract | Making decisions in the presence of a strategic opponent requires one to take into account the opponent’s ability to actively mask its intended objective. To describe such strategic situations, we introduce the non-cooperative inverse reinforcement learning (N-CIRL) formalism. The N-CIRL formalism consists of two agents with completely misaligned objectives, where only one of the agents knows the true objective function. Formally, we model the N-CIRL formalism as a zero-sum Markov game with one-sided incomplete information. Through interacting with the more informed player, the less informed player attempts to both infer, and act according to, the true objective function. As a result of the one-sided incomplete information, the multi-stage game can be decomposed into a sequence of single-stage games expressed by a recursive formula. Solving this recursive formula yields the value of the N-CIRL game and the more informed player’s equilibrium strategy. Another recursive formula, constructed by forming an auxiliary game, termed the dual game, yields the less informed player’s strategy. Building upon these two recursive formulas, we develop a computationally tractable algorithm to approximately solve for the equilibrium strategies. Finally, we demonstrate the benefits of our N-CIRL formalism over the existing multi-agent IRL formalism via extensive numerical simulation in a novel cyber security setting. |
Tasks | |
Published | 2019-11-03 |
URL | https://arxiv.org/abs/1911.04220v2 |
https://arxiv.org/pdf/1911.04220v2.pdf | |
PWC | https://paperswithcode.com/paper/non-cooperative-inverse-reinforcement-1 |
Repo | |
Framework | |
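The abstract above decomposes the multi-stage game into single-stage zero-sum games. As a minimal illustration of what solving one such stage involves (not the paper's recursive formula, and ignoring incomplete information), a 2x2 zero-sum matrix game can be solved in closed form. The function name and payoff matrices are assumptions made for this example.

```python
def solve_2x2_zero_sum(payoff):
    """Return (value, p), where p is the row player's probability of playing row 0.

    payoff[i][j] is what the column player pays the row player
    when the row player picks i and the column player picks j.
    """
    (a, b), (c, d) = payoff
    maximin = max(min(a, b), min(c, d))  # best guaranteed payoff for the row player
    minimax = min(max(a, c), max(b, d))  # best guaranteed cap for the column player
    if maximin == minimax:
        # Saddle point: a pure strategy is already optimal.
        p = 1.0 if min(a, b) >= min(c, d) else 0.0
        return float(maximin), p
    # No saddle point: mix so that the column player is indifferent.
    denom = a - b - c + d
    p = (d - c) / denom
    value = (a * d - b * c) / denom
    return value, p

# Matching pennies: no pure equilibrium, so the row player mixes evenly.
value, p_row0 = solve_2x2_zero_sum([[1, -1], [-1, 1]])
# A game with a saddle point at row 0, column 1.
saddle_value, saddle_p = solve_2x2_zero_sum([[3, 1], [2, 0]])
print(value, p_row0, saddle_value, saddle_p)
```

Larger stage games generally need linear programming; the closed form here is specific to the 2x2 case.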
Cardiac MRI Image Segmentation for Left Ventricle and Right Ventricle using Deep Learning
Title | Cardiac MRI Image Segmentation for Left Ventricle and Right Ventricle using Deep Learning |
Authors | Bosung Seo, Daniel Mariano, John Beckfield, Vinay Madenur, Yuming Hu, Tony Reina, Marcus Bobar, Mai H. Nguyen, Ilkay Altintas |
Abstract | The goal of this project is to use magnetic resonance imaging (MRI) data to provide an end-to-end analytics pipeline for left and right ventricle (LV and RV) segmentation. Another aim of the project is to find a model that would be generalizable across medical imaging datasets. We utilized a variety of models, datasets, and tests to determine which one is well suited to this purpose. Specifically, we implemented three models (2-D U-Net, 3-D U-Net, and DenseNet), and evaluated them on four datasets (Automated Cardiac Diagnosis Challenge, MICCAI 2009 LV, Sunnybrook Cardiac Data, MICCAI 2012 RV). While maintaining a consistent preprocessing strategy, we tested the performance of each model when trained on data from the same dataset as the test data, and when trained on data from a different dataset than the test dataset. Data augmentation was also used to increase the adaptability of the models. The results were compared to determine performance and generalizability. |
Tasks | Data Augmentation, Semantic Segmentation |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.08028v1 |
https://arxiv.org/pdf/1909.08028v1.pdf | |
PWC | https://paperswithcode.com/paper/cardiac-mri-image-segmentation-for-left |
Repo | |
Framework | |
Joint Learning of Word and Label Embeddings for Sequence Labelling in Spoken Language Understanding
Title | Joint Learning of Word and Label Embeddings for Sequence Labelling in Spoken Language Understanding |
Authors | Jiewen Wu, Luis Fernando D’Haro, Nancy F. Chen, Pavitra Krishnaswamy, Rafael E. Banchs |
Abstract | We propose an architecture to jointly learn word and label embeddings for slot filling in spoken language understanding. The proposed approach encodes labels using a combination of word embeddings and straightforward word-label association from the training data. Compared to state-of-the-art methods, our approach does not require label embeddings as part of the input and therefore lends itself nicely to a wide range of model architectures. In addition, our architecture computes contextual distances between words and labels to avoid adding contextual windows, thus reducing the memory footprint. We validate the approach on established spoken dialogue datasets and show that it can achieve state-of-the-art performance with far fewer trainable parameters. |
Tasks | Slot Filling, Spoken Language Understanding, Word Embeddings |
Published | 2019-10-16 |
URL | https://arxiv.org/abs/1910.07150v1 |
https://arxiv.org/pdf/1910.07150v1.pdf | |
PWC | https://paperswithcode.com/paper/joint-learning-of-word-and-label-embeddings |
Repo | |
Framework | |
Detecting Clues for Skill Levels and Machine Operation Difficulty from Egocentric Vision
Title | Detecting Clues for Skill Levels and Machine Operation Difficulty from Egocentric Vision |
Authors | Longfei Chen, Yuichi Nakamura, Kazuaki Kondo |
Abstract | In machine operation tasks, the experiences of operators at different skill levels, especially novices, can provide valuable insight into how they perceive the operational environment and formulate knowledge to deal with various operation situations. In this study, we describe the operator's behaviors by utilizing the relations among their head, hand, and operation location (hotspot) during the operation. A total of 40 experiences associated with a sewing machine operation task performed by amateur operators were recorded via a head-mounted RGB-D camera. We examined important features of operational behaviors in operators at different skill levels and confirmed their correlation with the difficulty of the operation steps. The results show that pure-gazing behavior is significantly reduced as the operator's skill improves. Moreover, the hand-approaching duration and the frequency of attention movement before operation are strongly correlated with the operational difficulty in such machine operating environments. |
Tasks | |
Published | 2019-06-10 |
URL | https://arxiv.org/abs/1906.04002v1 |
https://arxiv.org/pdf/1906.04002v1.pdf | |
PWC | https://paperswithcode.com/paper/detecting-clues-for-skill-levels-and-machine |
Repo | |
Framework | |