April 2, 2020

3332 words 16 mins read

Paper Group ANR 356

Extending iLQR method with control delay. The Conditional Entropy Bottleneck. Large-Scale Educational Question Analysis with Partial Variational Auto-encoders. Prediction of Discharge Capacity of Labyrinth Weir with Gene Expression Programming. Convergence of Artificial Intelligence and High Performance Computing on NSF-supported Cyberinfrastructur …

Extending iLQR method with control delay

Title Extending iLQR method with control delay
Authors Cheng Ju, Yan Qin, Chunjiang Fu
Abstract The iterative linear quadratic regulator (iLQR) has become a benchmark method for nonlinear stochastic optimal control problems. However, it does not apply to systems with input delay. In this paper, we extend iLQR theory and prove a new theorem for the case of an input signal with a fixed delay, which could benefit machine learning and optimal control applications such as real-time robots and human-assistive devices.
Tasks
Published 2020-02-16
URL https://arxiv.org/abs/2002.07630v1
PDF https://arxiv.org/pdf/2002.07630v1.pdf
PWC https://paperswithcode.com/paper/extending-ilqr-method-with-control-delay
Repo
Framework
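
A quick way to see how a fixed input delay can be folded into iLQR is the standard state-augmentation trick: stack the pending controls into the state so an ordinary iLQR solver applies to the augmented system. The sketch below is a minimal illustration of that idea under assumed dynamics, not the paper's derivation; `make_delayed_dynamics` and the double-integrator example are hypothetical.

```python
# Minimal sketch: handle a fixed d-step input delay by augmenting the
# state with the d pending controls, then run standard iLQR on the
# augmented dynamics. Illustrative only; not the paper's formulation.
import numpy as np

def make_delayed_dynamics(f, d, m):
    """Wrap dynamics x' = f(x, u) so that u takes effect d steps late.

    Augmented state: [x, u_{t-d}, ..., u_{t-1}]. The new input u_t enters
    the buffer while the oldest buffered control drives the plant.
    """
    def f_aug(z, u):
        x, buf = z[:-d * m], z[-d * m:].reshape(d, m)
        x_next = f(x, buf[0])               # oldest commanded control acts now
        buf_next = np.vstack([buf[1:], u])  # shift the delay line, append u_t
        return np.concatenate([x_next, buf_next.ravel()])
    return f_aug

# Example: scalar double integrator with a 3-step actuation delay.
dt, d, m = 0.1, 3, 1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
f = lambda x, u: A @ x + B @ u
f_aug = make_delayed_dynamics(f, d, m)

z = np.zeros(2 + d * m)            # [x; empty control buffer]
z = f_aug(z, np.array([1.0]))      # command u=1; it acts 3 steps later
print(z)
```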

The Conditional Entropy Bottleneck

Title The Conditional Entropy Bottleneck
Authors Ian Fischer
Abstract Much of the field of Machine Learning exhibits a prominent set of failure modes, including vulnerability to adversarial examples, poor out-of-distribution (OoD) detection, miscalibration, and willingness to memorize random labelings of datasets. We characterize these as failures of robust generalization, which extends the traditional measure of generalization as accuracy or related metrics on a held-out set. We hypothesize that these failures to robustly generalize are due to the learning systems retaining too much information about the training data. To test this hypothesis, we propose the Minimum Necessary Information (MNI) criterion for evaluating the quality of a model. In order to train models that perform well with respect to the MNI criterion, we present a new objective function, the Conditional Entropy Bottleneck (CEB), which is closely related to the Information Bottleneck (IB). We experimentally test our hypothesis by comparing the performance of CEB models with deterministic models and Variational Information Bottleneck (VIB) models on a variety of different datasets and robustness challenges. We find strong empirical evidence supporting our hypothesis that MNI models improve on these problems of robust generalization.
Tasks
Published 2020-02-13
URL https://arxiv.org/abs/2002.05379v1
PDF https://arxiv.org/pdf/2002.05379v1.pdf
PWC https://paperswithcode.com/paper/the-conditional-entropy-bottleneck-1
Repo
Framework
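
The abstract names the three ingredients of a variational CEB model: a forward encoder e(z|x), a backward encoder b(z|y), and a classifier c(y|z). The PyTorch sketch below wires those pieces into a loss under a unit-variance Gaussian simplification; the module shapes and the `gamma` weighting are assumptions for illustration, not the paper's exact implementation.

```python
# Minimal variational CEB sketch: gamma * E[log e(z|x) - log b(z|y)]
# (an upper bound on the residual information I(X;Z|Y)) plus a
# cross-entropy term bounding -I(Y;Z) up to a constant.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.distributions import Normal

class CEB(nn.Module):
    def __init__(self, x_dim=784, y_classes=10, z_dim=8, gamma=1.0):
        super().__init__()
        self.enc = nn.Linear(x_dim, z_dim)          # mean of e(z|x), unit variance
        self.back = nn.Embedding(y_classes, z_dim)  # mean of b(z|y), unit variance
        self.cls = nn.Linear(z_dim, y_classes)      # logits of c(y|z)
        self.gamma = gamma

    def forward(self, x, y):
        e = Normal(self.enc(x), 1.0)
        z = e.rsample()                             # reparameterized sample
        b = Normal(self.back(y), 1.0)
        rex = (e.log_prob(z) - b.log_prob(z)).sum(-1).mean()
        ce = F.cross_entropy(self.cls(z), y)
        return self.gamma * rex + ce

model = CEB()
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
loss = model(x, y)
loss.backward()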

Large-Scale Educational Question Analysis with Partial Variational Auto-encoders

Title Large-Scale Educational Question Analysis with Partial Variational Auto-encoders
Authors Zichao Wang, Sebastian Tschiatschek, Simon Woodhead, Jose Miguel Hernandez-Lobato, Simon Peyton Jones, Cheng Zhang
Abstract Online education platforms enable teachers to share a large number of educational resources such as questions to form exercises and quizzes for students. With large volumes of such crowd-sourced questions, quantifying the properties of these questions in crowd-sourced online education platforms is of great importance to enable both teachers and students to find high-quality and suitable resources. In this work, we propose a framework for large-scale question analysis. We utilize state-of-the-art Bayesian deep learning methods, in particular partial variational auto-encoders, to analyze real-world educational data. We also develop novel objectives to quantify question quality and difficulty. We apply our proposed framework to a real-world cohort with millions of question-answer pairs from an online education platform. Our framework not only demonstrates promising results in terms of statistical metrics but also obtains highly consistent results with domain expert evaluation.
Tasks
Published 2020-03-12
URL https://arxiv.org/abs/2003.05980v1
PDF https://arxiv.org/pdf/2003.05980v1.pdf
PWC https://paperswithcode.com/paper/large-scale-educational-question-analysis
Repo
Framework
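
The defining feature of a partial VAE is an encoder that accepts an arbitrary subset of observed features (here, answered questions). A common realization is a PointNet-style set encoder: embed each observed (feature id, value) pair independently and aggregate with a permutation-invariant sum. The sketch below follows that recipe; layer sizes and names are illustrative assumptions, not the paper's architecture.

```python
# Minimal PointNet-style set encoder for partially observed inputs.
import torch
import torch.nn as nn

class PartialEncoder(nn.Module):
    def __init__(self, n_features, emb_dim=16, z_dim=8):
        super().__init__()
        self.feat_emb = nn.Embedding(n_features, emb_dim)
        self.value_map = nn.Linear(1, emb_dim)
        self.head = nn.Linear(emb_dim, 2 * z_dim)  # mean and log-variance

    def forward(self, values, mask):
        # values: (batch, n_features) responses; mask: 1 where observed.
        ids = torch.arange(values.size(1), device=values.device)
        h = self.feat_emb(ids) + self.value_map(values.unsqueeze(-1))
        h = (h * mask.unsqueeze(-1)).sum(dim=1)    # sum over observed entries only
        mu, logvar = self.head(torch.relu(h)).chunk(2, dim=-1)
        return mu, logvar

enc = PartialEncoder(n_features=100)
values = torch.rand(4, 100)                        # e.g. correctness of answers
mask = (torch.rand(4, 100) < 0.3).float()          # ~30% of questions answered
mu, logvar = enc(values, mask)
```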

Prediction of Discharge Capacity of Labyrinth Weir with Gene Expression Programming

Title Prediction of Discharge Capacity of Labyrinth Weir with Gene Expression Programming
Authors Hossein Bonakdari, Isa Ebtehaj, Bahram Gharabaghi, Ali Sharifi, Amir Mosavi
Abstract This paper proposes a model based on gene expression programming (GEP) for predicting the discharge coefficient of triangular labyrinth weirs. The parameters influencing discharge coefficient prediction were first examined and expressed as dimensionless parameters: the ratio of crest height to the head over the weir crest, the ratio of crest water length to channel width, the ratio of crest water length to the head over the weir crest, the Froude number, and the vertex angle. Different models were then presented using sensitivity analysis in order to examine each of these dimensionless parameters. In addition, an equation was derived through nonlinear regression (NLR) for comparison with GEP. Evaluation with different statistical indexes indicated that GEP is more capable than NLR: GEP predicts the discharge coefficient with an average relative error of approximately 2.5%, and the predicted values have less than 5% relative error even in the worst model.
Tasks
Published 2020-01-16
URL https://arxiv.org/abs/2002.02751v1
PDF https://arxiv.org/pdf/2002.02751v1.pdf
PWC https://paperswithcode.com/paper/prediction-of-discharge-capacity-of-labyrinth
Repo
Framework
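
Since the headline numbers are relative-error statistics over a GEP-evolved closed-form expression, the evaluation loop is easy to sketch. Both the candidate expression and the synthetic data below are purely illustrative stand-ins, not the model or measurements reported in the paper.

```python
# Score a hypothetical GEP-style expression over the five dimensionless
# inputs with the relative-error statistics quoted in the abstract.
import numpy as np

def candidate_cd(P_h, Lc_W, Lc_h, Fr, theta):
    """Hypothetical evolved expression; illustrative only."""
    return 0.6 + 0.1 * np.tanh(P_h) - 0.05 * Fr + 0.02 * np.log1p(Lc_W) \
        + 0.01 * np.sqrt(Lc_h) * np.sin(theta)

rng = np.random.default_rng(0)
X = rng.uniform(0.1, 2.0, size=(200, 5))           # synthetic dimensionless inputs
cd_obs = 0.62 + 0.05 * rng.standard_normal(200)    # synthetic observed coefficients
cd_pred = candidate_cd(*X.T)

rel_err = np.abs(cd_pred - cd_obs) / np.abs(cd_obs)
print(f"mean relative error: {rel_err.mean():.2%}")
print(f"max  relative error: {rel_err.max():.2%}")
```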

Convergence of Artificial Intelligence and High Performance Computing on NSF-supported Cyberinfrastructure

Title Convergence of Artificial Intelligence and High Performance Computing on NSF-supported Cyberinfrastructure
Authors E. A. Huerta, Asad Khan, Edward Davis, Colleen Bushell, William D. Gropp, Daniel S. Katz, Volodymyr Kindratenko, Seid Koric, William T. C. Kramer, Brendan McGinty, Kenton McHenry, Aaron Saxton
Abstract Significant investments to upgrade or construct large-scale scientific facilities demand commensurate investments in R&D to design algorithms and computing approaches that enable scientific and engineering breakthroughs in the big data era. The remarkable success of Artificial Intelligence (AI) algorithms in turning big-data challenges in industry and technology into transformational digital solutions, which drive a multi-billion dollar industry and play an ever-increasing role in shaping human social patterns, has promoted AI as the most sought-after signal processing tool in big-data research. As AI continues to evolve into a computing tool endowed with statistical and mathematical rigor, one that encodes domain expertise to inform and inspire AI architectures and optimization algorithms, it has become apparent that single-GPU solutions for training, validation, and testing are no longer sufficient. This realization has been driving the confluence of AI and high performance computing (HPC) to reduce time-to-insight and to produce robust, reliable, trustworthy, and computationally efficient AI solutions. In this white paper, we present a summary of recent developments in this field and discuss avenues to accelerate and streamline the use of HPC platforms to design accelerated AI algorithms.
Tasks
Published 2020-03-18
URL https://arxiv.org/abs/2003.08394v1
PDF https://arxiv.org/pdf/2003.08394v1.pdf
PWC https://paperswithcode.com/paper/convergence-of-artificial-intelligence-and
Repo
Framework

DeepEMD: Few-Shot Image Classification with Differentiable Earth Mover’s Distance and Structured Classifiers

Title DeepEMD: Few-Shot Image Classification with Differentiable Earth Mover’s Distance and Structured Classifiers
Authors Chi Zhang, Yujun Cai, Guosheng Lin, Chunhua Shen
Abstract In this paper, we address the few-shot classification task from a new perspective of optimal matching between image regions. We adopt the Earth Mover’s Distance (EMD) as a metric to compute a structural distance between dense image representations to determine image relevance. The EMD generates the optimal matching flows between structural elements that have the minimum matching cost, which is used to represent the image distance for classification. To generate the important weights of elements in the EMD formulation, we design a cross-reference mechanism, which can effectively minimize the impact caused by the cluttered background and large intra-class appearance variations. To handle k-shot classification, we propose to learn a structured fully connected layer that can directly classify dense image representations with the EMD. Based on the implicit function theorem, the EMD can be inserted as a layer into the network for end-to-end training. We conduct comprehensive experiments to validate our algorithm and we set new state-of-the-art performance on four popular few-shot classification benchmarks, namely miniImageNet, tieredImageNet, Fewshot-CIFAR100 (FC100) and Caltech-UCSD Birds-200-2011 (CUB).
Tasks Few-Shot Image Classification, Image Classification
Published 2020-03-15
URL https://arxiv.org/abs/2003.06777v2
PDF https://arxiv.org/pdf/2003.06777v2.pdf
PWC https://paperswithcode.com/paper/deepemd-few-shot-image-classification-with
Repo
Framework
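
The optimal-matching formulation at the heart of DeepEMD is a transportation problem, which can be solved directly with a linear-programming solver. The sketch below illustrates that formulation with a simple cross-reference weighting (each region weighted by similarity to the other image's mean feature); the paper's differentiable, end-to-end version additionally relies on the implicit function theorem, which this sketch does not implement.

```python
# EMD-style distance between two images' region embeddings via scipy.
import numpy as np
from scipy.optimize import linprog

def emd_distance(U, V):
    """U: (n, d) and V: (m, d) L2-normalized region embeddings."""
    n, m = len(U), len(V)
    cost = 1.0 - U @ V.T                      # cosine cost between regions
    # Cross-reference weights, normalized to sum to one on each side.
    wu = np.clip(U @ V.mean(0), 1e-6, None); wu /= wu.sum()
    wv = np.clip(V @ U.mean(0), 1e-6, None); wv /= wv.sum()
    # Equality constraints: row sums = wu, column sums = wv.
    A_eq = np.zeros((n + m, n * m))
    for i in range(n): A_eq[i, i * m:(i + 1) * m] = 1
    for j in range(m): A_eq[n + j, j::m] = 1
    res = linprog(cost.ravel(), A_eq=A_eq, b_eq=np.concatenate([wu, wv]),
                  bounds=(0, None), method="highs")
    return res.fun

rng = np.random.default_rng(0)
U = rng.normal(size=(5, 64)); U /= np.linalg.norm(U, axis=1, keepdims=True)
V = rng.normal(size=(5, 64)); V /= np.linalg.norm(V, axis=1, keepdims=True)
print(emd_distance(U, V))
```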

Colored Noise Injection for Training Adversarially Robust Neural Networks

Title Colored Noise Injection for Training Adversarially Robust Neural Networks
Authors Evgenii Zheltonozhskii, Chaim Baskin, Yaniv Nemcovsky, Brian Chmiel, Avi Mendelson, Alex M. Bronstein
Abstract Even though deep learning has shown unmatched performance on various tasks, neural networks have been shown to be vulnerable to small adversarial perturbations of the input that lead to significant performance degradation. In this work we extend the idea of adding white Gaussian noise to the network weights and activations during adversarial training (PNI) to the injection of colored noise for defense against common white-box and black-box attacks. We show that our approach outperforms PNI and various previous approaches in terms of adversarial accuracy on CIFAR-10 and CIFAR-100 datasets. In addition, we provide an extensive ablation study of the proposed method justifying the chosen configurations.
Tasks
Published 2020-03-04
URL https://arxiv.org/abs/2003.02188v2
PDF https://arxiv.org/pdf/2003.02188v2.pdf
PWC https://paperswithcode.com/paper/colored-noise-injection-for-training
Repo
Framework
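
The step from PNI to this method is the step from white to colored weight noise: pass white Gaussian noise through a learnable mixing matrix so perturbations become correlated across weights. The per-input mixing in the sketch below is an illustrative simplification, not the paper's exact parameterization.

```python
# PyTorch sketch: inject colored (correlated) noise into a linear
# layer's weights during adversarial training.
import torch
import torch.nn as nn

class ColoredNoiseLinear(nn.Module):
    def __init__(self, in_f, out_f):
        super().__init__()
        self.lin = nn.Linear(in_f, out_f)
        # Lower-triangular factor L: noise covariance is L @ L.T over inputs.
        self.L = nn.Parameter(0.01 * torch.eye(in_f))

    def forward(self, x):
        if self.training:
            eps = torch.randn_like(self.lin.weight)        # white noise
            colored = eps @ torch.tril(self.L).T           # correlate across inputs
            return nn.functional.linear(x, self.lin.weight + colored,
                                        self.lin.bias)
        return self.lin(x)

layer = ColoredNoiseLinear(32, 10)
out = layer(torch.randn(4, 32))
```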

Towards a Collaborative Approach to Decision Making Based on Ontology and Multi-Agent System Application to crisis management

Title Towards a Collaborative Approach to Decision Making Based on Ontology and Multi-Agent System Application to crisis management
Authors Ahmed Maalel, Henda Ben Ghézala
Abstract The coordination and cooperation of all the stakeholders involved is decisive for the control and resolution of problems. In security-threatening events, resolution should refer to a plan that defines a general framework of procedures to be undertaken and instructions to be complied with; in addition, a more precise process must be defined by the actors to deal with the particular problem of the current situation. Indeed, this process has to cope with a dynamic, unstable, and unpredictable environment, due to the heterogeneity and multiplicity of stakeholders and to their possible geographical distribution. In this article, we present the first validation steps of a collaborative decision-making approach for crisis situations such as road accidents. This approach is based on ontologies and multi-agent systems.
Tasks Decision Making
Published 2020-03-16
URL https://arxiv.org/abs/2003.07096v1
PDF https://arxiv.org/pdf/2003.07096v1.pdf
PWC https://paperswithcode.com/paper/towards-a-collaborative-approach-to-decision
Repo
Framework

Predictive Analysis for Detection of Human Neck Postures using a robust integration of kinetics and kinematics

Title Predictive Analysis for Detection of Human Neck Postures using a robust integration of kinetics and kinematics
Authors Korupalli V Rajesh Kumar, Susan Elias
Abstract Human neck postures and movements need to be monitored, measured, quantified and analyzed, as a preventive measure in healthcare applications. Improper neck postures are an increasing source of neck musculoskeletal disorders, requiring therapy and rehabilitation. The motivation for the research presented in this paper was the need to develop a notification mechanism for improper neck usage. Kinematic data captured by sensors have limitations in accurately classifying the neck postures. Hence, we propose an integrated use of kinematic and kinetic data to efficiently classify neck postures. Using machine learning algorithms we obtained 100% accuracy in the predictive analysis of this data. The research analysis and discussions show that the kinetic data of the Hyoid muscles can accurately detect the neck posture given the corresponding kinematic data captured by the neck-band. The proposed robust platform for the integration of kinematic and kinetic data has enabled the design of a smart neck-band for the prevention of neck musculoskeletal disorders.
Tasks
Published 2020-03-12
URL https://arxiv.org/abs/2003.06311v1
PDF https://arxiv.org/pdf/2003.06311v1.pdf
PWC https://paperswithcode.com/paper/predictive-analysis-for-detection-of-human
Repo
Framework
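
The integration step described in the abstract, combining kinematic data from the neck-band with kinetic data from the Hyoid muscles, amounts to fusing two feature groups into one classifier input. The scikit-learn sketch below shows that fusion on synthetic data; the feature dimensions and the random-forest choice are assumptions for illustration, not the paper's pipeline.

```python
# Fuse kinematic and kinetic feature groups and train one classifier.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 500
kinematic = rng.normal(size=(n, 6))      # e.g. IMU roll/pitch/yaw + rates
kinetic = rng.normal(size=(n, 4))        # e.g. muscle-signal band powers
X = np.hstack([kinematic, kinetic])      # integrated feature vector
y = rng.integers(0, 5, size=n)           # five posture classes (synthetic)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())   # random labels -> chance level
```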

Annotation-free Learning of Deep Representations for Word Spotting using Synthetic Data and Self Labeling

Title Annotation-free Learning of Deep Representations for Word Spotting using Synthetic Data and Self Labeling
Authors Fabian Wolf, Gernot A. Fink
Abstract Word spotting is a popular tool for supporting the first exploration of historic, handwritten document collections. Today, the best performing methods rely on machine learning techniques, which require a large amount of annotated training material. As training data is usually not available in the application scenario, annotation-free methods aim at solving the retrieval task without representative training samples. In this work, we present an annotation-free method that still employs machine learning techniques and therefore outperforms other learning-free approaches. The weakly supervised training scheme relies on a lexicon that does not need to precisely fit the dataset. In combination with a confidence-based selection of pseudo-labeled training samples, we achieve state-of-the-art query-by-example performance. Furthermore, our method makes it possible to perform query-by-string, which is usually not the case for other annotation-free methods.
Tasks
Published 2020-03-04
URL https://arxiv.org/abs/2003.01989v1
PDF https://arxiv.org/pdf/2003.01989v1.pdf
PWC https://paperswithcode.com/paper/annotation-free-learning-of-deep
Repo
Framework
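
The confidence-based selection of pseudo-labeled samples is the easiest part of this pipeline to make concrete: a recognizer trained on synthetic data labels real word images, and only confident predictions are kept for the next training round. The `predict` interface and threshold below are illustrative assumptions.

```python
# Minimal confidence-based pseudo-label selection.
import numpy as np

def select_pseudo_labels(images, predict, threshold=0.9):
    """predict(img) -> (transcription, confidence in [0, 1])."""
    kept = []
    for img in images:
        word, conf = predict(img)
        if conf >= threshold:            # keep only confident predictions
            kept.append((img, word))
    return kept

# Toy stand-in for a recognizer trained on synthetic data.
rng = np.random.default_rng(0)
fake_predict = lambda img: ("word", rng.uniform())
pseudo_set = select_pseudo_labels(range(1000), fake_predict)
print(f"kept {len(pseudo_set)} of 1000 samples")   # ~10% at threshold 0.9
```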

Unsupervised Learning Methods for Visual Place Recognition in Discretely and Continuously Changing Environments

Title Unsupervised Learning Methods for Visual Place Recognition in Discretely and Continuously Changing Environments
Authors Stefan Schubert, Peer Neubert, Peter Protzel
Abstract Visual place recognition in changing environments is the problem of finding matches between two sets of observations, a query set and a reference set, despite severe appearance changes. Recently, image comparison using CNN-based descriptors showed very promising results. However, existing experiments in the literature typically assume a single distinctive condition within each set (e.g., reference: day, query: night). We demonstrate that as soon as the conditions change within one set (e.g., reference: day, query: traversal daytime-dusk-night-dawn), different places under the same condition can suddenly look more similar than the same place under different conditions, and state-of-the-art approaches like CNN-based descriptors fail. This paper discusses this practically very important problem of in-sequence condition changes and defines a hierarchy of problem setups from (1) no in-sequence changes, (2) discrete in-sequence changes, to (3) continuous in-sequence changes. We experimentally evaluate the effect of these changes on two state-of-the-art CNN descriptors. Our experiments emphasize the importance of statistical standardization of descriptors and show its limitations in the case of continuous changes. To address this practically most relevant setup, we investigate and experimentally evaluate the application of unsupervised learning methods using two available PCA-based approaches, and propose a novel clustering-based extension of the statistical normalization.
Tasks Visual Place Recognition
Published 2020-01-24
URL https://arxiv.org/abs/2001.08960v1
PDF https://arxiv.org/pdf/2001.08960v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-learning-methods-for-visual
Repo
Framework
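
The statistical standardization the experiments highlight is simply z-scoring each descriptor dimension within a set; the clustering-based extension standardizes within condition clusters instead of set-wide. The k-means variant below is a sketch of that idea under assumed cluster counts, not the paper's exact method.

```python
# Set-wide vs. per-cluster descriptor standardization for place recognition.
import numpy as np
from sklearn.cluster import KMeans

def standardize(D):
    return (D - D.mean(0)) / (D.std(0) + 1e-8)

def standardize_per_cluster(D, k=3):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(D)
    out = np.empty_like(D)
    for c in range(k):                   # z-score within each condition cluster
        out[labels == c] = standardize(D[labels == c])
    return out

rng = np.random.default_rng(0)
ref = standardize_per_cluster(rng.normal(size=(100, 128)))
qry = standardize_per_cluster(rng.normal(size=(80, 128)))
S = qry @ ref.T                          # similarity matrix for place matching
```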

Multi-modal Dense Video Captioning

Title Multi-modal Dense Video Captioning
Authors Vladimir Iashin, Esa Rahtu
Abstract Dense video captioning is the task of localizing interesting events in an untrimmed video and producing a textual description (caption) for each localized event. Most previous works in dense video captioning are based solely on visual information and completely ignore the audio track. However, audio, and speech in particular, are vital cues for a human observer in understanding an environment. In this paper, we present a new dense video captioning approach that is able to utilize any number of modalities for event description. Specifically, we show how the audio and speech modalities can improve a dense video captioning model. We apply an automatic speech recognition (ASR) system to obtain a temporally aligned textual description of the speech (similar to subtitles) and treat it as a separate input alongside the video frames and the corresponding audio track. We formulate the captioning task as a machine translation problem and utilize the recently proposed Transformer architecture to convert multi-modal input data into textual descriptions. We demonstrate the performance of our model on the ActivityNet Captions dataset. The ablation studies indicate a considerable contribution from the audio and speech components, suggesting that these modalities contain substantial complementary information to video frames. Furthermore, we provide an in-depth analysis of the ActivityNet Captions results by leveraging the category tags obtained from the original YouTube videos. The program code of our method and evaluations will be made publicly available.
Tasks Dense Video Captioning, Machine Translation, Speech Recognition, Video Captioning
Published 2020-03-17
URL https://arxiv.org/abs/2003.07758v1
PDF https://arxiv.org/pdf/2003.07758v1.pdf
PWC https://paperswithcode.com/paper/multi-modal-dense-video-captioning
Repo
Framework
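
One way to picture the multi-modal setup is per-modality encoders whose outputs a caption decoder attends to jointly. The PyTorch sketch below uses simple concatenation fusion; the dimensions and wiring are assumptions for illustration, not the paper's architecture.

```python
# Encode video, audio, and ASR-text features separately; decode captions
# against the concatenated encoder outputs.
import torch
import torch.nn as nn

d = 128
enc = lambda: nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d, 4, batch_first=True), 2)
video_enc, audio_enc, text_enc = enc(), enc(), enc()
decoder = nn.TransformerDecoder(
    nn.TransformerDecoderLayer(d, 4, batch_first=True), 2)

video = torch.randn(1, 50, d)    # frame features
audio = torch.randn(1, 50, d)    # audio features
speech = torch.randn(1, 20, d)   # embedded ASR tokens (subtitle-like input)

memory = torch.cat([video_enc(video), audio_enc(audio), text_enc(speech)], dim=1)
caption_in = torch.randn(1, 12, d)   # embedded caption prefix
out = decoder(caption_in, memory)    # (1, 12, d) -> vocabulary head would follow
```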

Robust Self-Supervised Learning of Deterministic Errors in Single-Plane (Monoplanar) and Dual-Plane (Biplanar) X-ray Fluoroscopy

Title Robust Self-Supervised Learning of Deterministic Errors in Single-Plane (Monoplanar) and Dual-Plane (Biplanar) X-ray Fluoroscopy
Authors Jacky C. K. Chow, Steven K. Boyd, Derek D. Lichti, Janet L. Ronsky
Abstract Fluoroscopic imaging that captures X-ray images at video framerates is advantageous for guiding catheter insertions by vascular surgeons and interventional radiologists. Visualizing the dynamical movements non-invasively allows complex surgical procedures to be performed with less trauma to the patient. To improve surgical precision, endovascular procedures can benefit from more accurate fluoroscopy data via calibration. This paper presents a robust self-calibration algorithm suitable for single-plane and dual-plane fluoroscopy. A three-dimensional (3D) target field was imaged by the fluoroscope in a strong geometric network configuration. The unknown 3D positions of targets and the fluoroscope pose were estimated simultaneously by maximizing the likelihood of the Student-t probability distribution function. A smoothed k-nearest neighbour (kNN) regression is then used to model the deterministic component of the image reprojection error of the robust bundle adjustment. The Maximum Likelihood Estimation step and the kNN regression step are then repeated iteratively until convergence. Four different error modeling schemes were compared while varying the quantity of training images. It was found that using a smoothed kNN regression can automatically model the systematic errors in fluoroscopy with similar accuracy as a human expert using a small training dataset. When all training images were used, the 3D mapping error was reduced from 0.61-0.83 mm to 0.04 mm post-calibration (94.2-95.7% improvement), and the 2D reprojection error was reduced from 1.17-1.31 to 0.20-0.21 pixels (83.2-83.8% improvement). When using biplanar fluoroscopy, the 3D measurement accuracy of the system improved from 0.60 mm to 0.32 mm (47.2% improvement).
Tasks Calibration
Published 2020-01-03
URL https://arxiv.org/abs/2001.00686v1
PDF https://arxiv.org/pdf/2001.00686v1.pdf
PWC https://paperswithcode.com/paper/robust-self-supervised-learning-of
Repo
Framework
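
The error-modeling step is a regression problem: after the robust bundle adjustment, a smoothed kNN regressor maps image coordinates to the deterministic part of the reprojection error, which is then subtracted from new measurements. The synthetic distortion field and neighbor count below are assumptions; only the kNN-on-residuals structure follows the abstract.

```python
# Fit a smoothed kNN model of the systematic reprojection error.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
uv = rng.uniform(0, 1000, size=(500, 2))              # image points (pixels)
systematic = 0.003 * (uv - 500)                       # synthetic deterministic error
residuals = systematic + 0.2 * rng.standard_normal((500, 2))  # + random noise

# 'distance' weighting gives a smoothed kNN estimate of the error field.
knn = KNeighborsRegressor(n_neighbors=25, weights="distance").fit(uv, residuals)

corrected = residuals - knn.predict(uv)               # remove modeled systematic part
print(np.abs(residuals).mean(), np.abs(corrected).mean())
```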

Optimal DG allocation and sizing in power system networks using swarm-based algorithms

Title Optimal DG allocation and sizing in power system networks using swarm-based algorithms
Authors Kayode Adetunji, Ivan Hofsajer, Ling Cheng
Abstract Distributed generation (DG) units are power generating plants that are very important to the architecture of present power system networks. The benefit of adding these DG units is to increase the power supply to a network. However, the installation of DG units can have an adverse effect if they are not properly allocated and/or sized. Therefore, there is a need to optimally allocate and size them to avoid problems such as voltage instability and expensive investment costs. In this paper, two swarm-based meta-heuristic algorithms, particle swarm optimization (PSO) and the whale optimization algorithm (WOA), were developed to solve the optimal placement and sizing of DG units for transmission network planning. A supportive technique, loss sensitivity factors (LSF), was used to identify potential buses for the optimal location of DG units. The feasibility of the algorithms was confirmed on two IEEE bus test systems (14- and 30-bus). Comparison results showed that both algorithms produce good solutions and that they outperform each other on different metrics. The WOA real power loss reductions considering techno-economic factors in the IEEE 14-bus and 30-bus test systems are 6.14 MW and 10.77 MW, compared to PSO's 6.47 MW and 11.73 MW respectively. PSO yields a smaller total DG unit size in both bus systems, with 133.45 MW and 82.44 MW compared to WOA's 152.21 MW and 82.44 MW respectively. The paper reveals the strengths and weaknesses of PSO and WOA in the application of optimal sizing of DG units in transmission networks.
Tasks
Published 2020-02-19
URL https://arxiv.org/abs/2002.08089v1
PDF https://arxiv.org/pdf/2002.08089v1.pdf
PWC https://paperswithcode.com/paper/optimal-dg-allocation-and-sizing-in-power
Repo
Framework
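
The PSO half of the comparison reduces to a short loop once LSF has pre-selected candidate buses: particles encode DG sizes and the swarm minimizes a power-loss objective. The toy quadratic surrogate below stands in for a real load-flow evaluation; the bus choice, bounds, and coefficients are illustrative assumptions.

```python
# Minimal PSO sketch for sizing DG units at two LSF-selected buses.
import numpy as np

def power_loss(sizes):                       # surrogate for a load-flow solver
    return np.sum((sizes - np.array([40.0, 60.0])) ** 2) + 5.0

rng = np.random.default_rng(0)
n_particles, dim, iters = 20, 2, 100
x = rng.uniform(0, 100, (n_particles, dim))  # DG sizes in MW at the two buses
v = np.zeros_like(x)
pbest, pbest_val = x.copy(), np.array([power_loss(p) for p in x])
gbest = pbest[pbest_val.argmin()]

for _ in range(iters):
    r1, r2 = rng.random((2, n_particles, dim))
    v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (gbest - x)
    x = np.clip(x + v, 0, 100)
    vals = np.array([power_loss(p) for p in x])
    better = vals < pbest_val
    pbest[better], pbest_val[better] = x[better], vals[better]
    gbest = pbest[pbest_val.argmin()]

print(gbest, power_loss(gbest))              # approaches the surrogate optimum
```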

Uncovering the Data-Related Limits of Human Reasoning Research: An Analysis based on Recommender Systems

Title Uncovering the Data-Related Limits of Human Reasoning Research: An Analysis based on Recommender Systems
Authors Nicolas Riesterer, Daniel Brand, Marco Ragni
Abstract Understanding the fundamentals of human reasoning is central to the development of any system built to closely interact with humans. Cognitive science pursues the goal of modeling human-like intelligence from a theory-driven perspective with a strong focus on explainability. Syllogistic reasoning, one of the core domains of human reasoning research, has seen a surge of computational models developed in recent years. However, recent analyses of models’ predictive performances revealed a stagnation in improvement. We believe that most of the problems encountered in cognitive science are not due to the specific models that have been developed but can instead be traced back to the peculiarities of behavioral data. Therefore, we investigate potential data-related reasons for the problems in human reasoning research by comparing model performances on human and artificially generated datasets. In particular, we apply collaborative filtering recommenders to investigate the adversarial effects of inconsistencies and noise in data, and illustrate the potential for data-driven methods in a field of research predominantly concerned with gaining high-level theoretical insight into a domain. Our work (i) provides insight into the levels of noise to be expected from human responses in reasoning data, (ii) uncovers evidence for an upper bound of performance that is close to being reached, urging an extension of the modeling task, and (iii) introduces the tools and presents initial results to pioneer a new paradigm for investigating and modeling reasoning, focusing on predicting the responses of individual human reasoners.
Tasks Recommendation Systems
Published 2020-03-11
URL https://arxiv.org/abs/2003.05196v1
PDF https://arxiv.org/pdf/2003.05196v1.pdf
PWC https://paperswithcode.com/paper/uncovering-the-data-related-limits-of-human
Repo
Framework
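
Framing reasoners as "users" and syllogism tasks as "items" makes the collaborative filtering setup concrete: a held-out response is predicted from the most similar reasoners. The similarity measure and the synthetic response matrix below are illustrative assumptions, not the paper's exact setup.

```python
# User-based collaborative filtering over a reasoner x task response matrix.
import numpy as np

rng = np.random.default_rng(0)
R = rng.integers(0, 9, size=(100, 64)).astype(float)  # 9 response options per task

def predict(R, user, item, k=10):
    others = np.delete(np.arange(len(R)), user)
    # Similarity: agreement rate with the target reasoner on all other tasks.
    mask = np.ones(R.shape[1], bool); mask[item] = False
    sims = (R[others][:, mask] == R[user, mask]).mean(axis=1)
    top = others[np.argsort(sims)[-k:]]                # k most similar reasoners
    return np.round(R[top, item].mean())               # majority-like estimate

print(predict(R, user=0, item=5))
```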