Paper Group ANR 1335
REO-Relevance, Extraness, Omission: A Fine-grained Evaluation for Image Captioning
Title | REO-Relevance, Extraness, Omission: A Fine-grained Evaluation for Image Captioning |
Authors | Ming Jiang, Junjie Hu, Qiuyuan Huang, Lei Zhang, Jana Diesner, Jianfeng Gao |
Abstract | Popular metrics used for evaluating image captioning systems, such as BLEU and CIDEr, provide a single score to gauge the system’s overall effectiveness. This score is often not informative enough to indicate what specific errors are made by a given system. In this study, we present a fine-grained evaluation method REO for automatically measuring the performance of image captioning systems. REO assesses the quality of captions from three perspectives: 1) Relevance to the ground truth, 2) Extraness of the content that is irrelevant to the ground truth, and 3) Omission of the elements in the images and human references. Experiments on three benchmark datasets demonstrate that our method achieves a higher consistency with human judgments and provides more intuitive evaluation results than alternative metrics. |
Tasks | Image Captioning |
Published | 2019-09-05 |
URL | https://arxiv.org/abs/1909.02217v1 |
https://arxiv.org/pdf/1909.02217v1.pdf | |
PWC | https://paperswithcode.com/paper/reo-relevance-extraness-omission-a-fine |
Repo | |
Framework | |
Full-stack Optimization for Accelerating CNNs with FPGA Validation
Title | Full-stack Optimization for Accelerating CNNs with FPGA Validation |
Authors | Bradley McDanel, Sai Qian Zhang, H. T. Kung, Xin Dong |
Abstract | We present a full-stack optimization framework for accelerating inference of CNNs (Convolutional Neural Networks) and validate the approach with field-programmable gate array (FPGA) implementations. By jointly optimizing CNN models, computing architectures, and hardware implementations, our full-stack approach achieves unprecedented performance in the trade-off space characterized by inference latency, energy efficiency, hardware utilization and inference accuracy. As a validation vehicle, we have implemented a 170MHz FPGA inference chip achieving 2.28ms latency for the ImageNet benchmark. The achieved latency is among the lowest reported in the literature while achieving comparable accuracy. Moreover, our chip stands out in that it has 9x higher energy efficiency compared to other implementations achieving comparable latency. A highlight of our full-stack approach that contributes to the achieved high energy efficiency is an efficient Selector-Accumulator (SAC) architecture for implementing the multiplier-accumulator (MAC) operation present in any digital CNN hardware. For instance, compared to an FPGA implementation of a traditional 8-bit MAC, SAC substantially reduces required hardware resources (4.85x fewer Look-up Tables) and power consumption (2.48x). |
Tasks | |
Published | 2019-05-01 |
URL | http://arxiv.org/abs/1905.00462v1 |
http://arxiv.org/pdf/1905.00462v1.pdf | |
PWC | https://paperswithcode.com/paper/full-stack-optimization-for-accelerating-cnns |
Repo | |
Framework | |
Beyond Tracking: Selecting Memory and Refining Poses for Deep Visual Odometry
Title | Beyond Tracking: Selecting Memory and Refining Poses for Deep Visual Odometry |
Authors | Fei Xue, Xin Wang, Shunkai Li, Qiuyuan Wang, Junqiu Wang, Hongbin Zha |
Abstract | Most previous learning-based visual odometry (VO) methods take VO as a pure tracking problem. In contrast, we present a VO framework by incorporating two additional components called Memory and Refining. The Memory component preserves global information by employing an adaptive and efficient selection strategy. The Refining component ameliorates previous results with the contexts stored in the Memory by adopting a spatial-temporal attention mechanism for feature distilling. Experiments on the KITTI and TUM-RGBD benchmark datasets demonstrate that our method outperforms state-of-the-art learning-based methods by a large margin and produces competitive results against classic monocular VO approaches. Especially, our model achieves outstanding performance in challenging scenarios such as texture-less regions and abrupt motions, where classic VO algorithms tend to fail. |
Tasks | Visual Odometry |
Published | 2019-04-03 |
URL | http://arxiv.org/abs/1904.01892v2 |
http://arxiv.org/pdf/1904.01892v2.pdf | |
PWC | https://paperswithcode.com/paper/beyond-tracking-selecting-memory-and-refining |
Repo | |
Framework | |
Deep Network for Capacitive ECG Denoising
Title | Deep Network for Capacitive ECG Denoising |
Authors | Vignesh Ravichandran, Balamurali Murugesan, Sharath M Shankaranarayana, Keerthi Ram, Preejith S. P, Jayaraj Joseph, Mohanasankar Sivaprakasam |
Abstract | Continuous monitoring of cardiac health under free-living conditions is crucial to provide effective care for patients undergoing post-operative recovery and individuals with high cardiac risk, such as the elderly. Capacitive Electrocardiogram (cECG) is one such technology, which allows comfortable and long-term monitoring through its ability to measure biopotential without skin contact. cECG monitoring can be done using many household objects like chairs, beds and even car seats, allowing for seamless monitoring of individuals. Unfortunately, this method is highly susceptible to motion artifacts, which greatly limits its usage in clinical practice. The current use of cECG systems has been limited to performing rhythmic analysis. In this paper we propose a novel end-to-end deep learning architecture to perform the task of denoising capacitive ECG. The proposed network is trained using motion-corrupted three-channel cECG and a reference LEAD I ECG collected from individuals while driving a car. Further, we also propose a novel joint loss function to apply loss in both the signal and frequency domains. We conduct extensive rhythmic analysis on the model predictions and the ground truth. We further evaluate the signal denoising using Mean Square Error (MSE) and Cross Correlation between model predictions and ground truth. We report an MSE of 0.167 and a Cross Correlation of 0.476. The reported results highlight the feasibility of performing morphological analysis using the filtered cECG. The proposed approach can allow for continuous and comprehensive monitoring of individuals in free-living conditions. |
Tasks | Denoising, ECG Denoising, Electrocardiography (ECG), Morphological Analysis |
Published | 2019-03-29 |
URL | http://arxiv.org/abs/1903.12536v1 |
http://arxiv.org/pdf/1903.12536v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-network-for-capacitive-ecg-denoising |
Repo | |
Framework | |
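The two evaluation metrics named in the abstract above (MSE and cross correlation between model predictions and ground truth) can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the abstract does not say how the cross correlation is normalized or at which lag it is taken, so the zero-lag, mean-centered, normalized form used here is an assumption, and the function names are hypothetical.

```python
import math


def mean_squared_error(pred, truth):
    """Average squared difference between two equal-length signals."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("signals must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred)


def zero_lag_cross_correlation(pred, truth):
    """Mean-centered, normalized correlation at lag 0 (assumed definition)."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("signals must be non-empty and of equal length")
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(truth) / n
    num = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, truth))
    den = math.sqrt(
        sum((p - mean_p) ** 2 for p in pred)
        * sum((t - mean_t) ** 2 for t in truth)
    )
    # A constant signal has no variance; report 0.0 instead of dividing by zero.
    return num / den if den else 0.0
```

Under this definition a value of 1.0 means the denoised signal follows the reference shape exactly up to scale and offset, while MSE also penalizes amplitude differences, which is why reporting both is informative.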
Multi-agent Reinforcement Learning Embedded Game for the Optimization of Building Energy Control and Power System Planning
Title | Multi-agent Reinforcement Learning Embedded Game for the Optimization of Building Energy Control and Power System Planning |
Authors | Jun Hao |
Abstract | Most current game-theoretic demand-side management methods focus primarily on the scheduling of home appliances, and the related numerical experiments are analyzed under various scenarios to achieve the corresponding Nash equilibrium (NE) and optimal results. However, little work has been conducted for academic or commercial buildings. The methods for optimizing academic buildings are distinct from the optimal methods for home appliances. In this study, we present a novel methodology to control the operation of heating, ventilation, and air conditioning (HVAC) systems. With the development of Artificial Intelligence and computer technologies, reinforcement learning (RL) can be implemented in multiple realistic scenarios and help people solve thousands of real-world problems. Reinforcement learning, which is considered the art of future AI, builds the bridge between agents and environments through a Markov Decision Chain or Neural Network and has seldom been used in power systems. The art of RL is that once the simulator for a specific environment is built, the algorithm can keep learning from the environment. Therefore, RL is capable of dealing with constantly changing simulator inputs such as power demand, the condition of the power system, and outdoor temperature. Compared with existing distribution power system planning mechanisms and the related game-theoretic methodologies, our proposed algorithm can plan and optimize the hourly energy usage, and can accommodate even shorter time windows if needed. |
Tasks | Multi-agent Reinforcement Learning |
Published | 2019-01-17 |
URL | http://arxiv.org/abs/1901.07333v1 |
http://arxiv.org/pdf/1901.07333v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-agent-reinforcement-learning-embedded |
Repo | |
Framework | |
CoCoNet: A Collaborative Convolutional Network
Title | CoCoNet: A Collaborative Convolutional Network |
Authors | Tapabrata Chakraborti, Brendan McCane, Steven Mills, Umapada Pal |
Abstract | We present an end-to-end CNN architecture for fine-grained visual recognition called Collaborative Convolutional Network (CoCoNet). The network uses a collaborative filter after the convolutional layers to represent an image as an optimal weighted collaboration of features learned from training samples as a whole rather than one at a time. This gives CoCoNet more power to encode the fine-grained nature of the data with limited samples in an end-to-end fashion. We perform a detailed study of the performance with 1-stage and 2-stage transfer learning and different configurations with benchmark architectures like AlexNet and VggNet. The ablation study shows that the proposed method outperforms its constituent parts considerably and consistently. CoCoNet also outperforms the baseline popular deep learning based fine-grained recognition method, namely Bilinear-CNN (BCNN) with statistical significance. Experiments have been performed on the fine-grained species recognition problem, but the method is general enough to be applied to other similar tasks. Lastly, we also introduce a new public dataset for fine-grained species recognition, that of Indian endemic birds and have reported initial results on it. The training metadata and new dataset are available through the corresponding author. |
Tasks | Fine-Grained Visual Recognition, Transfer Learning |
Published | 2019-01-28 |
URL | https://arxiv.org/abs/1901.09886v3 |
https://arxiv.org/pdf/1901.09886v3.pdf | |
PWC | https://paperswithcode.com/paper/coconet-a-collaborative-convolutional-network |
Repo | |
Framework | |
TCD-NPE: A Re-configurable and Efficient Neural Processing Engine, Powered by Novel Temporal-Carry-deferring MACs
Title | TCD-NPE: A Re-configurable and Efficient Neural Processing Engine, Powered by Novel Temporal-Carry-deferring MACs |
Authors | Ali Mirzaeian, Houman Homayoun, Avesta Sasan |
Abstract | In this paper, we first propose the design of the Temporal-Carry-deferring MAC (TCD-MAC) and illustrate how our proposed solution can gain significant energy and performance benefits when utilized to process a stream of input data. We then propose using the TCD-MAC to build a reconfigurable, high-speed, and low-power Neural Processing Engine (TCD-NPE). We further propose a novel scheduler that lists the sequence of processing events needed to process an MLP model in the least number of computational rounds in our proposed TCD-NPE. We illustrate that our proposed TCD-NPE significantly outperforms similar neural processing solutions that use conventional MACs in terms of both energy consumption and execution time. |
Tasks | |
Published | 2019-10-14 |
URL | https://arxiv.org/abs/1910.06458v1 |
https://arxiv.org/pdf/1910.06458v1.pdf | |
PWC | https://paperswithcode.com/paper/tcd-npe-a-re-configurable-and-efficient |
Repo | |
Framework | |
Are Disentangled Representations Helpful for Abstract Visual Reasoning?
Title | Are Disentangled Representations Helpful for Abstract Visual Reasoning? |
Authors | Sjoerd van Steenkiste, Francesco Locatello, Jürgen Schmidhuber, Olivier Bachem |
Abstract | A disentangled representation encodes information about the salient factors of variation in the data independently. Although it is often argued that this representational format is useful in learning to solve many real-world down-stream tasks, there is little empirical evidence that supports this claim. In this paper, we conduct a large-scale study that investigates whether disentangled representations are more suitable for abstract reasoning tasks. Using two new tasks similar to Raven’s Progressive Matrices, we evaluate the usefulness of the representations learned by 360 state-of-the-art unsupervised disentanglement models. Based on these representations, we train 3600 abstract reasoning models and observe that disentangled representations do in fact lead to better down-stream performance. In particular, they enable quicker learning using fewer samples. |
Tasks | Visual Reasoning |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1905.12506v3 |
https://arxiv.org/pdf/1905.12506v3.pdf | |
PWC | https://paperswithcode.com/paper/are-disentangled-representations-helpful-for |
Repo | |
Framework | |
Exploiting Cross-Lingual Speaker and Phonetic Diversity for Unsupervised Subword Modeling
Title | Exploiting Cross-Lingual Speaker and Phonetic Diversity for Unsupervised Subword Modeling |
Authors | Siyuan Feng, Tan Lee |
Abstract | This research addresses the problem of acoustic modeling of low-resource languages for which transcribed training data is absent. The goal is to learn robust frame-level feature representations that can be used to identify and distinguish subword-level speech units. The proposed feature representations comprise various types of multilingual bottleneck features (BNFs) that are obtained via multi-task learning of deep neural networks (MTL-DNN). One of the key problems is how to acquire high-quality frame labels for untranscribed training data to facilitate supervised DNN training. It is shown that learning of robust BNF representations can be achieved by effectively leveraging transcribed speech data and well-trained automatic speech recognition (ASR) systems from one or more out-of-domain (resource-rich) languages. Out-of-domain ASR systems can be applied to perform speaker adaptation with untranscribed training data of the target language, and to decode the training speech into frame-level labels for DNN training. It is also found that better frame labels can be generated by considering temporal dependency in speech when performing frame clustering. The proposed methods of feature learning are evaluated on the standard task of unsupervised subword modeling in Track 1 of the ZeroSpeech 2017 Challenge. The best performance achieved by our system is 9.7% in terms of across-speaker triphone minimal-pair ABX error rate, which is comparable to the best systems reported recently. Lastly, our investigation reveals that the closeness between target languages and out-of-domain languages and the amount of available training data for individual target languages could have significant impact on the goodness of learned features. |
Tasks | Multi-Task Learning, Speech Recognition |
Published | 2019-08-09 |
URL | https://arxiv.org/abs/1908.03538v2 |
https://arxiv.org/pdf/1908.03538v2.pdf | |
PWC | https://paperswithcode.com/paper/exploiting-cross-lingual-speaker-and-phonetic |
Repo | |
Framework | |
Accelerating DNN Training in Wireless Federated Edge Learning System
Title | Accelerating DNN Training in Wireless Federated Edge Learning System |
Authors | Jinke Ren, Guanding Yu, Guangyao Ding |
Abstract | The training task for classical machine learning models, such as deep neural networks (DNN), is generally implemented at a remote, computationally adequate cloud center for centralized learning, which is typically time-consuming and resource-hungry. It also incurs serious privacy issues and long communication latency since massive data are transmitted to the centralized node. To overcome these shortcomings, we consider a newly emerged framework, namely federated edge learning (FEEL), which aggregates the local learning updates at the edge server instead of users’ raw data. Aiming at accelerating the training process while guaranteeing the learning accuracy, we first define a novel performance evaluation criterion, called learning efficiency, and formulate a training acceleration optimization problem in the CPU scenario, where each user device is equipped with a CPU. Closed-form expressions for joint batch size selection and communication resource allocation are developed, and some insightful results are also highlighted. Further, we extend our learning framework to the GPU scenario and propose a novel training function to characterize the learning property of general GPU modules. The optimal solution in this case is shown to have a structure similar to that of the CPU scenario, suggesting that our proposed algorithm is applicable in more general systems. Finally, extensive experiments validate our theoretical analysis and demonstrate that our proposal can reduce the training time and improve the learning accuracy simultaneously. |
Tasks | |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.09712v2 |
https://arxiv.org/pdf/1905.09712v2.pdf | |
PWC | https://paperswithcode.com/paper/accelerating-dnn-training-in-wireless |
Repo | |
Framework | |
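The aggregation step described in the abstract above, where the edge server combines local learning updates rather than raw user data, can be sketched roughly as below. This is only an illustration under stated assumptions: the abstract does not give the aggregation rule, so the weighted average used here (weighting each device by its batch size) and the name `aggregate_updates` are hypothetical choices, not the paper's formulation.

```python
def aggregate_updates(local_updates, batch_sizes):
    """Weighted average of per-device update vectors (assumed aggregation rule).

    local_updates: list of equal-length lists of floats, one per device.
    batch_sizes:   list of positive weights, one per device.
    """
    if not local_updates or len(local_updates) != len(batch_sizes):
        raise ValueError("need one batch size per local update")
    dim = len(local_updates[0])
    if any(len(u) != dim for u in local_updates):
        raise ValueError("all updates must have the same dimension")
    total = sum(batch_sizes)
    if total <= 0:
        raise ValueError("batch sizes must sum to a positive value")
    # Only the update vectors reach the server; raw samples stay on the devices.
    return [
        sum(w * u[i] for u, w in zip(local_updates, batch_sizes)) / total
        for i in range(dim)
    ]
```

For example, `aggregate_updates([[1.0, 2.0], [3.0, 4.0]], [1, 3])` weights the second device three times as heavily as the first.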
Studying Topology of Time Lines Graph leads to an alternative approach to the Newcomb’s Paradox
Title | Studying Topology of Time Lines Graph leads to an alternative approach to the Newcomb’s Paradox |
Authors | Giuseppe Giacopelli |
Abstract | Newcomb’s paradox is one of the best-known paradoxes in game theory involving oracles. We define the graph associated with the time lines of the game. Then, by studying its topology and using only the Expected Utility Principle, we formulate a solution of the paradox that explains all the classical cases. |
Tasks | |
Published | 2019-10-15 |
URL | https://arxiv.org/abs/1910.09311v1 |
https://arxiv.org/pdf/1910.09311v1.pdf | |
PWC | https://paperswithcode.com/paper/studying-topology-of-time-lines-graph-leads |
Repo | |
Framework | |
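As background for the Expected Utility Principle mentioned in the abstract above, the textbook expected-utility comparison for Newcomb's problem can be sketched as follows. This is not the paper's time-line graph construction, which the abstract does not describe. The payoffs (1,000,000 in the opaque box, 1,000 in the transparent box) and the single predictor-accuracy parameter `p` are the conventional textbook setup and are assumptions here.

```python
BIG_PRIZE = 1_000_000.0   # assumed content of the opaque box if one-boxing was predicted
SMALL_PRIZE = 1_000.0     # assumed content of the transparent box


def expected_utilities(p):
    """Return (EU of one-boxing, EU of two-boxing) for predictor accuracy p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    # One-boxing pays the big prize only when the oracle predicted it correctly.
    one_box = p * BIG_PRIZE
    # Two-boxing always pays the small prize, plus the big prize if the oracle erred.
    two_box = (1.0 - p) * BIG_PRIZE + SMALL_PRIZE
    return one_box, two_box
```

With these payoffs, one-boxing has the higher expected utility whenever `p` exceeds 0.5005, which is the tension with the dominance argument that makes the problem a paradox.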
Online Allocation and Pricing: Constant Regret via Bellman Inequalities
Title | Online Allocation and Pricing: Constant Regret via Bellman Inequalities |
Authors | Alberto Vera, Siddhartha Banerjee, Itai Gurvich |
Abstract | We develop a framework for designing tractable heuristics for Markov Decision Processes (MDP), and use it to obtain constant regret policies for a variety of online allocation problems, including online packing, budget-constrained probing, dynamic pricing, and online contextual bandits with knapsacks. Our approach is based on adaptively constructing a benchmark for the value function, which we then use to select our actions. The centerpiece of our framework are the Bellman Inequalities, which allow us to create benchmarks which both have access to future information, and also, can violate the one-step optimality equations (i.e., Bellman equations). The flexibility of balancing these allows us to get policies which are both tractable and have strong performance guarantees – in particular, our constant-regret policies only require solving an LP for selecting each action. |
Tasks | Multi-Armed Bandits |
Published | 2019-06-14 |
URL | https://arxiv.org/abs/1906.06361v1 |
https://arxiv.org/pdf/1906.06361v1.pdf | |
PWC | https://paperswithcode.com/paper/online-allocation-and-pricing-constant-regret |
Repo | |
Framework | |
Bootstrapping Upper Confidence Bound
Title | Bootstrapping Upper Confidence Bound |
Authors | Botao Hao, Yasin Abbasi-Yadkori, Zheng Wen, Guang Cheng |
Abstract | The Upper Confidence Bound (UCB) method is arguably the most celebrated method used in online decision making with partial information feedback. Existing techniques for constructing confidence bounds are typically built upon various concentration inequalities, which lead to over-exploration. In this paper, we propose a non-parametric and data-dependent UCB algorithm based on the multiplier bootstrap. To improve its finite-sample performance, we further incorporate a second-order correction into the above construction. In theory, we derive both problem-dependent and problem-independent regret bounds for multi-armed bandits under a much weaker tail assumption than the standard sub-Gaussianity. Numerical results demonstrate significant regret reductions by our method, in comparison with several baselines in a range of multi-armed and linear bandit problems. |
Tasks | Decision Making, Multi-Armed Bandits |
Published | 2019-06-12 |
URL | https://arxiv.org/abs/1906.05247v3 |
https://arxiv.org/pdf/1906.05247v3.pdf | |
PWC | https://paperswithcode.com/paper/bootstrapping-upper-confidence-bound |
Repo | |
Framework | |
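The idea of a data-dependent confidence bound built from the multiplier bootstrap, as described in the abstract above, can be illustrated with a simplified sketch. It is not the authors' algorithm: it omits the second-order correction the abstract mentions, and the choice of Rademacher (±1) multipliers, the parameters `alpha` and `n_boot`, and the function name are assumptions made for illustration.

```python
import random


def bootstrap_ucb_index(rewards, alpha=0.05, n_boot=200, rng=None):
    """Empirical mean plus a multiplier-bootstrap quantile of the centered mean.

    Simplified illustration: no second-order correction is applied.
    """
    if not rewards:
        raise ValueError("need at least one observed reward")
    # A fixed default seed keeps the sketch reproducible; pass your own rng in practice.
    rng = rng or random.Random(0)
    n = len(rewards)
    mean = sum(rewards) / n
    residuals = [r - mean for r in rewards]
    stats = []
    for _ in range(n_boot):
        # Reweight each centered reward by an independent random sign.
        stats.append(sum(rng.choice((-1.0, 1.0)) * e for e in residuals) / n)
    stats.sort()
    # Take the empirical (1 - alpha) quantile as the exploration bonus.
    k = min(n_boot - 1, int((1 - alpha) * n_boot))
    return mean + stats[k]
```

A bandit loop would compute this index for every arm from that arm's observed rewards and pull the arm with the largest index. Because the bonus is estimated from the observed residuals, arms with low reward variance receive a smaller bonus than a worst-case concentration bound would give them.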
Detecting anthropogenic cloud perturbations with deep learning
Title | Detecting anthropogenic cloud perturbations with deep learning |
Authors | Duncan Watson-Parris, Samuel Sutherland, Matthew Christensen, Anthony Caterini, Dino Sejdinovic, Philip Stier |
Abstract | One of the most pressing questions in climate science is that of the effect of anthropogenic aerosol on the Earth’s energy balance. Aerosols provide the ‘seeds’ on which cloud droplets form, and changes in the amount of aerosol available to a cloud can change its brightness and other physical properties such as optical thickness and spatial extent. Clouds play a critical role in moderating global temperatures and small perturbations can lead to significant amounts of cooling or warming. Uncertainty in this effect is so large it is not currently known if it is negligible, or provides a large enough cooling to largely negate present-day warming by CO2. This work uses deep convolutional neural networks to look for two particular perturbations in clouds due to anthropogenic aerosol and assess their properties and prevalence, providing valuable insights into their climatic effects. |
Tasks | |
Published | 2019-11-29 |
URL | https://arxiv.org/abs/1911.13061v1 |
https://arxiv.org/pdf/1911.13061v1.pdf | |
PWC | https://paperswithcode.com/paper/detecting-anthropogenic-cloud-perturbations |
Repo | |
Framework | |
Understanding Bias in Machine Learning
Title | Understanding Bias in Machine Learning |
Authors | Jindong Gu, Daniela Oelke |
Abstract | Bias is known to be an impediment to fair decisions in many domains such as human resources, the public sector, and health care. Recently, hope has been expressed that the use of machine learning methods for making such decisions would diminish or even resolve the problem. At the same time, machine learning experts warn that machine learning models can be biased as well. In this article, our goal is to explain the issue of bias in machine learning from a technical perspective and to illustrate the impact that biased data can have on a machine learning model. To reach this goal, we develop interactive plots to visualize the bias learned from synthetic data. |
Tasks | |
Published | 2019-09-02 |
URL | https://arxiv.org/abs/1909.01866v1 |
https://arxiv.org/pdf/1909.01866v1.pdf | |
PWC | https://paperswithcode.com/paper/understanding-bias-in-machine-learning |
Repo | |
Framework | |