Paper Group ANR 528
Learning Discriminative Multilevel Structured Dictionaries for Supervised Image Classification
Title | Learning Discriminative Multilevel Structured Dictionaries for Supervised Image Classification |
Authors | Jeremy Aghaei Mazaheri, Elif Vural, Claude Labit, Christine Guillemot |
Abstract | Sparse representations using overcomplete dictionaries have proved to be a powerful tool in many signal processing applications such as denoising, super-resolution, inpainting, compression or classification. The sparsity of the representation very much depends on how well the dictionary is adapted to the data at hand. In this paper, we propose a method for learning structured multilevel dictionaries with discriminative constraints to make them well suited for the supervised pixelwise classification of images. A multilevel tree-structured discriminative dictionary is learnt for each class, with a learning objective concerning the reconstruction errors of the image patches around the pixels over each class-representative dictionary. After the initial assignment of the class labels to image pixels based on their sparse representations over the learnt dictionaries, the final classification is achieved by smoothing the label image with a graph cut method and an erosion method. Applied to a common set of texture images, our supervised classification method shows competitive results with the state of the art. |
Tasks | Denoising, Image Classification, Super-Resolution |
Published | 2018-02-28 |
URL | http://arxiv.org/abs/1802.10497v1 |
http://arxiv.org/pdf/1802.10497v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-discriminative-multilevel-structured |
Repo | |
Framework | |
PIMMS: Permutation Invariant Multi-Modal Segmentation
Title | PIMMS: Permutation Invariant Multi-Modal Segmentation |
Authors | Thomas Varsavsky, Zach Eaton-Rosen, Carole H. Sudre, Parashkev Nachev, M. Jorge Cardoso |
Abstract | In a research context, image acquisition will often involve a pre-defined static protocol and the data will be of high quality. If we are to build applications that work in hospitals without significant operational changes in care delivery, algorithms should be designed to cope with the available data in the best possible way. In a clinical environment, imaging protocols are highly flexible, with MRI sequences commonly missing appropriate sequence labeling (e.g. T1, T2, FLAIR). To this end we introduce PIMMS, a Permutation Invariant Multi-Modal Segmentation technique that is able to perform inference over sets of MRI scans without using modality labels. We present results which show that our convolutional neural network can, in some settings, outperform a baseline model which utilizes modality labels, and achieve comparable performance otherwise. |
Tasks | |
Published | 2018-07-17 |
URL | http://arxiv.org/abs/1807.06537v1 |
http://arxiv.org/pdf/1807.06537v1.pdf | |
PWC | https://paperswithcode.com/paper/pimms-permutation-invariant-multi-modal |
Repo | |
Framework | |
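The abstract does not describe the network itself. As a rough illustration of what permutation invariance over a set of scans means, the sketch below applies one shared encoder to every scan and fuses the results with an elementwise max, which does not depend on input order. The `encode` and `fuse` functions and the toy "scans" (short lists of intensities) are hypothetical stand-ins, not the PIMMS architecture.

```python
from itertools import permutations


def encode(scan):
    """Toy shared encoder applied identically to every scan.

    Stand-in for a learned CNN: maps each voxel value to two feature channels.
    """
    return [(2.0 * v + 1.0, v * v) for v in scan]


def fuse(scans):
    """Fuse a set of scans with an elementwise max over encoded features.

    Max is symmetric in its arguments, so the result is the same for any
    ordering of the scans and for any number of scans.
    """
    encoded = [encode(s) for s in scans]
    return [
        tuple(max(channel) for channel in zip(*voxel))
        for voxel in zip(*encoded)
    ]


# Three unlabeled "modalities" of the same 3-voxel image (made-up values).
scans = [[0.1, 0.5, 0.9], [0.7, 0.2, 0.4], [0.3, 0.8, 0.6]]
reference = fuse(scans)
same_for_all_orders = all(
    fuse(list(p)) == reference for p in permutations(scans)
)
```

Because the fusion step never looks at which scan a feature came from, no modality label is needed, and a missing scan simply shrinks the set.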
Expanding a robot’s life: Low power object recognition via FPGA-based DCNN deployment
Title | Expanding a robot’s life: Low power object recognition via FPGA-based DCNN deployment |
Authors | Panagiotis G. Mousouliotis, Konstantinos L. Panayiotou, Emmanouil G. Tsardoulias, Loukas P. Petrou, Andreas L. Symeonidis |
Abstract | FPGAs are commonly used to accelerate domain-specific algorithmic implementations, as they can achieve impressive performance boosts, are reprogrammable and exhibit minimal power consumption. In this work, the SqueezeNet DCNN is accelerated using an SoC FPGA in order for the offered object recognition resource to be employed in a robotic application. Experiments are conducted to investigate the performance and power consumption of the implementation in comparison to deployment on other widely-used computational systems. |
Tasks | Object Recognition |
Published | 2018-03-23 |
URL | http://arxiv.org/abs/1804.00512v1 |
http://arxiv.org/pdf/1804.00512v1.pdf | |
PWC | https://paperswithcode.com/paper/expanding-a-robots-life-low-power-object |
Repo | |
Framework | |
Hierarchical Long Short-Term Concurrent Memory for Human Interaction Recognition
Title | Hierarchical Long Short-Term Concurrent Memory for Human Interaction Recognition |
Authors | Xiangbo Shu, Jinhui Tang, Guo-Jun Qi, Wei Liu, Jian Yang |
Abstract | In this paper, we aim to address the problem of human interaction recognition in videos by exploring the long-term inter-related dynamics among multiple persons. Recently, Long Short-Term Memory (LSTM) has become a popular choice to model individual dynamic for single-person action recognition due to its ability of capturing the temporal motion information in a range. However, existing RNN models focus only on capturing the dynamics of human interaction by simply combining all dynamics of individuals or modeling them as a whole. Such models neglect the inter-related dynamics of how human interactions change over time. To this end, we propose a novel Hierarchical Long Short-Term Concurrent Memory (H-LSTCM) to model the long-term inter-related dynamics among a group of persons for recognizing the human interactions. Specifically, we first feed each person’s static features into a Single-Person LSTM to learn the single-person dynamic. Subsequently, the outputs of all Single-Person LSTM units are fed into a novel Concurrent LSTM (Co-LSTM) unit, which mainly consists of multiple sub-memory units, a new cell gate and a new co-memory cell. In a Co-LSTM unit, each sub-memory unit stores individual motion information, while this Co-LSTM unit selectively integrates and stores inter-related motion information between multiple interacting persons from multiple sub-memory units via the cell gate and co-memory cell, respectively. Extensive experiments on four public datasets validate the effectiveness of the proposed H-LSTCM by comparing against baseline and state-of-the-art methods. |
Tasks | Human Interaction Recognition, Temporal Action Localization |
Published | 2018-11-01 |
URL | http://arxiv.org/abs/1811.00270v1 |
http://arxiv.org/pdf/1811.00270v1.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-long-short-term-concurrent |
Repo | |
Framework | |
Time-Discounting Convolution for Event Sequences with Ambiguous Timestamps
Title | Time-Discounting Convolution for Event Sequences with Ambiguous Timestamps |
Authors | Takayuki Katsuki, Takayuki Osogami, Akira Koseki, Masaki Ono, Michiharu Kudo, Masaki Makino, Atsushi Suzuki |
Abstract | This paper proposes a method for modeling event sequences with ambiguous timestamps, a time-discounting convolution. Unlike in ordinary time series, time intervals are not constant, small time-shifts have no significant effect, and inputting timestamps or time durations into a model is not effective. The criteria that we require for the modeling are providing robustness against time-shifts or timestamps uncertainty as well as maintaining the essential capabilities of time-series models, i.e., forgetting meaningless past information and handling infinite sequences. The proposed method handles them with a convolutional mechanism across time with specific parameterizations, which efficiently represents the event dependencies in a time-shift invariant manner while discounting the effect of past events, and a dynamic pooling mechanism, which provides robustness against the uncertainty in timestamps and enhances the time-discounting capability by dynamically changing the pooling window size. In our learning algorithm, the decaying and dynamic pooling mechanisms play critical roles in handling infinite and variable length sequences. Numerical experiments on real-world event sequences with ambiguous timestamps and ordinary time series demonstrated the advantages of our method. |
Tasks | Time Series |
Published | 2018-12-06 |
URL | http://arxiv.org/abs/1812.02395v1 |
http://arxiv.org/pdf/1812.02395v1.pdf | |
PWC | https://paperswithcode.com/paper/time-discounting-convolution-for-event |
Repo | |
Framework | |
Generative Adversarial Talking Head: Bringing Portraits to Life with a Weakly Supervised Neural Network
Title | Generative Adversarial Talking Head: Bringing Portraits to Life with a Weakly Supervised Neural Network |
Authors | Hai X. Pham, Yuting Wang, Vladimir Pavlovic |
Abstract | This paper presents Generative Adversarial Talking Head (GATH), a novel deep generative neural network that enables fully automatic facial expression synthesis of an arbitrary portrait with continuous action unit (AU) coefficients. Specifically, our model directly manipulates image pixels to make the unseen subject in the still photo express various emotions controlled by values of facial AU coefficients, while maintaining her personal characteristics, such as facial geometry, skin color and hair style, as well as the original surrounding background. In contrast to prior work, GATH is purely data-driven and it requires neither a statistical face model nor image processing tricks to enact facial deformations. Additionally, our model is trained from unpaired data, where the input image, with its auxiliary identity label taken from abundance of still photos in the wild, and the target frame are from different persons. In order to effectively learn such model, we propose a novel weakly supervised adversarial learning framework that consists of a generator, a discriminator, a classifier and an action unit estimator. Our work gives rise to template-and-target-free expression editing, where still faces can be effortlessly animated with arbitrary AU coefficients provided by the user. |
Tasks | |
Published | 2018-03-21 |
URL | http://arxiv.org/abs/1803.07716v2 |
http://arxiv.org/pdf/1803.07716v2.pdf | |
PWC | https://paperswithcode.com/paper/generative-adversarial-talking-head-bringing |
Repo | |
Framework | |
Cross-Subject Transfer Learning Improves the Practicality of Real-World Applications of Brain-Computer Interfaces
Title | Cross-Subject Transfer Learning Improves the Practicality of Real-World Applications of Brain-Computer Interfaces |
Authors | Kuan-Jung Chiang, Chun-Shu Wei, Masaki Nakanishi, Tzyy-Ping Jung |
Abstract | Steady-state visual evoked potential (SSVEP)-based brain-computer interfaces (BCIs) have shown their robustness in facilitating high-efficiency communication. State-of-the-art training-based SSVEP decoding methods such as extended Canonical Correlation Analysis (CCA) and Task-Related Component Analysis (TRCA) are the major players that elevate the efficiency of the SSVEP-based BCIs through a calibration process. However, due to notable human variability across individuals and within individuals over time, calibration (training) data collection is non-negligible and often laborious and time-consuming, deteriorating the practicality of SSVEP BCIs in a real-world context. This study aims to develop a cross-subject transferring approach to reduce the need for collecting training data from a test user with a newly proposed least-squares transformation (LST) method. Study results show the capability of the LST in reducing the number of training templates required for a 40-class SSVEP BCI. The LST method may lead to numerous real-world applications using near-zero-training/plug-and-play high-speed SSVEP BCIs. |
Tasks | Calibration, Transfer Learning |
Published | 2018-10-05 |
URL | http://arxiv.org/abs/1810.02842v4 |
http://arxiv.org/pdf/1810.02842v4.pdf | |
PWC | https://paperswithcode.com/paper/cross-subject-transfer-learning-improves-the |
Repo | |
Framework | |
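The abstract names a least-squares transformation but gives no formula. A minimal sketch, assuming the idea is an ordinary least-squares fit of a linear channel mapping P that minimises ||target - P * source|| (Frobenius norm), is shown below for two-channel data so that the 2x2 inverse can be written out by hand. All function names and the toy signals are hypothetical.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(row) for row in zip(*a)]


def inv2(m):
    """Inverse of a 2x2 matrix (assumes a non-zero determinant)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def fit_lst(source, target):
    """Least-squares P such that P * source approximates target.

    source, target: 2 x n_samples lists (channels by time samples).
    Closed form: P = target * source^T * (source * source^T)^-1.
    """
    st = transpose(source)
    return matmul(matmul(target, st), inv2(matmul(source, st)))


# Toy check: build the target from a known mapping and recover that mapping.
true_p = [[2.0, 0.5], [-1.0, 3.0]]
source = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, -1.0]]
target = matmul(true_p, source)
estimated_p = fit_lst(source, target)
```

In a cross-subject setting, `source` would play the role of an existing subject's trial and `target` the new user's template, so that transformed trials from other subjects can supplement the new user's scarce calibration data.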
Automated software vulnerability detection with machine learning
Title | Automated software vulnerability detection with machine learning |
Authors | Jacob A. Harer, Louis Y. Kim, Rebecca L. Russell, Onur Ozdemir, Leonard R. Kosta, Akshay Rangamani, Lei H. Hamilton, Gabriel I. Centeno, Jonathan R. Key, Paul M. Ellingwood, Erik Antelman, Alan Mackay, Marc W. McConley, Jeffrey M. Opper, Peter Chin, Tomo Lazovich |
Abstract | Thousands of security vulnerabilities are discovered in production software each year, either reported publicly to the Common Vulnerabilities and Exposures database or discovered internally in proprietary code. Vulnerabilities often manifest themselves in subtle ways that are not obvious to code reviewers or the developers themselves. With the wealth of open source code available for analysis, there is an opportunity to learn the patterns of bugs that can lead to security vulnerabilities directly from data. In this paper, we present a data-driven approach to vulnerability detection using machine learning, specifically applied to C and C++ programs. We first compile a large dataset of hundreds of thousands of open-source functions labeled with the outputs of a static analyzer. We then compare methods applied directly to source code with methods applied to artifacts extracted from the build process, finding that source-based models perform better. We also compare the application of deep neural network models with more traditional models such as random forests and find the best performance comes from combining features learned by deep models with tree-based models. Ultimately, our highest performing model achieves an area under the precision-recall curve of 0.49 and an area under the ROC curve of 0.87. |
Tasks | Vulnerability Detection |
Published | 2018-02-14 |
URL | http://arxiv.org/abs/1803.04497v2 |
http://arxiv.org/pdf/1803.04497v2.pdf | |
PWC | https://paperswithcode.com/paper/automated-software-vulnerability-detection |
Repo | |
Framework | |
Neural Compatibility Modeling with Attentive Knowledge Distillation
Title | Neural Compatibility Modeling with Attentive Knowledge Distillation |
Authors | Xuemeng Song, Fuli Feng, Xianjing Han, Xin Yang, Wei Liu, Liqiang Nie |
Abstract | Recently, the booming fashion sector and its huge potential benefits have attracted tremendous attention from many research communities. In particular, increasing research efforts have been dedicated to complementary clothing matching, as matching clothes to make a suitable outfit has become a daily headache for many people, especially those who do not have a sense of aesthetics. Thanks to the remarkable success of neural networks in various applications such as image classification and speech recognition, researchers are able to adopt data-driven learning methods to analyze fashion items. Nevertheless, existing studies overlook the rich valuable knowledge (rules) accumulated in the fashion domain, especially the rules regarding clothing matching. Towards this end, in this work, we shed light on complementary clothing matching by integrating advanced deep neural networks and rich fashion domain knowledge. Considering that the rules can be fuzzy and different rules may have different confidence levels for different samples, we present a neural compatibility modeling scheme with attentive knowledge distillation based on the teacher-student network scheme. Extensive experiments on the real-world dataset show the superiority of our model over several state-of-the-art baselines. Based upon the comparisons, we observe certain fashion insights that add value to the fashion matching study. As a byproduct, we released the code and the parameters involved to benefit other researchers. |
Tasks | Image Classification, Speech Recognition |
Published | 2018-04-17 |
URL | http://arxiv.org/abs/1805.00313v1 |
http://arxiv.org/pdf/1805.00313v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-compatibility-modeling-with-attentive |
Repo | |
Framework | |
Determining Principal Component Cardinality through the Principle of Minimum Description Length
Title | Determining Principal Component Cardinality through the Principle of Minimum Description Length |
Authors | Ami Tavory |
Abstract | PCA (Principal Component Analysis) and its variants are ubiquitous techniques for matrix dimension reduction and reduced-dimension latent-factor extraction. One significant challenge in using PCA is the choice of the number of principal components. The information-theoretic MDL (Minimum Description Length) principle gives objective compression-based criteria for model selection, but it is difficult to analytically apply its modern definition - NML (Normalized Maximum Likelihood) - to the problem of PCA. This work shows a general reduction of NML problems to lower-dimension problems. Applying this reduction, it bounds the NML of PCA by terms of the NML of linear regression, which are known. |
Tasks | Dimensionality Reduction, Model Selection |
Published | 2018-12-31 |
URL | https://arxiv.org/abs/1901.00059v2 |
https://arxiv.org/pdf/1901.00059v2.pdf | |
PWC | https://paperswithcode.com/paper/the-stochastic-complexity-of-principal |
Repo | |
Framework | |
Hyperdimensional Computing Nanosystem
Title | Hyperdimensional Computing Nanosystem |
Authors | Abbas Rahimi, Tony F. Wu, Haitong Li, Jan M. Rabaey, H. -S. Philip Wong, Max M. Shulaker, Subhasish Mitra |
Abstract | One viable solution for continuous reduction in energy-per-operation is to rethink functionality to cope with uncertainty by adopting computational approaches that are inherently robust to uncertainty. It requires a novel look at data representations, associated operations, and circuits, and at materials and substrates that enable them. 3D integrated nanotechnologies combined with novel brain-inspired computational paradigms that support fast learning and fault tolerance could lead the way. Recognizing the very size of the brain’s circuits, hyperdimensional (HD) computing can model neural activity patterns with points in a HD space, that is, with hypervectors as large randomly generated patterns. At its very core, HD computing is about manipulating and comparing these patterns inside memory. Emerging nanotechnologies such as carbon nanotube field effect transistors (CNFETs) and resistive RAM (RRAM), and their monolithic 3D integration offer opportunities for hardware implementations of HD computing through tight integration of logic and memory, energy-efficient computation, and unique device characteristics. We experimentally demonstrate and characterize an end-to-end HD computing nanosystem built using monolithic 3D integration of CNFETs and RRAM. With our nanosystem, we experimentally demonstrate classification of 21 languages with measured accuracy of up to 98% on >20,000 sentences (6.4 million characters), training using one text sample (~100,000 characters) per language, and resilient operation (98% accuracy) despite 78% hardware errors in HD representation (outputs stuck at 0 or 1). By exploiting the unique properties of the underlying nanotechnologies, we show that HD computing, when implemented with monolithic 3D integration, can be up to 420X more energy-efficient while using 25X less area compared to traditional silicon CMOS implementations. |
Tasks | |
Published | 2018-11-23 |
URL | http://arxiv.org/abs/1811.09557v1 |
http://arxiv.org/pdf/1811.09557v1.pdf | |
PWC | https://paperswithcode.com/paper/hyperdimensional-computing-nanosystem |
Repo | |
Framework | |
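The abstract describes HD computing as manipulating and comparing large random patterns. A minimal software sketch of that idea follows, assuming bipolar hypervectors, character-trigram encoding, and majority-vote bundling; these are common choices in the HD computing literature, not details confirmed by the abstract. The dimensionality, function names, and toy training sentences are all hypothetical, and the code says nothing about the CNFET/RRAM hardware.

```python
import random

D = 10000  # hypervector dimensionality (assumed for this sketch)


def random_hv(rng):
    """Random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(D)]


def bind(a, b):
    """Elementwise multiplication; the result is dissimilar to both inputs."""
    return [x * y for x, y in zip(a, b)]


def rotate(a, k):
    """Cyclic shift by k positions, used to mark position inside an n-gram."""
    k %= len(a)
    return a[-k:] + a[:-k] if k else list(a)


def bundle(hvs):
    """Elementwise majority vote (ties go to +1); similar to each input."""
    return [1 if sum(col) >= 0 else -1 for col in zip(*hvs)]


def similarity(a, b):
    """Cosine similarity; for bipolar vectors this is the dot product over D."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


def encode_text(text, items, rng):
    """Bundle the hypervectors of all character trigrams in text."""
    for ch in text:
        if ch not in items:
            items[ch] = random_hv(rng)  # item memory is filled lazily
    grams = [
        bind(bind(rotate(items[a], 2), rotate(items[b], 1)), items[c])
        for a, b, c in zip(text, text[1:], text[2:])
    ]
    return bundle(grams)


def classify(text, profiles, items, rng):
    """Return the language whose profile is most similar to the query."""
    query = encode_text(text, items, rng)
    return max(profiles, key=lambda lang: similarity(query, profiles[lang]))


rng = random.Random(0)
items = {}
training = {
    "english": "the quick brown fox jumps over the lazy dog",
    "german": "der schnelle braune fuchs springt ueber den faulen hund",
}
profiles = {lang: encode_text(txt, items, rng) for lang, txt in training.items()}
```

Calling `classify("some new sentence", profiles, items, rng)` compares the query hypervector against each stored profile. Because information is spread across all D components, flipping or freezing a fraction of them degrades similarity gradually, which is the robustness property the abstract relies on.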
“I ain’t tellin’ white folks nuthin”: A quantitative exploration of the race-related problem of candour in the WPA slave narratives
Title | “I ain’t tellin’ white folks nuthin”: A quantitative exploration of the race-related problem of candour in the WPA slave narratives |
Authors | Soumya Kambhampati |
Abstract | From 1936-38, the Works Progress Administration interviewed thousands of former slaves about their life experiences. While these interviews are crucial to understanding the “peculiar institution” from the standpoint of the slave himself, issues relating to bias cloud analyses of these interviews. The problem I investigate is the problem of candour in the WPA slave narratives: it is widely held in the historical community that the strict racial caste system of the Deep South compelled black ex-slaves to tell white interviewers what they thought they wanted to hear, suggesting that there was a significant difference in candour depending on whether their interviewer was white or black. In this work, I attempt to quantitatively characterise this race-related problem of candour. Prior work has either been of an impressionistic, qualitative nature, or utilised exceedingly simple quantitative methodology. In contrast, I use more sophisticated statistical methods: in particular word frequency and sentiment analysis and comparative topic modelling with LDA to try and identify differences in the content and sentiment expressed by ex-slaves in front of white interviewers versus black interviewers. While my sentiment analysis methodology was ultimately unsuccessful due to the complexity of the task, my word frequency analysis and comparative topic modelling methods both showed strong evidence that the content expressed in front of white interviewers was different from that expressed in front of black interviewers. In particular, I found that the ex-slaves spoke much more about unfavourable aspects of slavery like whipping and slave patrollers in front of interviewers of their own race. I hope that my more-sophisticated statistical methodology helps improve the robustness of the argument for the existence of this problem of candour in the slave narratives, which some would seek to deny for revisionist purposes. |
Tasks | Sentiment Analysis |
Published | 2018-05-01 |
URL | http://arxiv.org/abs/1805.00471v1 |
http://arxiv.org/pdf/1805.00471v1.pdf | |
PWC | https://paperswithcode.com/paper/i-aint-tellin-white-folks-nuthin-a |
Repo | |
Framework | |
A Simple Approach to Intrinsic Correspondence Learning on Unstructured 3D Meshes
Title | A Simple Approach to Intrinsic Correspondence Learning on Unstructured 3D Meshes |
Authors | Isaak Lim, Alexander Dielen, Marcel Campen, Leif Kobbelt |
Abstract | The question of representation of 3D geometry is of vital importance when it comes to leveraging the recent advances in the field of machine learning for geometry processing tasks. For common unstructured surface meshes state-of-the-art methods rely on patch-based or mapping-based techniques that introduce resampling operations in order to encode neighborhood information in a structured and regular manner. We investigate whether such resampling can be avoided, and propose a simple and direct encoding approach. It does not only increase processing efficiency due to its simplicity - its direct nature also avoids any loss in data fidelity. To evaluate the proposed method, we perform a number of experiments in the challenging domain of intrinsic, non-rigid shape correspondence estimation. In comparisons to current methods we observe that our approach is able to achieve highly competitive results. |
Tasks | |
Published | 2018-09-18 |
URL | http://arxiv.org/abs/1809.06664v2 |
http://arxiv.org/pdf/1809.06664v2.pdf | |
PWC | https://paperswithcode.com/paper/a-simple-approach-to-intrinsic-correspondence |
Repo | |
Framework | |
Generic adaptation strategies for automated machine learning
Title | Generic adaptation strategies for automated machine learning |
Authors | Rashid Bakirov, Bogdan Gabrys, Damien Fay |
Abstract | Automation of machine learning model development is increasingly becoming an established research area. While automated model selection and automated data pre-processing have been studied in depth, there is, however, a gap concerning automated model adaptation strategies when multiple strategies are available. Manually developing an adaptation strategy, including estimation of relevant parameters can be time consuming and costly. In this paper we address this issue by proposing generic adaptation strategies based on approaches from earlier works. Experimental results after using the proposed strategies with three adaptive algorithms on 36 datasets confirm their viability. These strategies often achieve better or comparable performance with custom adaptation strategies and naive methods such as repeatedly using only one adaptive mechanism. |
Tasks | Model Selection |
Published | 2018-12-27 |
URL | https://arxiv.org/abs/1812.10793v2 |
https://arxiv.org/pdf/1812.10793v2.pdf | |
PWC | https://paperswithcode.com/paper/generic-adaptation-strategies-for-automated |
Repo | |
Framework | |
Multiple Character Embeddings for Chinese Word Segmentation
Title | Multiple Character Embeddings for Chinese Word Segmentation |
Authors | Jingkang Wang, Jianing Zhou, Jie Zhou, Gongshen Liu |
Abstract | Chinese word segmentation (CWS) is often regarded as a character-based sequence labeling task in most current works which have achieved great success with the help of powerful neural networks. However, these works neglect an important clue: Chinese characters incorporate both semantic and phonetic meanings. In this paper, we introduce multiple character embeddings including Pinyin Romanization and Wubi Input, both of which are easily accessible and effective in depicting semantics of characters. We propose a novel shared Bi-LSTM-CRF model to fuse linguistic features efficiently by sharing the LSTM network during the training procedure. Extensive experiments on five corpora show that extra embeddings help obtain a significant improvement in labeling accuracy. Specifically, we achieve the state-of-the-art performance in AS and CityU corpora with F1 scores of 96.9 and 97.3, respectively without leveraging any external lexical resources. |
Tasks | Chinese Word Segmentation |
Published | 2018-08-15 |
URL | https://arxiv.org/abs/1808.04963v3 |
https://arxiv.org/pdf/1808.04963v3.pdf | |
PWC | https://paperswithcode.com/paper/multiple-character-embeddings-for-chinese |
Repo | |
Framework | |