Paper Group ANR 797
Deep Predictive Coding Network for Object Recognition
Title | Deep Predictive Coding Network for Object Recognition |
Authors | Haiguang Wen, Kuan Han, Junxing Shi, Yizhen Zhang, Eugenio Culurciello, Zhongming Liu |
Abstract | Based on the predictive coding theory in neuroscience, we designed a bi-directional and recurrent neural net, the deep predictive coding network (PCN). It has feedforward, feedback, and recurrent connections. Feedback connections from a higher layer carry the prediction of its lower-layer representation; feedforward connections carry the prediction errors to the higher layer. Given image input, PCN runs recursive cycles of bottom-up and top-down computation to update its internal representations and reduce the difference between bottom-up input and top-down prediction at every layer. After multiple cycles of recursive updating, the representation is used for image classification. With benchmark data (CIFAR-10/100, SVHN, and MNIST), PCN was found to always outperform its feedforward-only counterpart: a model without any mechanism for recurrent dynamics. Its performance tended to improve given more cycles of computation over time. In short, PCN reuses a single architecture to recursively run bottom-up and top-down processes. As a dynamical system, PCN can be unfolded to a feedforward model that becomes deeper and deeper over time, while refining its representation towards more accurate and definitive object recognition. |
Tasks | Image Classification, Object Recognition |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04762v2 |
http://arxiv.org/pdf/1802.04762v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-predictive-coding-network-for-object |
Repo | |
Framework | |
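A minimal sketch of the recursive cycle the abstract describes: a top-down pass predicts the lower-layer representation, the prediction error is carried back up, and the representation is refined. This is a single-layer linear toy with tied weights in NumPy, not the paper's convolutional PCN; all sizes and the learning rate are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes: a flattened "image" and one hidden representation layer.
d_in, d_hid, n_cycles, lr = 64, 32, 5, 0.1

W_ff = rng.normal(scale=0.1, size=(d_hid, d_in))  # feedforward weights
W_fb = W_ff.T                                     # feedback (tied) weights

x = rng.normal(size=d_in)  # input image
r = W_ff @ x               # initial bottom-up representation

for t in range(n_cycles):
    x_hat = W_fb @ r              # top-down prediction of the lower layer
    err = x - x_hat               # prediction error at the lower layer
    r = r + lr * (W_ff @ err)     # error travels up and refines r
    print(f"cycle {t}: ||prediction error|| = {np.linalg.norm(err):.4f}")
```

Each cycle performs gradient descent on the reconstruction error with respect to the representation, so the error norm shrinks over time, mirroring the "deeper over time" unfolding the abstract mentions.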
An Experimental Evaluation of Covariates Effects on Unconstrained Face Verification
Title | An Experimental Evaluation of Covariates Effects on Unconstrained Face Verification |
Authors | Boyu Lu, Jun-Cheng Chen, Carlos D. Castillo, Rama Chellappa |
Abstract | Covariates are factors that have a debilitating influence on face verification performance. In this paper, we comprehensively study two covariate-related problems for unconstrained face verification: first, how covariates affect the performance of deep neural networks on the large-scale unconstrained face verification problem; second, how to utilize covariates to improve verification performance. To study the first problem, we implement five state-of-the-art deep convolutional networks (DCNNs) for face verification and evaluate them on three challenging covariate datasets. In total, seven covariates are considered: pose (yaw and roll), age, facial hair, gender, indoor/outdoor, occlusion (nose and mouth visibility, eyes visibility, and forehead visibility), and skin tone. These covariates cover both intrinsic subject-specific characteristics and extrinsic factors of faces. Some of the results confirm and extend the findings of previous studies, while others are new findings that were rarely mentioned previously or did not show consistent trends. For the second problem, we demonstrate that with the assistance of gender information, the quality of a pre-curated noisy large-scale face dataset for face recognition can be further improved. After retraining the face recognition model using the curated data, performance improvement is observed at low False Acceptance Rates (FARs) (FAR=$10^{-5}$, $10^{-6}$, $10^{-7}$). |
Tasks | Face Recognition, Face Verification |
Published | 2018-08-16 |
URL | http://arxiv.org/abs/1808.05508v1 |
http://arxiv.org/pdf/1808.05508v1.pdf | |
PWC | https://paperswithcode.com/paper/an-experimental-evaluation-of-covariates |
Repo | |
Framework | |
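The low-FAR operating points quoted in the abstract are obtained by thresholding impostor similarity scores. A small sketch of that computation on synthetic score distributions; the Gaussians here are illustrative assumptions, not the paper's data.

```python
import numpy as np

def tar_at_far(genuine, impostor, far):
    """True accept rate at a fixed false accept rate: set the threshold
    so that a fraction `far` of impostor scores exceed it, then measure
    the fraction of genuine scores above that threshold."""
    thresh = np.quantile(impostor, 1.0 - far)
    return float(np.mean(genuine >= thresh))

rng = np.random.default_rng(0)
genuine = rng.normal(0.7, 0.1, 100_000)      # toy same-identity scores
impostor = rng.normal(0.3, 0.1, 1_000_000)   # toy different-identity scores
for far in (1e-3, 1e-4, 1e-5):
    print(f"TAR @ FAR={far:g}: {tar_at_far(genuine, impostor, far):.3f}")
```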
LUCSS: Language-based User-customized Colourization of Scene Sketches
Title | LUCSS: Language-based User-customized Colourization of Scene Sketches |
Authors | Changqing Zou, Haoran Mo, Ruofei Du, Xing Wu, Chengying Gao, Hongbo Fu |
Abstract | We introduce LUCSS, a language-based system for interactive colorization of scene sketches, based on their semantic understanding. LUCSS is built upon deep neural networks trained via a large-scale repository of scene sketches and cartoon-style color images with text descriptions. It consists of three sequential modules. First, given a scene sketch, the segmentation module automatically partitions an input sketch into individual object instances. Next, the captioning module generates the text description with spatial relationships based on the instance-level segmentation results. Finally, the interactive colorization module allows users to edit the caption and produce colored images based on the altered caption. Our experiments show the effectiveness of our approach and the advantages of its components over alternative choices. |
Tasks | Colorization |
Published | 2018-08-30 |
URL | http://arxiv.org/abs/1808.10544v1 |
http://arxiv.org/pdf/1808.10544v1.pdf | |
PWC | https://paperswithcode.com/paper/lucss-language-based-user-customized |
Repo | |
Framework | |
Automatic Processing and Solar Cell Detection in Photovoltaic Electroluminescence Images
Title | Automatic Processing and Solar Cell Detection in Photovoltaic Electroluminescence Images |
Authors | Evgenii Sovetkin, Ansgar Steland |
Abstract | Electroluminescence (EL) imaging is a powerful and established technique for assessing the quality of photovoltaic (PV) modules, which consist of many electrically connected solar cells arranged in a grid. The analysis of imperfect real-world images requires reliable methods for preprocessing, detection and extraction of the cells. We propose several methods for those tasks, which, however, can be modified to related imaging problems where similar geometric objects need to be detected accurately. Allowing for images taken under difficult outdoor conditions, we present methods to correct for rotation and perspective distortions. The next important step is the extraction of the solar cells of a PV module, for instance to pass them to a procedure to detect and analyze defects on their surface. We propose a method based on specialized Hough transforms, which allows extraction of the cells even when the module is surrounded by a disturbing background, and a fast method based on cumulated sums (CUSUM) change detection to extract the cell area of a single-cell mini-module, where the correction of perspective distortion is done implicitly. The methods are highly automated to allow for big-data analyses. Their application to a large database of EL images substantiates that the methods work reliably on a large scale for real-world images. Simulations show that the approach achieves high accuracy, reliability and robustness. This even holds for low-contrast images, as evaluated by comparing the simulated accuracy for a low-contrast and a high-contrast image. |
Tasks | |
Published | 2018-07-26 |
URL | http://arxiv.org/abs/1807.10820v1 |
http://arxiv.org/pdf/1807.10820v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-processing-and-solar-cell-detection |
Repo | |
Framework | |
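The abstract's fast method for single-cell mini-modules relies on cumulated sums (CUSUM) change detection. A minimal 1-D sketch of the underlying change-point statistic on a toy intensity profile; the paper's procedure operates on real 2-D EL images and is considerably more involved.

```python
import numpy as np

def cusum_change_point(signal):
    """Most likely location of a single mean shift: the cumulated sum of
    deviations from the global mean is most extreme at the change."""
    s = np.cumsum(signal - signal.mean())
    return int(np.argmax(np.abs(s)))

# Toy profile: dark background pixels followed by the brighter cell area.
rng = np.random.default_rng(0)
profile = np.r_[np.full(40, 0.1), np.full(60, 0.9)]
profile += rng.normal(0, 0.05, profile.size)
print("estimated cell edge near index", cusum_change_point(profile))  # ~40
```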
Augmenting Recurrent Neural Networks with High-Order User-Contextual Preference for Session-Based Recommendation
Title | Augmenting Recurrent Neural Networks with High-Order User-Contextual Preference for Session-Based Recommendation |
Authors | Younghun Song, Jae-Gil Lee |
Abstract | The recent adoption of recurrent neural networks (RNNs) for session modeling has yielded substantial performance gains compared to previous approaches. In terms of context-aware session modeling, however, the existing RNN-based models are limited in that they are not designed to explicitly model rich static user-side contexts (e.g., age, gender, location). Therefore, in this paper, we explore the utility of explicit user-side context modeling for RNN session models. Specifically, we propose an augmented RNN (ARNN) model that extracts high-order user-contextual preference using the product-based neural network (PNN) in order to augment any existing RNN session model. Evaluation results show that our proposed model outperforms the baseline RNN session model by a large margin when rich user-side contexts are available. |
Tasks | Session-Based Recommendations |
Published | 2018-05-08 |
URL | http://arxiv.org/abs/1805.02983v1 |
http://arxiv.org/pdf/1805.02983v1.pdf | |
PWC | https://paperswithcode.com/paper/augmenting-recurrent-neural-networks-with |
Repo | |
Framework | |
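The key ingredient the abstract names is a product-based neural network (PNN) over static user-side contexts. A sketch of the product layer on toy embeddings; the embedding tables, vocabulary sizes, and context fields here are assumptions for illustration.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
d = 8  # embedding size

# Toy static user-side contexts mapped through (random stand-in) tables.
context_ids = {"age_band": 3, "gender": 1, "location": 17}
tables = {k: rng.normal(size=(32, d)) for k in context_ids}  # vocab 32 each
emb = [tables[k][v] for k, v in context_ids.items()]

# Product layer: pairwise inner products model cross-context interactions.
inner = np.array([a @ b for a, b in combinations(emb, 2)])
pnn_features = np.concatenate([np.concatenate(emb), inner])
print(pnn_features.shape)  # (27,); concatenated with the RNN session state
```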
Dynamic Advisor-Based Ensemble (dynABE): Case study in stock trend prediction of critical metal companies
Title | Dynamic Advisor-Based Ensemble (dynABE): Case study in stock trend prediction of critical metal companies |
Authors | Zhengyang Dong |
Abstract | Stock trend prediction is a challenging task due to the market’s noise, and machine learning techniques have recently been successful in coping with this challenge. In this research, we create a novel framework for stock prediction, Dynamic Advisor-Based Ensemble (dynABE). dynABE explores domain-specific areas based on the companies of interest, diversifies the feature set by creating different “advisors” that each handle a different area, follows an effective model-ensemble procedure for each advisor, and combines the advisors together in a second-level ensemble through an online update strategy we developed. dynABE is able to adapt robustly to price-pattern changes of the market during the active trading period, without needing to retrain the entire model. We test dynABE on three cobalt-related companies, and it achieves a best-case misclassification error of 31.12% and an annualized absolute return of 359.55% with zero maximum drawdown. dynABE also consistently outperforms the baseline models of support vector machine, neural network, and random forest in all case studies. |
Tasks | Stock Prediction, Stock Trend Prediction, Time Series |
Published | 2018-05-24 |
URL | http://arxiv.org/abs/1805.12111v4 |
http://arxiv.org/pdf/1805.12111v4.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-advisor-based-ensemble-dynabe-case |
Repo | |
Framework | |
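The second-level ensemble adapts advisor weights online instead of retraining. A multiplicative-weights stand-in for that update; the decay factor, advisors, and predictions here are toy assumptions, and the paper's exact rule may differ.

```python
import numpy as np

def update_weights(weights, correct, eta=0.5):
    """Down-weight advisors whose trend call was wrong, then renormalize.
    (A stand-in for dynABE's online update, not the paper's exact rule.)"""
    w = weights * np.where(correct, 1.0, 1.0 - eta)
    return w / w.sum()

weights = np.full(3, 1 / 3)      # three advisors, equal weight at the start
calls = np.array([1, 1, 0])      # today's up(1)/down(0) predictions
actual = 1
weights = update_weights(weights, calls == actual)
print(weights)  # [0.4 0.4 0.2]: adapts to regime changes without retraining
```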
Augmenting Physical Simulators with Stochastic Neural Networks: Case Study of Planar Pushing and Bouncing
Title | Augmenting Physical Simulators with Stochastic Neural Networks: Case Study of Planar Pushing and Bouncing |
Authors | Anurag Ajay, Jiajun Wu, Nima Fazeli, Maria Bauza, Leslie P. Kaelbling, Joshua B. Tenenbaum, Alberto Rodriguez |
Abstract | An efficient, generalizable physical simulator with universal uncertainty estimates has wide applications in robot state estimation, planning, and control. In this paper, we build such a simulator for two scenarios, planar pushing and ball bouncing, by augmenting an analytical rigid-body simulator with a neural network that learns to model uncertainty as residuals. Combining symbolic, deterministic simulators with learnable, stochastic neural nets provides us with expressiveness, efficiency, and generalizability simultaneously. Our model outperforms both purely analytical and purely learned simulators consistently on real, standard benchmarks. Compared with methods that model uncertainty using Gaussian processes, our model runs much faster, generalizes better to new object shapes, and is able to characterize the complex distribution of object trajectories. |
Tasks | Gaussian Processes |
Published | 2018-08-09 |
URL | http://arxiv.org/abs/1808.03246v1 |
http://arxiv.org/pdf/1808.03246v1.pdf | |
PWC | https://paperswithcode.com/paper/augmenting-physical-simulators-with |
Repo | |
Framework | |
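The core idea, an analytical step plus a learned stochastic residual, fits in a few lines. Here the "learned" residual is a fixed toy function standing in for the paper's neural network, and the one-dimensional bounce model is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def analytic_step(v, restitution=0.8):
    """Idealized bounce: velocity flips sign and shrinks by restitution."""
    return -restitution * v

def residual(v):
    """Stand-in for the learned net: a mean correction and a noise scale
    (the paper trains a stochastic neural network for this part)."""
    return 0.02 * v, 0.05 * abs(v)

v_impact = -3.0                      # downward velocity at impact
mu = analytic_step(v_impact)
d_mu, sigma = residual(v_impact)
samples = mu + d_mu + sigma * rng.normal(size=1000)
print(f"analytic: {mu:.2f}; hybrid: {samples.mean():.2f} ± {samples.std():.2f}")
```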
Instant Automated Inference of Perceived Mental Stress through Smartphone PPG and Thermal Imaging
Title | Instant Automated Inference of Perceived Mental Stress through Smartphone PPG and Thermal Imaging |
Authors | Youngjun Cho, Simon J. Julier, Nadia Bianchi-Berthouze |
Abstract | Background: A smartphone is a promising tool for daily cardiovascular measurement and mental stress monitoring. A smartphone camera-based PhotoPlethysmoGraphy (PPG) and a low-cost thermal camera can be used to create cheap, convenient and mobile monitoring systems. However, to ensure reliable monitoring results, a person has to remain still for several minutes while a measurement is being taken. This is very cumbersome and makes its use in real-life mobile situations quite impractical. Objective: We propose a system which combines PPG and thermography with the aim of improving cardiovascular signal quality and capturing stress responses quickly. Methods: Using a smartphone camera with a low-cost thermal camera add-on, we built a novel system which continuously and reliably measures two different types of cardiovascular events: i) blood volume pulse and ii) vasoconstriction/dilation-induced temperature changes of the nose tip. 17 healthy participants, involved in a series of stress-inducing mental workload tasks, measured their physiological responses to stressors over a short window of time (20 seconds) immediately after each task. Participants reported their level of perceived mental stress using a 10-cm Visual Analogue Scale (VAS). We used normalized K-means clustering to reduce interpersonal differences in the self-reported ratings. For the instant stress inference task, we built novel low-level feature sets representing the variability of cardiovascular patterns. We then used the automatic feature learning capability of artificial Neural Networks (NN) to improve the mapping between the extracted set of features and the self-reported ratings. We compared our proposed method with existing machine learning methods based on hand-engineered features. Results, Conclusions: … due to limited space here, we refer to our manuscript. |
Tasks | Photoplethysmography (PPG) |
Published | 2018-12-21 |
URL | http://arxiv.org/abs/1901.00449v1 |
http://arxiv.org/pdf/1901.00449v1.pdf | |
PWC | https://paperswithcode.com/paper/instant-automated-inference-of-perceived |
Repo | |
Framework | |
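One concrete step in the abstract is reducing interpersonal differences in self-reported VAS ratings with normalized K-means. A sketch of one plausible reading of that step: per-participant normalization followed by 1-D k-means into low/high stress. The ratings and the two-cluster choice are assumptions, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)
ratings = rng.uniform(0, 10, size=(17, 6))  # toy VAS: 17 people x 6 tasks

# Per-participant z-scoring removes individual rating bias and scale.
z = (ratings - ratings.mean(1, keepdims=True)) / ratings.std(1, keepdims=True)

# Plain 1-D k-means (k=2) on the pooled normalized ratings.
x = z.ravel()
centers = np.array([x.min(), x.max()])
for _ in range(20):
    labels = np.abs(x[:, None] - centers).argmin(axis=1)
    centers = np.array([x[labels == k].mean() for k in (0, 1)])
print("low/high stress cluster centers:", np.round(centers, 2))
```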
Adversarial Constraint Learning for Structured Prediction
Title | Adversarial Constraint Learning for Structured Prediction |
Authors | Hongyu Ren, Russell Stewart, Jiaming Song, Volodymyr Kuleshov, Stefano Ermon |
Abstract | Constraint-based learning reduces the burden of collecting labels by having users specify general properties of structured outputs, such as constraints imposed by physical laws. We propose a novel framework for simultaneously learning these constraints and using them for supervision, bypassing the difficulty of using domain expertise to manually specify constraints. Learning requires a black-box simulator of structured outputs, which generates valid labels, but need not model their corresponding inputs or the input-label relationship. At training time, we constrain the model to produce outputs that cannot be distinguished from simulated labels by adversarial training. Providing our framework with a small number of labeled inputs gives rise to a new semi-supervised structured prediction model; we evaluate this model on multiple tasks — tracking, pose estimation and time series prediction — and find that it achieves high accuracy with only a small number of labeled inputs. In some cases, no labels are required at all. |
Tasks | Pose Estimation, Structured Prediction, Time Series, Time Series Prediction |
Published | 2018-05-27 |
URL | http://arxiv.org/abs/1805.10561v2 |
http://arxiv.org/pdf/1805.10561v2.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-constraint-learning-for |
Repo | |
Framework | |
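The training scheme is adversarial: a discriminator learns to tell simulated (valid) labels from model outputs, and the model learns to fool it. A compact PyTorch sketch with toy shapes and a toy simulator; the architectures, dimensions, and "constraint" below are assumptions, not the paper's experimental setup.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
disc = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))
opt_m = torch.optim.Adam(model.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def simulator(n):
    """Black-box source of valid structured outputs (toy 'constraint':
    the second output coordinate stays near zero)."""
    return torch.randn(n, 2) * torch.tensor([1.0, 0.1])

for step in range(200):
    x = torch.randn(32, 4)
    fake, real = model(x), simulator(32)
    # Discriminator: separate simulated labels from model outputs.
    d_loss = (bce(disc(real), torch.ones(32, 1))
              + bce(disc(fake.detach()), torch.zeros(32, 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Model: produce outputs the discriminator accepts as valid.
    g_loss = bce(disc(fake), torch.ones(32, 1))
    opt_m.zero_grad(); g_loss.backward(); opt_m.step()
```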
Pairwise Relational Networks for Face Recognition
Title | Pairwise Relational Networks for Face Recognition |
Authors | Bong-Nam Kang, Yonghyun Kim, Daijin Kim |
Abstract | With existing deep-neural-network-based face recognition, it is difficult to know clearly what kinds of features are used to discriminate the identities of face images. To investigate the effective features for face recognition, we propose a novel face recognition method, called a pairwise relational network (PRN), that obtains local appearance patches around landmark points on the feature map, and captures the pairwise relation between a pair of local appearance patches. The PRN is trained to capture unique and discriminative pairwise relations among different identities. Because the existence and meaning of pairwise relations should be identity dependent, we add a face identity state feature, obtained from a long short-term memory (LSTM) network over the sequential local appearance patches on the feature maps, to the PRN. To further improve the accuracy of face recognition, we combined the global appearance representation with the pairwise relational feature. Experimental results on the LFW show that the PRN using only pairwise relations achieved 99.65% accuracy and the PRN using both pairwise relations and the face identity state feature achieved 99.76% accuracy. On the YTF, both the PRN using only pairwise relations and the PRN using pairwise relations and the face identity state feature achieved the state of the art (95.7% and 96.3%). The PRN also achieved results comparable to the state of the art for both face verification and face identification tasks on the IJB-A, and the state of the art on the IJB-B. |
Tasks | Face Identification, Face Recognition, Face Verification |
Published | 2018-08-15 |
URL | http://arxiv.org/abs/1808.04976v1 |
http://arxiv.org/pdf/1808.04976v1.pdf | |
PWC | https://paperswithcode.com/paper/pairwise-relational-networks-for-face |
Repo | |
Framework | |
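The pairwise relational feature can be sketched directly: take local appearance features at landmark points, run every pair through a shared relation module, and pool. The linear relation module and all sizes below are toy stand-ins for the paper's MLP.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
n_landmarks, d, d_rel = 5, 16, 8
patches = rng.normal(size=(n_landmarks, d))   # local appearance features
W = rng.normal(scale=0.1, size=(d_rel, 2 * d))

def relate(a, b):
    """Shared relation module (toy linear + ReLU stand-in for the MLP)."""
    return np.maximum(W @ np.concatenate([a, b]), 0.0)

relations = [relate(patches[i], patches[j])
             for i, j in combinations(range(n_landmarks), 2)]
prn_feature = np.sum(relations, axis=0)  # order-invariant aggregation
print(prn_feature.shape)                 # (8,): one pooled relational feature
```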
Distributed Learning of Average Belief Over Networks Using Sequential Observations
Title | Distributed Learning of Average Belief Over Networks Using Sequential Observations |
Authors | Kaiqing Zhang, Yang Liu, Ji Liu, Mingyan Liu, Tamer Başar |
Abstract | This paper addresses the problem of distributed learning of average belief with sequential observations, in which a network of $n>1$ agents aim to reach a consensus on the average value of their beliefs, by exchanging information only with their neighbors. Each agent has sequentially arriving samples of its belief in an online manner. The neighbor relationships among the $n$ agents are described by a graph which is possibly time-varying, whose vertices correspond to agents and whose edges depict neighbor relationships. Two distributed online algorithms are introduced for undirected and directed graphs, which are both shown to converge to the average belief almost surely. Moreover, the sequences generated by both algorithms are shown to reach consensus with an $O(1/t)$ rate with high probability, where $t$ is the number of iterations. For undirected graphs, the corresponding algorithm is modified for the case with quantized communication and limited precision of the division operation. It is shown that the modified algorithm causes all $n$ agents to either reach a quantized consensus or enter a small neighborhood around the average of their beliefs. Numerical simulations are then provided to corroborate the theoretical results. |
Tasks | |
Published | 2018-11-19 |
URL | http://arxiv.org/abs/1811.07799v1 |
http://arxiv.org/pdf/1811.07799v1.pdf | |
PWC | https://paperswithcode.com/paper/distributed-learning-of-average-belief-over |
Repo | |
Framework | |
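A toy consensus-plus-innovations loop in the spirit of the undirected-graph algorithm: mix estimates with neighbors through a doubly stochastic matrix while averaging in new samples with an O(1/t) step size. The graph, noise model, and step rule here are illustrative assumptions; the paper's two algorithms differ in detail.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 5, 500
beliefs = rng.normal(5.0, 2.0, n)        # each agent's true mean belief

# Doubly stochastic mixing matrix for a fixed undirected ring graph.
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.25

x = np.zeros(n)                          # agents' running estimates
for t in range(1, T + 1):
    samples = beliefs + rng.normal(0, 1.0, n)  # noisy sequential samples
    x = W @ x + (1.0 / t) * (samples - x)      # mix, then fold in innovations
print("estimates:", np.round(x, 2), "| average belief:",
      round(float(beliefs.mean()), 2))
```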
Deep Sketch-Photo Face Recognition Assisted by Facial Attributes
Title | Deep Sketch-Photo Face Recognition Assisted by Facial Attributes |
Authors | Seyed Mehdi Iranmanesh, Hadi Kazemi, Sobhan Soleymani, Ali Dabouei, Nasser M. Nasrabadi |
Abstract | In this paper, we present a deep coupled framework to address the problem of matching a sketch image against a gallery of mugshots. Face sketches contain the essential information about the spatial topology and geometric details of faces while missing some important facial attributes such as ethnicity, hair, eye, and skin color. We propose a coupled deep neural network architecture which utilizes facial attributes in order to improve the sketch-photo recognition performance. The proposed Attribute-Assisted Deep Convolutional Neural Network (AADCNN) method exploits the facial attributes and leverages the loss functions from the facial attribute identification and face verification tasks in order to learn rich discriminative features in a common embedding subspace. The facial attribute identification task increases the inter-personal variations by pushing apart the embedded features extracted from individuals with different facial attributes, while the verification task reduces the intra-personal variations by pulling together all the features that are related to one person. The learned discriminative features can be well generalized to new identities not seen in the training data. The proposed architecture is able to make full use of the sketch and complementary facial attribute information to train a deep model, compared to conventional sketch-photo recognition methods. Extensive experiments are performed on composite (E-PRIP) and semi-forensic (IIIT-D semi-forensic) datasets. The results show the superiority of our method over state-of-the-art sketch-photo recognition algorithms. |
Tasks | Face Recognition, Face Verification |
Published | 2018-07-31 |
URL | http://arxiv.org/abs/1808.00059v1 |
http://arxiv.org/pdf/1808.00059v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-sketch-photo-face-recognition-assisted |
Repo | |
Framework | |
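The two losses the abstract pairs, attribute identification to push apart embeddings of different-attribute individuals and verification to pull same-identity pairs together, can be combined in a short PyTorch sketch. All architectures, dimensions, and the contrastive form are assumptions, not the paper's exact networks.

```python
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 32))
attr_head = nn.Linear(32, 4)         # toy facial-attribute classifier
ce = nn.CrossEntropyLoss()

def contrastive(e1, e2, same, margin=1.0):
    """Pull same-identity pairs together, push others past the margin."""
    dist = torch.norm(e1 - e2, dim=1)
    return torch.mean(same * dist**2
                      + (1 - same) * torch.clamp(margin - dist, min=0)**2)

sketch, photo = torch.randn(8, 128), torch.randn(8, 128)  # toy feature pairs
attrs = torch.randint(0, 4, (8,))          # attribute labels for the sketches
same = torch.randint(0, 2, (8,)).float()   # 1 if the pair shares an identity

e_s, e_p = backbone(sketch), backbone(photo)
loss = ce(attr_head(e_s), attrs) + contrastive(e_s, e_p, same)
loss.backward()
```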
Offline EEG-Based Driver Drowsiness Estimation Using Enhanced Batch-Mode Active Learning (EBMAL) for Regression
Title | Offline EEG-Based Driver Drowsiness Estimation Using Enhanced Batch-Mode Active Learning (EBMAL) for Regression |
Authors | Dongrui Wu, Vernon J. Lawhern, Stephen Gordon, Brent J. Lance, Chin-Teng Lin |
Abstract | There are many important regression problems in real-world brain-computer interface (BCI) applications, e.g., driver drowsiness estimation from EEG signals. This paper considers offline analysis: given a pool of unlabeled EEG epochs recorded during driving, how do we optimally select a small number of them to label so that an accurate regression model can be built from them to label the rest? Active learning is a promising solution to this problem, but interestingly, to the best of our knowledge, it has not been used for regression problems in BCI so far. This paper proposes a novel enhanced batch-mode active learning (EBMAL) approach for regression, which improves upon a baseline active learning algorithm by increasing the reliability, representativeness and diversity of the selected samples to achieve better regression performance. We validate its effectiveness using driver drowsiness estimation from EEG signals. However, EBMAL is a general approach that can also be applied to many other offline regression problems beyond BCI. |
Tasks | Active Learning, EEG |
Published | 2018-05-12 |
URL | http://arxiv.org/abs/1805.04737v1 |
http://arxiv.org/pdf/1805.04737v1.pdf | |
PWC | https://paperswithcode.com/paper/offline-eeg-based-driver-drowsiness |
Repo | |
Framework | |
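The selection criteria the abstract names (reliability, representativeness, diversity) can be approximated by clustering the unlabeled pool and querying the sample nearest each centroid. This is a simplified stand-in, not the paper's full EBMAL procedure; the features and pool are synthetic.

```python
import numpy as np

def cluster_based_select(X, k, iters=20, seed=0):
    """Pick k diverse, representative pool samples: run plain k-means,
    then query the point nearest each centroid."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)].copy()
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(0)
    return sorted({int(np.argmin(((X - c) ** 2).sum(1))) for c in centers})

X = np.random.default_rng(1).normal(size=(500, 10))  # toy EEG epoch features
print("label these epochs first:", cluster_based_select(X, 5))
```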
3D Segmentation with Exponential Logarithmic Loss for Highly Unbalanced Object Sizes
Title | 3D Segmentation with Exponential Logarithmic Loss for Highly Unbalanced Object Sizes |
Authors | Ken C. L. Wong, Mehdi Moradi, Hui Tang, Tanveer Syeda-Mahmood |
Abstract | With the introduction of fully convolutional neural networks, deep learning has raised the benchmark for medical image segmentation in both speed and accuracy, and different networks have been proposed for 2D and 3D segmentation with promising results. Nevertheless, most networks only handle relatively small numbers of labels (<10), and there are very limited works on handling highly unbalanced object sizes, especially in 3D segmentation. In this paper, we propose a network architecture and the corresponding loss function which improve segmentation of very small structures. By combining skip connections and deep supervision with respect to the computational feasibility of 3D segmentation, we propose a fast-converging and computationally efficient network architecture for accurate segmentation. Furthermore, inspired by the concept of focal loss, we propose an exponential logarithmic loss which balances the labels not only by their relative sizes but also by their segmentation difficulties. We achieve an average Dice coefficient of 82% on brain segmentation with 20 labels, with the ratio of the smallest to the largest object size being 0.14%. Fewer than 100 epochs are required to reach such accuracy, and segmenting a 128x128x128 volume takes only around 0.4 s. |
Tasks | Brain Segmentation, Medical Image Segmentation, Semantic Segmentation |
Published | 2018-08-31 |
URL | http://arxiv.org/abs/1809.00076v2 |
http://arxiv.org/pdf/1809.00076v2.pdf | |
PWC | https://paperswithcode.com/paper/3d-segmentation-with-exponential-logarithmic |
Repo | |
Framework | |
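The proposed loss exponentiates the logarithms of a soft Dice term and a label-weighted cross-entropy term. A NumPy sketch following the abstract's description; the weighting exponent, gamma, and the 0.8/0.2 mix are plausible settings rather than guaranteed to match the paper exactly.

```python
import numpy as np

def exp_log_loss(probs, onehot, gamma=0.3, w_dice=0.8, w_ce=0.2, eps=1e-7):
    """Exponential logarithmic loss sketch: exponentiated -log soft Dice
    per label plus a label-frequency-weighted exponentiated cross-entropy.
    probs, onehot: (N, L) arrays of softmax outputs and one-hot labels."""
    inter = (probs * onehot).sum(0)
    dice = (2 * inter + eps) / (probs.sum(0) + onehot.sum(0) + eps)
    l_dice = np.mean((-np.log(np.clip(dice, eps, 1.0))) ** gamma)

    freq = onehot.mean(0)                               # label frequencies
    w_l = (freq.sum() / np.maximum(freq, eps)) ** 0.5   # rarer => heavier
    p_true = np.clip((probs * onehot).sum(1), eps, 1.0)
    l_ce = np.mean((onehot * w_l).sum(1) * (-np.log(p_true)) ** gamma)
    return w_dice * l_dice + w_ce * l_ce

rng = np.random.default_rng(0)
logits = rng.normal(size=(1000, 3))
probs = np.exp(logits) / np.exp(logits).sum(1, keepdims=True)
onehot = np.eye(3)[rng.integers(0, 3, 1000)]
print(f"loss: {exp_log_loss(probs, onehot):.4f}")
```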
Introducing two Vietnamese Datasets for Evaluating Semantic Models of (Dis-)Similarity and Relatedness
Title | Introducing two Vietnamese Datasets for Evaluating Semantic Models of (Dis-)Similarity and Relatedness |
Authors | Kim Anh Nguyen, Sabine Schulte im Walde, Ngoc Thang Vu |
Abstract | We present two novel datasets for the low-resource language Vietnamese to assess models of semantic similarity: ViCon comprises pairs of synonyms and antonyms across word classes, thus offering data to distinguish between similarity and dissimilarity. ViSim-400 provides degrees of similarity across five semantic relations, as rated by human judges. The two datasets are verified through standard co-occurrence and neural network models, showing results comparable to the respective English datasets. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2018-04-15 |
URL | http://arxiv.org/abs/1804.05388v2 |
http://arxiv.org/pdf/1804.05388v2.pdf | |
PWC | https://paperswithcode.com/paper/introducing-two-vietnamese-datasets-for |
Repo | |
Framework | |
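Datasets like ViSim-400 are typically used by correlating model similarities with the human ratings. A sketch of that standard evaluation with random stand-in embeddings and made-up pairs and ratings, shown only to illustrate the mechanics.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
words = ["mèo", "chó", "bàn", "ghế"]             # toy Vietnamese words
vecs = {w: rng.normal(size=50) for w in words}   # stand-in embeddings

# Made-up (word1, word2, human rating) triples in the ViSim-400 style.
pairs = [("mèo", "chó", 5.2), ("bàn", "ghế", 4.8), ("mèo", "bàn", 1.1)]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

model_sims = [cosine(vecs[a], vecs[b]) for a, b, _ in pairs]
human_sims = [r for _, _, r in pairs]
rho, _ = spearmanr(model_sims, human_sims)
print(f"Spearman correlation with human ratings: {rho:.2f}")
```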