Paper Group ANR 893
Select, Attend, and Transfer: Light, Learnable Skip Connections
Title | Select, Attend, and Transfer: Light, Learnable Skip Connections |
Authors | Saeid Asgari Taghanaki, Aicha Bentaieb, Anmol Sharma, S. Kevin Zhou, Yefeng Zheng, Bogdan Georgescu, Puneet Sharma, Sasa Grbic, Zhoubing Xu, Dorin Comaniciu, Ghassan Hamarneh |
Abstract | Skip connections in deep networks have improved both segmentation and classification performance by facilitating the training of deeper network architectures and reducing the risk of vanishing gradients. They equip encoder-decoder-like networks with richer feature representations, but at the cost of higher memory usage and computation, and possibly the transfer of non-discriminative feature maps. In this paper, we focus on improving the skip connections used in segmentation networks (e.g., the U-Net, V-Net, and The One Hundred Layers Tiramisu (DenseNet) architectures). We propose light, learnable skip connections which learn first to select the most discriminative channels and then to attend to the most discriminative regions of the selected feature maps. The output of the proposed skip connections is a single feature map, which not only substantially reduces memory usage and the number of network parameters but also improves segmentation accuracy. We evaluate the proposed method on three different 2D and volumetric datasets and demonstrate that the proposed light, learnable skip connections outperform traditional heavy skip connections in terms of segmentation accuracy, memory usage, and number of network parameters. |
Tasks | |
Published | 2018-04-14 |
URL | http://arxiv.org/abs/1804.05181v3 |
http://arxiv.org/pdf/1804.05181v3.pdf | |
PWC | https://paperswithcode.com/paper/select-attend-and-transfer-light-learnable |
Repo | |
Framework | |
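A minimal sketch of how such a "select, attend, transfer" skip connection could look, based only on the abstract: a squeeze-and-excitation-style channel gate selects discriminative channels, a single-channel spatial map attends to discriminative regions, and the result is collapsed to one feature map. Every module name and design detail below is our assumption, not the authors' released code.

```python
# Hedged PyTorch sketch of a light, learnable skip connection: channel
# selection (gating), spatial attention, and a single transferred map.
import torch
import torch.nn as nn

class SelectAttendSkip(nn.Module):
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        # "Select": squeeze-and-excitation-style channel gate.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # "Attend": collapse the gated maps to one spatial attention map.
        self.spatial_attend = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        selected = x * self.channel_gate(x)        # re-weight channels
        attention = self.spatial_attend(selected)  # (N, 1, H, W)
        # "Transfer": hand the decoder one attended map rather than all
        # encoder channels, which is where the memory saving comes from.
        return (selected * attention).mean(dim=1, keepdim=True)

x = torch.randn(2, 64, 32, 32)
print(SelectAttendSkip(64)(x).shape)  # torch.Size([2, 1, 32, 32])
```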
Semi-parametric Topological Memory for Navigation
Title | Semi-parametric Topological Memory for Navigation |
Authors | Nikolay Savinov, Alexey Dosovitskiy, Vladlen Koltun |
Abstract | We introduce a new memory architecture for navigation in previously unseen environments, inspired by landmark-based navigation in animals. The proposed semi-parametric topological memory (SPTM) consists of a (non-parametric) graph with nodes corresponding to locations in the environment and a (parametric) deep network capable of retrieving nodes from the graph based on observations. The graph stores no metric information, only connectivity of locations corresponding to the nodes. We use SPTM as a planning module in a navigation system. Given only 5 minutes of footage of a previously unseen maze, an SPTM-based navigation agent can build a topological map of the environment and use it to confidently navigate towards goals. The average success rate of the SPTM agent in goal-directed navigation across test environments is higher than the best-performing baseline by a factor of three. A video of the agent is available at https://youtu.be/vRF7f4lhswo |
Tasks | |
Published | 2018-03-01 |
URL | http://arxiv.org/abs/1803.00653v1 |
http://arxiv.org/pdf/1803.00653v1.pdf | |
PWC | https://paperswithcode.com/paper/semi-parametric-topological-memory-for |
Repo | |
Framework | |
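A toy sketch of the SPTM idea as we read the abstract: a non-parametric graph whose nodes are observations, edges from temporal adjacency plus shortcuts proposed by a (here stubbed) retrieval network, and planning by shortest path over connectivity alone. The 1-D "observations" and all names are illustrative assumptions.

```python
# Toy sketch of semi-parametric topological memory: a non-parametric graph
# over observations plus a stubbed similarity ("retrieval") network for
# localization and shortcut edges. Illustrative assumptions throughout.
import networkx as nx

def similarity(obs_a: float, obs_b: float) -> float:
    """Stand-in for the parametric retrieval network R(o_i, o_j)."""
    return 1.0 / (1.0 + abs(obs_a - obs_b))  # toy 1-D observations

def build_memory(trajectory, link_threshold: float = 0.9) -> nx.Graph:
    g = nx.Graph()
    for i, obs in enumerate(trajectory):
        g.add_node(i, obs=obs)
        if i > 0:
            g.add_edge(i - 1, i)  # temporal adjacency; no metric info stored
    # Visual shortcuts: connect non-adjacent nodes the network deems similar.
    for i in g.nodes:
        for j in g.nodes:
            if j > i + 1 and similarity(g.nodes[i]["obs"],
                                        g.nodes[j]["obs"]) > link_threshold:
                g.add_edge(i, j)
    return g

def plan(g: nx.Graph, current_obs: float, goal_obs: float):
    locate = lambda o: max(g.nodes, key=lambda n: similarity(g.nodes[n]["obs"], o))
    return nx.shortest_path(g, locate(current_obs), locate(goal_obs))

memory = build_memory([0.0, 1.0, 2.0, 3.0, 2.9, 1.1])
print(plan(memory, current_obs=0.1, goal_obs=3.0))  # [0, 1, 2, 3]
```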
Practical Algorithms for STV and Ranked Pairs with Parallel Universes Tiebreaking
Title | Practical Algorithms for STV and Ranked Pairs with Parallel Universes Tiebreaking |
Authors | Jun Wang, Sujoy Sikdar, Tyler Shepherd, Zhibing Zhao, Chunheng Jiang, Lirong Xia |
Abstract | STV and ranked pairs (RP) are two well-studied voting rules for group decision-making. They proceed in multiple rounds, and their outcomes are affected by how ties are broken in each round. However, the literature is surprisingly vague about how ties should be broken. We propose the first algorithms for computing the set of alternatives that are winners under some tiebreaking mechanism under STV and RP, which is also known as parallel-universes tiebreaking (PUT). Unfortunately, PUT-winners are NP-complete to compute under both STV and RP, and standard search algorithms from AI do not apply. We propose multiple DFS-based algorithms with pruning strategies, using machine learning to build heuristics that prioritize the search direction and significantly improve performance. We also propose novel ILP formulations for PUT-winners under STV and RP, respectively. Experiments on synthetic and real-world data show that our algorithms are overall significantly faster than ILP, although there are a few cases where ILP is significantly faster for RP. |
Tasks | Decision Making |
Published | 2018-05-17 |
URL | http://arxiv.org/abs/1805.06992v1 |
http://arxiv.org/pdf/1805.06992v1.pdf | |
PWC | https://paperswithcode.com/paper/practical-algorithms-for-stv-and-ranked-pairs |
Repo | |
Framework | |
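The parallel-universes computation lends itself to a compact recursive sketch. Below is a bare DFS for PUT-STV following our reading of the abstract, without the paper's pruning strategies or machine-learned heuristics: whenever several candidates tie for elimination, branch on each choice and collect every candidate that wins in some universe.

```python
# Bare DFS over "parallel universes" for PUT-STV (no pruning, no learned
# heuristics). Illustrative only; the paper's algorithms are far faster.
from collections import Counter

def put_stv_winners(profile, candidates):
    winners = set()

    def dfs(remaining):
        if len(remaining) == 1:
            winners.update(remaining)
            return
        # Plurality tally over the remaining candidates.
        tally = Counter({c: 0 for c in remaining})
        for ranking in profile:
            top = next(c for c in ranking if c in remaining)
            tally[top] += 1
        fewest = min(tally.values())
        # Parallel-universes tiebreaking: branch on every tied loser.
        for loser in [c for c in remaining if tally[c] == fewest]:
            dfs(remaining - {loser})

    dfs(frozenset(candidates))
    return winners

# On a Condorcet cycle every candidate wins in some universe.
profile = [("a", "b", "c"), ("b", "c", "a"), ("c", "a", "b")]
print(sorted(put_stv_winners(profile, {"a", "b", "c"})))  # ['a', 'b', 'c']
```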
Visual Saliency Maps Can Apply to Facial Expression Recognition
Title | Visual Saliency Maps Can Apply to Facial Expression Recognition |
Authors | Zhenyue Qin, Jie Wu |
Abstract | Human eyes concentrate on different facial regions during distinct cognitive activities. We study utilising facial visual saliency maps to classify facial expressions into different emotions. Our results show that our novel method of using only facial saliency maps can achieve a decent accuracy of 65%, much higher than the chance level of $1/7$. Furthermore, our approach is semi-supervised: the facial saliency maps are generated from a general saliency prediction algorithm that is not explicitly designed for face images. We also discovered that the per-class classification accuracies using saliency maps demonstrate a strong positive correlation with the accuracies produced by face images. Our work implies that humans may look at different facial areas in order to perceive different emotions. |
Tasks | Facial Expression Recognition, Saliency Prediction |
Published | 2018-11-12 |
URL | http://arxiv.org/abs/1811.04544v1 |
http://arxiv.org/pdf/1811.04544v1.pdf | |
PWC | https://paperswithcode.com/paper/visual-saliency-maps-can-apply-to-facial |
Repo | |
Framework | |
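As a hedged illustration of the pipeline we understand from the abstract, the sketch below feeds saliency maps from a generic, face-agnostic predictor (stubbed here with a gradient-magnitude stand-in) to an ordinary classifier over seven emotion classes; the classifier choice and toy data are our assumptions, not the paper's setup.

```python
# Hedged sketch: generic saliency maps -> flat features -> 7-way emotion
# classifier. The saliency model is a stub; the paper uses an off-the-shelf
# general saliency algorithm.
import numpy as np
from sklearn.svm import SVC  # assumption: the paper's classifier may differ

def general_saliency(image: np.ndarray) -> np.ndarray:
    """Stand-in for a pretrained general-purpose saliency predictor."""
    g = np.abs(np.gradient(image.astype(float))).sum(axis=0)
    return g / (g.max() + 1e-8)

rng = np.random.default_rng(0)
images = rng.random((100, 48, 48))      # toy "face" images
labels = rng.integers(0, 7, size=100)   # 7 emotion classes

X = np.stack([general_saliency(im).ravel() for im in images])
clf = SVC().fit(X[:80], labels[:80])
print("accuracy:", (clf.predict(X[80:]) == labels[80:]).mean())
```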
In-the-wild Facial Expression Recognition in Extreme Poses
Title | In-the-wild Facial Expression Recognition in Extreme Poses |
Authors | Fei Yang, Qian Zhang, Chi Zheng, Guoping Qiu |
Abstract | Facial expression recognition is an active research problem in computer vision. In recent years, research has moved from the lab environment to in-the-wild circumstances, which are challenging, especially under extreme poses. Current expression recognition systems try to avoid pose effects in order to be generally applicable. In this work, we take the opposite approach: we consider head poses explicitly and detect expressions within specific head-pose classes. Our work has two parts: detecting the head pose and grouping it into one of several pre-defined head-pose classes, and then recognizing facial expressions within each pose class. Our experiments show that recognition with pose-class grouping is much better than direct recognition that does not consider poses. We combine hand-crafted features (SIFT, LBP, and geometric features) with deep learning features to represent the expressions; the hand-crafted features are fed into the deep learning framework alongside the high-level deep features. For comparison, we implement SVM and random forest as the prediction models. To train and test our methodology, we labeled a face dataset with the 6 basic expressions. |
Tasks | Facial Expression Recognition |
Published | 2018-11-06 |
URL | http://arxiv.org/abs/1811.02194v1 |
http://arxiv.org/pdf/1811.02194v1.pdf | |
PWC | https://paperswithcode.com/paper/in-the-wild-facial-expression-recognition-in |
Repo | |
Framework | |
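The two-stage structure is easy to make concrete. The sketch below first assigns a face to a pre-defined head-pose class and then applies a per-pose expression classifier; the pose classes, random-forest models, and toy features are stand-ins of our own, not the paper's configuration.

```python
# Hedged sketch of pose-grouped facial expression recognition: stage 1
# predicts the head-pose class, stage 2 uses a classifier trained only on
# that pose group. All models and data are placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

POSES = ["frontal", "profile_left", "profile_right"]  # hypothetical groups

rng = np.random.default_rng(1)
feats = rng.random((300, 128))               # e.g. SIFT/LBP + deep features
pose_ids = rng.integers(0, len(POSES), 300)  # ground-truth pose groups
emotions = rng.integers(0, 6, 300)           # 6 basic expressions

pose_clf = RandomForestClassifier().fit(feats, pose_ids)
expr_clfs = {
    p: RandomForestClassifier().fit(feats[pose_ids == p], emotions[pose_ids == p])
    for p in range(len(POSES))
}

def predict_expression(x: np.ndarray) -> int:
    pose = int(pose_clf.predict(x[None])[0])          # stage 1: pose grouping
    return int(expr_clfs[pose].predict(x[None])[0])   # stage 2: per-pose FER

print(predict_expression(feats[0]))
```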
Cross-database non-frontal facial expression recognition based on transductive deep transfer learning
Title | Cross-database non-frontal facial expression recognition based on transductive deep transfer learning |
Authors | Keyu Yan, Wenming Zheng, Tong Zhang, Yuan Zong, Zhen Cui |
Abstract | Cross-database non-frontal expression recognition is a meaningful but rather difficult problem in computer vision and affective computing. In this paper, we propose a novel transductive deep transfer learning architecture based on the widely used VGGface16-Net. In this framework, VGGface16-Net is used to jointly learn a common optimal nonlinear discriminative feature representation from the non-frontal facial expression samples of the source and target databases, and we then design a novel transductive transfer layer to handle the cross-database non-frontal facial expression classification task. To validate the performance of the proposed transductive deep transfer learning network, we present extensive cross-database experiments on two well-known facial expression databases, namely BU-3DFE and Multi-PIE. The experimental results show that our transductive deep transfer network outperforms state-of-the-art cross-database facial expression recognition methods. |
Tasks | Facial Expression Recognition, Transfer Learning |
Published | 2018-11-30 |
URL | http://arxiv.org/abs/1811.12774v1 |
http://arxiv.org/pdf/1811.12774v1.pdf | |
PWC | https://paperswithcode.com/paper/cross-database-non-frontal-facial-expression |
Repo | |
Framework | |
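The abstract does not spell out the transductive transfer layer, so the sketch below only illustrates the generic ingredient we can infer: a shared backbone whose features are aligned across the source and target databases. We substitute a simple linear-kernel MMD penalty for the paper's transfer layer; the backbone, loss weight, and data are placeholders.

```python
# Hedged sketch of cross-database feature alignment: shared backbone,
# source classification loss, plus an MMD term (our substitution) pulling
# source and target feature distributions together.
import torch

def mmd(source: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Maximum mean discrepancy with a linear kernel (simplest variant)."""
    return (source.mean(dim=0) - target.mean(dim=0)).pow(2).sum()

backbone = torch.nn.Sequential(              # stand-in for VGGface16-Net
    torch.nn.Linear(512, 256), torch.nn.ReLU(), torch.nn.Linear(256, 64)
)
classifier = torch.nn.Linear(64, 6)          # 6 expression classes

src_x, src_y = torch.randn(32, 512), torch.randint(0, 6, (32,))
tgt_x = torch.randn(32, 512)                 # unlabeled target database

src_f, tgt_f = backbone(src_x), backbone(tgt_x)
loss = torch.nn.functional.cross_entropy(classifier(src_f), src_y) \
     + 0.1 * mmd(src_f, tgt_f)               # alignment weight assumed
loss.backward()
print(float(loss))
```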
Learning Software Constraints via Installation Attempts
Title | Learning Software Constraints via Installation Attempts |
Authors | Ran Ben Basat, Maayan Goldstein, Itai Segall |
Abstract | Modern software systems are expected to be secure and contain all the latest features, even when new versions of software are released multiple times an hour. Each system may include many interacting packages. The problem of installing multiple dependent packages has been extensively studied in the past, yielding some promising solutions that work well in practice. However, these assume that the developers declare all the dependencies and conflicts between the packages. Oftentimes, the entire repository structure may not be known upfront, for example when packages are developed by different vendors. In this paper, we present algorithms for learning dependencies, conflicts and defective packages from installation attempts. Our algorithms use combinatorial data structures to generate queries that test installations and discover the entire dependency structure. A query that the algorithms make corresponds to trying to install a subset of packages and getting a Boolean feedback on whether all constraints were satisfied in this subset. Our goal is to minimize the query complexity of the algorithms. We prove lower and upper bounds on the number of queries that these algorithms require to make for different settings of the problem. |
Tasks | |
Published | 2018-04-24 |
URL | http://arxiv.org/abs/1804.08902v2 |
http://arxiv.org/pdf/1804.08902v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-software-constraints-via |
Repo | |
Framework | |
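The query model is the crux here: a query installs a subset of packages and returns a Boolean "all constraints satisfied". As one simple illustration of learning from such queries (not the paper's full algorithm), the sketch below binary-searches for the single hidden dependency of a target package in O(log n) queries.

```python
# Toy version of the installation-attempt query model. The learner only
# sees Boolean feedback from install_query; the hidden constraint set is
# invented for illustration.
HIDDEN_DEPENDENCY = {"app": "libssl"}  # unknown to the learner

def install_query(subset: set) -> bool:
    """Oracle: does this subset of packages install cleanly?"""
    return all(dep in subset for pkg, dep in HIDDEN_DEPENDENCY.items()
               if pkg in subset)

def find_single_dependency(target: str, universe: list) -> str:
    """O(log n) queries when `target` has exactly one dependency."""
    candidates = [p for p in universe if p != target]
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if install_query({target, *half}):   # success => dependency in half
            candidates = half
        else:                                # failure => it is in the rest
            candidates = candidates[len(candidates) // 2:]
    return candidates[0]

universe = ["app", "libssl", "libxml", "zlib", "curl"]
print(find_single_dependency("app", universe))  # libssl
```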
Spectral reflectance estimation from one RGB image using self-interreflections in a concave object
Title | Spectral reflectance estimation from one RGB image using self-interreflections in a concave object |
Authors | Rada Deeb, Damien Muselet, Mathieu Hebert, Alain Tremeau |
Abstract | Light interreflections occurring in a concave object generate a color gradient which is characteristic of the object’s spectral reflectance. In this paper, we use this property in order to estimate the spectral reflectance of matte, uniformly colored, V-shaped surfaces from a single RGB image taken under directional lighting. First, simulations show that using one image of the concave object is equivalent to, and can even outperform, the state of the art approaches based on three images taken under three lightings with different colors. Experiments on real images of folded papers were performed under unmeasured direct sunlight. The results show that our interreflection-based approach outperforms existing approaches even when the latter are improved by a calibration step. The mathematical solution for the interreflection equation and the effect of surface parameters on the performance of the method are also discussed in this paper. |
Tasks | Calibration |
Published | 2018-03-05 |
URL | http://arxiv.org/abs/1803.01595v1 |
http://arxiv.org/pdf/1803.01595v1.pdf | |
PWC | https://paperswithcode.com/paper/spectral-reflectance-estimation-from-one-rgb |
Repo | |
Framework | |
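The abstract mentions the mathematical solution of the interreflection equation without reproducing it; for orientation, the standard Lambertian radiosity form on which such single-image methods are typically built is sketched below (our assumption, not a quote from the paper). The characteristic color gradient arises because the camera response depends nonlinearly on the reflectance $r(\lambda)$ through the $(I - r(\lambda)F)^{-1}$ term.

```latex
% Standard Lambertian interreflection (radiosity) model per wavelength;
% our assumption of the underlying form, not the paper's equation.
% r(\lambda): spectral reflectance, E(\lambda): direct irradiance,
% F: form-factor matrix of the V-shaped surface, s_k: camera sensitivities.
\begin{align}
  B(\lambda) &= r(\lambda)\,E(\lambda) + r(\lambda)\,F\,B(\lambda)
  \;\;\Longrightarrow\;\;
  B(\lambda) = \bigl(I - r(\lambda)\,F\bigr)^{-1} r(\lambda)\,E(\lambda), \\
  \rho_k &= \int s_k(\lambda)\, B(\lambda)\,\mathrm{d}\lambda,
  \qquad k \in \{R, G, B\}.
\end{align}
```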
Review of Visual Saliency Detection with Comprehensive Information
Title | Review of Visual Saliency Detection with Comprehensive Information |
Authors | Runmin Cong, Jianjun Lei, Huazhu Fu, Ming-Ming Cheng, Weisi Lin, Qingming Huang |
Abstract | Visual saliency detection models simulate how the human visual system perceives a scene, and have been widely used in many vision tasks. With the development of acquisition technology, more comprehensive information, such as depth cues, inter-image correspondence, or temporal relationships, is available to extend image saliency detection to RGBD saliency detection, co-saliency detection, or video saliency detection. RGBD saliency detection models focus on extracting salient regions from RGBD images by incorporating depth information. Co-saliency detection models introduce an inter-image correspondence constraint to discover the common salient object in an image group. The goal of video saliency detection models is to locate motion-related salient objects in video sequences, considering motion cues and spatiotemporal constraints jointly. In this paper, we review the different types of saliency detection algorithms, summarize the important issues of existing methods, and discuss open problems and future work. Moreover, the evaluation datasets and quantitative measurements are briefly introduced, and an experimental analysis and discussion are conducted to provide a holistic overview of the different saliency detection methods. |
Tasks | Co-Saliency Detection, Saliency Detection, Video Saliency Detection |
Published | 2018-03-09 |
URL | http://arxiv.org/abs/1803.03391v2 |
http://arxiv.org/pdf/1803.03391v2.pdf | |
PWC | https://paperswithcode.com/paper/review-of-visual-saliency-detection-with |
Repo | |
Framework | |
SimplerVoice: A Key Message & Visual Description Generator System for Illiteracy
Title | SimplerVoice: A Key Message & Visual Description Generator System for Illiteracy |
Authors | Minh N. B. Nguyen, Samuel Thomas, Anne E. Gattiker, Sujatha Kashyap, Kush R. Varshney |
Abstract | We introduce SimplerVoice: a key message and visual description generator system to help low-literate adults navigate the information-dense world with confidence, on their own. SimplerVoice can automatically generate sensible sentences describing an unknown object, extract semantic meanings of the object's usage in the form of a query string, and then represent the string as multiple types of visual guidance (pictures, pictographs, etc.). We demonstrate the SimplerVoice system in a case study of generating manuals for grocery products through a mobile application. In a user study, we compared SimplerVoice's generated descriptions to the information users interpreted from other sources, namely the original product package and top search-engine results; SimplerVoice achieved the highest performance score: 4.82 on a 5-point mean opinion score scale. Our results show that SimplerVoice provides low-literate end-users with simple yet informative components that help them understand how to use grocery products, and that the system may also provide benefits in other real-world use cases. |
Tasks | |
Published | 2018-11-03 |
URL | http://arxiv.org/abs/1811.01299v1 |
http://arxiv.org/pdf/1811.01299v1.pdf | |
PWC | https://paperswithcode.com/paper/simplervoice-a-key-message-visual-description |
Repo | |
Framework | |
Perceptually Optimized Generative Adversarial Network for Single Image Dehazing
Title | Perceptually Optimized Generative Adversarial Network for Single Image Dehazing |
Authors | Yixin Du, Xin Li |
Abstract | Existing approaches to single image dehazing, both model-based and learning-based, rely heavily on the estimation of so-called transmission maps. Despite its conceptual simplicity, using transmission maps as an intermediate step often makes it more difficult to optimize the perceptual quality of reconstructed images. To overcome this weakness, we propose a direct deep learning approach to image dehazing that bypasses transmission-map estimation and facilitates end-to-end perceptual optimization. Our technical contributions are three-fold. First, based on the analogy between dehazing and denoising, we propose to directly learn a nonlinear mapping from the space of degraded images to that of haze-free ones via recursive deep residual learning. Second, inspired by the success of generative adversarial networks (GANs), we propose to optimize the perceptual quality of dehazed images by introducing a discriminator and a loss function adaptive to hazy conditions. Third, we propose to remove the notorious halo-like artifacts at large scene-depth discontinuities by a novel application of guided filtering. Extensive experimental results show that the subjective quality of images dehazed by the proposed perceptually optimized GAN (POGAN) is often more favorable than that of existing state-of-the-art approaches, especially when hazy conditions vary. |
Tasks | Denoising, Image Dehazing, Single Image Dehazing |
Published | 2018-05-03 |
URL | http://arxiv.org/abs/1805.01084v1 |
http://arxiv.org/pdf/1805.01084v1.pdf | |
PWC | https://paperswithcode.com/paper/perceptually-optimized-generative-adversarial |
Repo | |
Framework | |
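Bypassing the transmission map means the generator is trained directly against a clean target plus an adversarial signal. The sketch below shows that loss structure under our reading of the abstract; the toy generator, critic, L1 stand-in for the perceptual term, and the weight 10.0 are all placeholders, not the authors' choices.

```python
# Hedged sketch of a direct hazy->clean generator trained with an
# adversarial term plus a pixel-level term standing in for the paper's
# perceptual loss. Architectures and weights are placeholders.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(16, 3, 3, padding=1))          # toy generator
D = nn.Sequential(nn.Conv2d(3, 8, 3, stride=2), nn.ReLU(),
                  nn.Flatten(), nn.LazyLinear(1))           # toy critic

hazy = torch.rand(4, 3, 32, 32)
clean = torch.rand(4, 3, 32, 32)

dehazed = G(hazy)                                  # no transmission map
bce = nn.functional.binary_cross_entropy_with_logits
g_loss = bce(D(dehazed), torch.ones(4, 1)) \
       + 10.0 * nn.functional.l1_loss(dehazed, clean)  # perceptual stand-in
g_loss.backward()
print(float(g_loss))
```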
Semantic Single-Image Dehazing
Title | Semantic Single-Image Dehazing |
Authors | Ziang Cheng, Shaodi You, Viorela Ila, Hongdong Li |
Abstract | Single-image haze removal is challenging due to the limited information contained in a single image. Previous solutions largely rely on handcrafted priors to compensate for this deficiency. Recent convolutional neural network (CNN) models have been used to learn haze-related priors, but they ultimately work as advanced image filters. In this paper we propose a novel semantic approach to single image haze removal. Unlike existing methods, we infer color priors based on extracted semantic features. We argue that semantic context can be exploited to give informative cues for (a) learning a color prior on the clean image and (b) estimating the ambient illumination. This design allows our model to recover clean images in challenging cases with strong ambiguity, e.g. saturated illumination colors and sky regions. In experiments, we validate our approach on synthetic and real hazy images, where our method shows superior performance over state-of-the-art approaches, suggesting that semantic information facilitates the haze removal task. |
Tasks | Image Dehazing, Single Image Dehazing, Single Image Haze Removal |
Published | 2018-04-16 |
URL | http://arxiv.org/abs/1804.05624v2 |
http://arxiv.org/pdf/1804.05624v2.pdf | |
PWC | https://paperswithcode.com/paper/semantic-single-image-dehazing |
Repo | |
Framework | |
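A minimal sketch of the fusion idea we take from the abstract: semantic features (here, the output of a stand-in segmentation head) are concatenated with the hazy image so the dehazing network can condition its color prior and illumination estimate on scene semantics. All modules below are placeholders of our own.

```python
# Hedged sketch of semantics-conditioned dehazing: fuse semantic features
# with the hazy input before predicting the clean image.
import torch
import torch.nn as nn

seg_head = nn.Conv2d(3, 8, 3, padding=1)       # stand-in semantic extractor
dehaze_net = nn.Sequential(
    nn.Conv2d(3 + 8, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 3, 3, padding=1),            # predicts the clean image
)

hazy = torch.rand(2, 3, 64, 64)
semantics = seg_head(hazy)                     # semantic context features
clean_pred = dehaze_net(torch.cat([hazy, semantics], dim=1))
print(clean_pred.shape)  # torch.Size([2, 3, 64, 64])
```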
Dynamic Prediction Length for Time Series with Sequence to Sequence Networks
Title | Dynamic Prediction Length for Time Series with Sequence to Sequence Networks |
Authors | Mark Harmon, Diego Klabjan |
Abstract | Recurrent neural networks and sequence to sequence models require a predetermined prediction output length. Our model addresses this by allowing the network to predict a variable-length output at inference time. A new loss function with a tailored gradient computation is developed that trades off prediction accuracy against output length. The model uses a function that determines, given a predetermined threshold, whether the output at a particular time step should be evaluated. We evaluate the model on the problem of predicting the prices of securities. We find that the model makes longer predictions for more stable securities and that it naturally balances prediction accuracy and length. |
Tasks | Time Series |
Published | 2018-07-02 |
URL | https://arxiv.org/abs/1807.00425v2 |
https://arxiv.org/pdf/1807.00425v2.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-prediction-length-for-time-series |
Repo | |
Framework | |
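One way to read the loss described above: each output step gets a "keep predicting" score, steps whose score clears a threshold are evaluated against the target, and a penalty term discourages overly long outputs. The sketch below implements that reading with a soft surrogate; the paper's tailored gradient computation is more refined than this (our assumption throughout).

```python
# Hedged sketch of an accuracy-vs-length loss with thresholded evaluation.
import torch

preds = torch.randn(8, 20, requires_grad=True)   # batch x max horizon
scores = torch.sigmoid(torch.randn(8, 20, requires_grad=True))
target = torch.randn(8, 20)
threshold, length_weight = 0.5, 0.01             # assumed hyperparameters

mask = (scores > threshold).float()              # which steps are evaluated
# Soft surrogate so length still receives gradient through `scores`.
accuracy_term = ((preds - target) ** 2 * mask * scores).sum() / mask.sum()
length_term = length_weight * scores.sum(dim=1).mean()
loss = accuracy_term + length_term               # accuracy vs. length
loss.backward()
print(float(loss), float(mask.sum(dim=1).mean()))  # loss, avg pred length
```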
Optimizing Heuristics for Tableau-based OWL Reasoners
Title | Optimizing Heuristics for Tableau-based OWL Reasoners |
Authors | Razieh Mehri, Volker Haarslev, Hamidreza Chinaei |
Abstract | Optimization techniques play a significant role in improving description logic reasoners covering the Web Ontology Language (OWL). These techniques are essential to speeding up these reasoners, and many of them are based on heuristic choices; optimal heuristic selection makes them more effective. The FaCT++ OWL reasoner and its Java version JFact implement an optimization technique called the ToDo list, a substitute for the traditional top-down approach in tableau-based reasoners. The ToDo list mechanism allows one to arrange the order in which different rules are applied by giving each a priority. Compared to a top-down approach, the ToDo list technique has better control over the application of expansion rules. Learning the proper heuristic order for applying rules in the ToDo list has a great impact on reasoning speed. We use a binary SVM technique to build our learning model, which can choose ontology-specific order sets to speed up OWL reasoning. On average, our learning approach, tested on 40 selected ontologies, achieves a speedup of two orders of magnitude compared to the worst rule-ordering choice. |
Tasks | |
Published | 2018-10-15 |
URL | http://arxiv.org/abs/1810.06617v2 |
http://arxiv.org/pdf/1810.06617v2.pdf | |
PWC | https://paperswithcode.com/paper/optimizing-heuristics-for-tableau-based-owl |
Repo | |
Framework | |
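The learning step reduces to ordinary binary classification. The sketch below trains an SVM that, given simple ontology features, predicts which of two candidate ToDo-list rule orderings will reason faster; the feature set, orderings, and data are invented for illustration, not taken from the paper.

```python
# Hedged sketch of SVM-based rule-ordering selection for a tableau reasoner.
import numpy as np
from sklearn.svm import SVC

ORDERINGS = ["AND<OR<EXISTS", "EXISTS<AND<OR"]  # hypothetical priorities

rng = np.random.default_rng(2)
ontology_feats = rng.random((60, 5))    # e.g. axiom counts, nesting depth
faster = (ontology_feats[:, 0] > 0.5).astype(int)  # toy ground truth

model = SVC(kernel="rbf").fit(ontology_feats, faster)

new_ontology = rng.random((1, 5))
print("use ordering:", ORDERINGS[int(model.predict(new_ontology)[0])])
```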
A Simple Probabilistic Model for Uncertainty Estimation
Title | A Simple Probabilistic Model for Uncertainty Estimation |
Authors | Alexander Kuvaev, Roman Khudorozhkov |
Abstract | The article focuses on determining the predictive uncertainty of a model, using the atrial fibrillation detection problem on single-lead ECG signals as an example. To this end, the model predicts the parameters of a beta distribution over class probabilities instead of the probabilities themselves. It is shown that the described approach makes it possible to detect atypical recordings and significantly improves the quality of the algorithm on confident predictions. |
Tasks | Atrial Fibrillation Detection |
Published | 2018-07-24 |
URL | http://arxiv.org/abs/1807.09312v1 |
http://arxiv.org/pdf/1807.09312v1.pdf | |
PWC | https://paperswithcode.com/paper/a-simple-probabilistic-model-for-uncertainty |
Repo | |
Framework | |
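A small sketch of the idea: the network outputs Beta parameters (alpha, beta) over the positive-class probability instead of the probability itself, is trained by negative log-likelihood, and flags atypical recordings by the width of the predicted distribution. The architecture and feature dimensionality below are placeholders of our own.

```python
# Hedged sketch: predict Beta(alpha, beta) over the class probability and
# train with negative log-likelihood; wide posteriors flag atypical inputs.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))

x = torch.randn(8, 16)                     # toy ECG-derived features
y = torch.randint(0, 2, (8,)).float()      # atrial fibrillation labels

alpha_beta = nn.functional.softplus(net(x)) + 1e-3   # keep params positive
dist = torch.distributions.Beta(alpha_beta[:, 0], alpha_beta[:, 1])
p = y * 0.999 + (1 - y) * 0.001            # clamp labels into (0, 1)
loss = -dist.log_prob(p).mean()            # NLL of Beta over class prob.
loss.backward()

uncertainty = dist.variance                # large variance => atypical
print(float(loss), uncertainty.detach().numpy().round(3))
```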