Paper Group ANR 448
Style Transfer as Unsupervised Machine Translation
Title | Style Transfer as Unsupervised Machine Translation |
Authors | Zhirui Zhang, Shuo Ren, Shujie Liu, Jianyong Wang, Peng Chen, Mu Li, Ming Zhou, Enhong Chen |
Abstract | Language style transfer rephrases text with specific stylistic attributes while preserving the original attribute-independent content. One main challenge in learning a style transfer system is the lack of parallel data where the source sentence is in one style and the target sentence in another style. With this constraint, in this paper, we adapt unsupervised machine translation methods for the task of automatic style transfer. We first take advantage of style-preference information and word embedding similarity to produce pseudo-parallel data with a statistical machine translation (SMT) framework. Then the iterative back-translation approach is employed to jointly train two neural machine translation (NMT) based transfer systems. To control the noise generated during joint training, a style classifier is introduced to guarantee the accuracy of style transfer and penalize bad candidates in the generated pseudo data. Experiments on benchmark datasets show that our proposed method outperforms previous state-of-the-art models in terms of both accuracy of style transfer and quality of input-output correspondence. |
Tasks | Machine Translation, Style Transfer, Unsupervised Machine Translation |
Published | 2018-08-23 |
URL | http://arxiv.org/abs/1808.07894v1 |
http://arxiv.org/pdf/1808.07894v1.pdf | |
PWC | https://paperswithcode.com/paper/style-transfer-as-unsupervised-machine |
Repo | |
Framework | |
Structured Bayesian Gaussian process latent variable model
Title | Structured Bayesian Gaussian process latent variable model |
Authors | Steven Atkinson, Nicholas Zabaras |
Abstract | We introduce a Bayesian Gaussian process latent variable model that explicitly captures spatial correlations in data using a parameterized spatial kernel and leveraging structure-exploiting algebra on the model covariance matrices for computational tractability. Inference is made tractable through a collapsed variational bound with similar computational complexity to that of the traditional Bayesian GP-LVM. Inference over partially-observed test cases is achieved by optimizing a “partially-collapsed” bound. Modeling high-dimensional time series systems is enabled through use of a dynamical GP latent variable prior. Examples imputing missing data on images and super-resolution imputation of missing video frames demonstrate the model. |
Tasks | Imputation, Super-Resolution, Time Series |
Published | 2018-05-22 |
URL | http://arxiv.org/abs/1805.08665v1 |
http://arxiv.org/pdf/1805.08665v1.pdf | |
PWC | https://paperswithcode.com/paper/structured-bayesian-gaussian-process-latent-1 |
Repo | |
Framework | |
Part-based Tracking by Sampling
Title | Part-based Tracking by Sampling |
Authors | George De Ath, Richard M. Everson |
Abstract | We propose a novel part-based method for tracking an arbitrary object in challenging video sequences. The colour distributions of tracked image patches on the target object are represented by pairs of RGB samples and counts of how many pixels in the patch are similar to them. Patches are placed by segmenting the object in the given bounding box and placing patches in homogeneous regions of the object. These are located in subsequent image frames by applying non-shearing affine transformations to the patches’ previous locations, locally optimising the best of these, and evaluating their quality using a modified Bhattacharyya distance. In experiments carried out on the VOT2018 and OTB100 benchmarks, the tracker achieves higher performance than all other part-based trackers. An ablation study is used to reveal the effectiveness of each tracking component, with the largest performance gains found when using the patch placement scheme. |
Tasks | |
Published | 2018-05-22 |
URL | https://arxiv.org/abs/1805.08511v2 |
https://arxiv.org/pdf/1805.08511v2.pdf | |
PWC | https://paperswithcode.com/paper/part-based-tracking-by-sampling |
Repo | |
Framework | |
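The abstract above scores candidate patch locations with a modified Bhattacharyya distance. The modification itself is not described in this listing, so the sketch below shows only the standard Bhattacharyya distance between two histograms, as a point of reference. The function name and the list-of-counts input format are assumptions made for illustration, not the authors' code.

```python
import math


def bhattacharyya_distance(p, q):
    """Standard Bhattacharyya distance between two histograms of equal length.

    Both inputs are lists of non-negative counts or weights; each is
    normalised to sum to 1 before comparison. This is the textbook
    distance, NOT the modified variant used in the paper.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("histograms must have positive total mass")
    # Bhattacharyya coefficient: overlap between the two distributions.
    coefficient = sum(
        math.sqrt((a / total_p) * (b / total_q)) for a, b in zip(p, q)
    )
    coefficient = min(coefficient, 1.0)  # guard against rounding above 1
    if coefficient == 0.0:
        return math.inf  # no overlapping bins at all
    return -math.log(coefficient)
```

Histograms with identical shape give a distance of zero regardless of scale, and histograms with no shared bins give an infinite distance.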
Fast and Flexible Indoor Scene Synthesis via Deep Convolutional Generative Models
Title | Fast and Flexible Indoor Scene Synthesis via Deep Convolutional Generative Models |
Authors | Daniel Ritchie, Kai Wang, Yu-an Lin |
Abstract | We present a new, fast and flexible pipeline for indoor scene synthesis that is based on deep convolutional generative models. Our method operates on a top-down image-based representation, and inserts objects iteratively into the scene by predicting their category, location, orientation and size with separate neural network modules. Our pipeline naturally supports automatic completion of partial scenes, as well as synthesis of complete scenes. Our method is significantly faster than the previous image-based method and generates results that outperform it and other state-of-the-art deep generative scene models in terms of faithfulness to training data and perceived visual quality. |
Tasks | Indoor Scene Synthesis |
Published | 2018-11-29 |
URL | http://arxiv.org/abs/1811.12463v1 |
http://arxiv.org/pdf/1811.12463v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-and-flexible-indoor-scene-synthesis-via |
Repo | |
Framework | |
GEMRank: Global Entity Embedding For Collaborative Filtering
Title | GEMRank: Global Entity Embedding For Collaborative Filtering |
Authors | Arash Khoeini, Bita Shams, Saman Haratizadeh |
Abstract | Recently, word embedding algorithms have been applied to map the entities of recommender systems, such as users and items, to new feature spaces using textual element-context relations among them. Unlike in many other domains, this approach has not achieved the desired performance in collaborative filtering problems, probably due to the unavailability of appropriate textual data. In this paper we propose a new recommendation framework, called GEMRank, that can be applied when the user-item matrix is the sole available source of information. It uses the concept of profile co-occurrence for defining relations among entities and applies a factorization method for embedding the users and items. GEMRank then feeds the extracted representations to a neural network model to predict user-item like/dislike relations, on which the final recommendations are based. We evaluated GEMRank in an extensive set of experiments against state-of-the-art recommendation methods. The results show that GEMRank significantly outperforms the baseline algorithms in a variety of data sets with different degrees of density. |
Tasks | Recommendation Systems |
Published | 2018-11-05 |
URL | http://arxiv.org/abs/1811.01686v1 |
http://arxiv.org/pdf/1811.01686v1.pdf | |
PWC | https://paperswithcode.com/paper/gemrank-global-entity-embedding-for |
Repo | |
Framework | |
Automated detection of vulnerable plaque in intravascular ultrasound images
Title | Automated detection of vulnerable plaque in intravascular ultrasound images |
Authors | Tae Joon Jun, Soo-Jin Kang, June-Goo Lee, Jihoon Kweon, Wonjun Na, Daeyoun Kang, Dohyeun Kim, Daeyoung Kim, Young-Hak Kim |
Abstract | Acute Coronary Syndrome (ACS) is a syndrome caused by a decrease in blood flow in the coronary arteries. ACS is usually related to coronary thrombosis and is primarily caused by plaque rupture followed by plaque erosion and calcified nodule. Thin-cap fibroatheroma (TCFA) is known to be the most similar lesion morphologically to a plaque rupture. In this paper, we propose methods to classify TCFA using various machine learning classifiers including Feed-forward Neural Network (FNN), K-Nearest Neighbor (KNN), Random Forest (RF) and Convolutional Neural Network (CNN) to find the classifier that shows optimal TCFA classification accuracy. In addition, we suggest a pixel-range-based feature extraction method to extract the ratio of pixels in the different regions of interest to reflect the physician’s TCFA discrimination criteria. A total of 12,325 IVUS images were labeled with corresponding OCT images to train and evaluate the classifiers. We achieved 0.884, 0.890, 0.878 and 0.933 Area Under the ROC Curve (AUC) with the FNN, KNN, RF and CNN classifiers, respectively. As a result, the CNN classifier performed best and the top 10 features of the feature-based classifiers (FNN, KNN, RF) were found to be similar to the physician’s TCFA diagnostic criteria. |
Tasks | |
Published | 2018-04-18 |
URL | http://arxiv.org/abs/1804.06817v1 |
http://arxiv.org/pdf/1804.06817v1.pdf | |
PWC | https://paperswithcode.com/paper/automated-detection-of-vulnerable-plaque-in |
Repo | |
Framework | |
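The abstract above compares classifiers by Area Under the ROC Curve (AUC). As a reminder of what that number measures, here is a minimal sketch that computes AUC as the probability that a randomly chosen positive example is scored above a randomly chosen negative one, counting ties as one half. The function name and the toy labels and scores are illustrative assumptions and have no connection to the IVUS data.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 ground-truth values.
    scores: iterable of classifier scores, higher meaning "more positive".
    Runs in O(P * N) time, which is fine for a small illustration.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Toy example: three of the four positive/negative pairs are ordered correctly.
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is the scale on which the reported values of 0.878 to 0.933 should be read.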
A Polynomial Time MCMC Method for Sampling from Continuous DPPs
Title | A Polynomial Time MCMC Method for Sampling from Continuous DPPs |
Authors | Shayan Oveis Gharan, Alireza Rezaei |
Abstract | We study the Gibbs sampling algorithm for continuous determinantal point processes. We show that, given a warm start, the Gibbs sampler generates a random sample from a continuous $k$-DPP defined on a $d$-dimensional domain by only taking $\text{poly}(k)$ number of steps. As an application, we design an algorithm to generate random samples from $k$-DPPs defined by a spherical Gaussian kernel on a unit sphere in $d$-dimensions, $\mathbb{S}^{d-1}$ in time polynomial in $k,d$. |
Tasks | Point Processes |
Published | 2018-10-20 |
URL | http://arxiv.org/abs/1810.08867v1 |
http://arxiv.org/pdf/1810.08867v1.pdf | |
PWC | https://paperswithcode.com/paper/a-polynomial-time-mcmc-method-for-sampling |
Repo | |
Framework | |
Reinforcement Learning and Inverse Reinforcement Learning with System 1 and System 2
Title | Reinforcement Learning and Inverse Reinforcement Learning with System 1 and System 2 |
Authors | Alexander Peysakhovich |
Abstract | Inferring a person’s goal from their behavior is an important problem in applications of AI (e.g. automated assistants, recommender systems). The workhorse model for this task is the rational actor model - this amounts to assuming that people have stable reward functions, discount the future exponentially, and construct optimal plans. Under the rational actor assumption, techniques such as inverse reinforcement learning (IRL) can be used to infer a person’s goals from their actions. A competing model is the dual-system model. Here decisions are the result of an interplay between a fast, automatic, heuristic-based system 1 and a slower, deliberate, calculating system 2. We generalize the dual-system framework to the case of Markov decision problems and show how to compute optimal plans for dual-system agents. We show that dual-system agents exhibit behaviors that are incompatible with the rational actor assumption. We show that naive applications of rational-actor IRL to the behavior of dual-system agents can generate wrong inferences about the agents’ goals and suggest interventions that actually reduce the agent’s overall utility. Finally, we adapt a simple IRL algorithm to correctly infer the goals of dual-system decision-makers. This allows us to make interventions that help, rather than hinder, the dual-system agent’s ability to reach their true goals. |
Tasks | Recommendation Systems |
Published | 2018-11-19 |
URL | http://arxiv.org/abs/1811.08549v2 |
http://arxiv.org/pdf/1811.08549v2.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-and-inverse |
Repo | |
Framework | |
Representation Mixing for TTS Synthesis
Title | Representation Mixing for TTS Synthesis |
Authors | Kyle Kastner, João Felipe Santos, Yoshua Bengio, Aaron Courville |
Abstract | Recent character and phoneme-based parametric TTS systems using deep learning have shown strong performance in natural speech generation. However, the choice between character or phoneme input can create serious limitations for practical deployment, as direct control of pronunciation is crucial in certain cases. We demonstrate a simple method for combining multiple types of linguistic information in a single encoder, named representation mixing, enabling flexible choice between character, phoneme, or mixed representations during inference. Experiments and user studies on a public audiobook corpus show the efficacy of our approach. |
Tasks | |
Published | 2018-11-17 |
URL | http://arxiv.org/abs/1811.07240v2 |
http://arxiv.org/pdf/1811.07240v2.pdf | |
PWC | https://paperswithcode.com/paper/representation-mixing-for-tts-synthesis |
Repo | |
Framework | |
A Learning-based Approach to Joint Content Caching and Recommendation at Base Stations
Title | A Learning-based Approach to Joint Content Caching and Recommendation at Base Stations |
Authors | Dong Liu, Chenyang Yang |
Abstract | Recommendation systems are able to shape user demands, which can be used for boosting caching gain. In this paper, we jointly optimize content caching and recommendation at base stations to maximize the caching gain while not compromising user preference. We first propose a model to capture the impact of recommendation on user demands, which is controlled by a user-specific psychological threshold. We then formulate a joint caching and recommendation problem maximizing the successful offloading probability, which is a mixed integer programming problem. We develop a hierarchical iterative algorithm to solve the problem when the threshold is known. Since the user threshold is unknown in practice, we proceed to propose an $\varepsilon$-greedy algorithm to find the solution by learning the threshold via interactions with users. Simulation results show that the proposed algorithms improve the successful offloading probability compared with prior works with/without recommendation. The $\varepsilon$-greedy algorithm learns the user threshold quickly, and achieves more than $1-\varepsilon$ of the performance obtained by the algorithm with known threshold. |
Tasks | |
Published | 2018-01-22 |
URL | http://arxiv.org/abs/1802.01414v2 |
http://arxiv.org/pdf/1802.01414v2.pdf | |
PWC | https://paperswithcode.com/paper/a-learning-based-approach-to-joint-content |
Repo | |
Framework | |
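The abstract above relies on an $\varepsilon$-greedy algorithm to learn an unknown user threshold through interaction. The paper's specific formulation is not reproduced in this listing, so the sketch below shows only the generic $\varepsilon$-greedy rule on a multi-armed bandit: explore a random option with probability $\varepsilon$, otherwise exploit the option with the best running average. All names, the reward functions and the default parameters are hypothetical.

```python
import random


def epsilon_greedy(reward_fns, epsilon=0.1, steps=1000, seed=0):
    """Generic epsilon-greedy bandit loop (not the paper's caching algorithm).

    reward_fns: list of callables, one per arm; each takes a random.Random
                instance and returns a numeric reward.
    Returns (counts, means): pulls per arm and running mean reward per arm.
    """
    rng = random.Random(seed)
    n_arms = len(reward_fns)
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                      # explore
        else:
            arm = max(range(n_arms), key=lambda i: means[i])  # exploit
        reward = reward_fns[arm](rng)
        counts[arm] += 1
        # Incremental update of the running mean for the chosen arm.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


# Two arms with fixed rewards; the second is better and should dominate.
counts, means = epsilon_greedy([lambda rng: 0.2, lambda rng: 0.8])
```

Because a fraction $\varepsilon$ of steps is spent exploring uniformly, the exploiting steps alone account for roughly $1-\varepsilon$ of the pulls, which is the intuition behind performance guarantees of the form quoted in the abstract.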
Driver Hand Localization and Grasp Analysis: A Vision-based Real-time Approach
Title | Driver Hand Localization and Grasp Analysis: A Vision-based Real-time Approach |
Authors | Siddharth, Akshay Rangesh, Eshed Ohn-Bar, Mohan M. Trivedi |
Abstract | Extracting hand regions and their grasp information from images robustly in real-time is critical for occupants’ safety and in-vehicular infotainment applications. It must, however, be noted that naturalistic driving scenes suffer from rapidly changing illumination and occlusion. This is aggravated by the fact that hands are highly deformable objects, and change in appearance frequently. This work addresses the task of accurately localizing driver hands and classifying the grasp state of each hand. We use a fast ConvNet to first detect likely hand regions. Next, a pixel-based skin classifier that takes into account the global illumination changes is used to refine the hand detections and remove false positives. This step generates a pixel-level mask for each hand. Finally, we study each such masked region and detect if the driver is grasping the wheel, or in some cases a mobile phone. Through evaluation we demonstrate that our method can outperform state-of-the-art pixel-based hand detectors, while running faster (at 35 fps) than other deep ConvNet based frameworks even for grasp analysis. Hand mask cues are shown to be crucial when analyzing a set of driver hand gestures (wheel/mobile phone grasp and no-grasp) in naturalistic driving settings. The proposed detection and localization pipeline hence can act as a general framework for real-time hand detection and gesture classification. |
Tasks | |
Published | 2018-02-22 |
URL | http://arxiv.org/abs/1802.07854v1 |
http://arxiv.org/pdf/1802.07854v1.pdf | |
PWC | https://paperswithcode.com/paper/driver-hand-localization-and-grasp-analysis-a |
Repo | |
Framework | |
AclNet: efficient end-to-end audio classification CNN
Title | AclNet: efficient end-to-end audio classification CNN |
Authors | Jonathan J Huang, Juan Jose Alvarado Leanos |
Abstract | We propose an efficient end-to-end convolutional neural network architecture, AclNet, for audio classification. When trained with our data augmentation and regularization, we achieved state-of-the-art performance on the ESC-50 corpus with 85.65% accuracy. Our network allows configurations such that memory and compute requirements are drastically reduced, and a tradeoff analysis of accuracy and complexity is presented. The analysis shows high accuracy at significantly reduced computational complexity compared to existing solutions. For example, a configuration with only 155k parameters and 49.3 million multiply-adds per second is 81.75%, exceeding human accuracy of 81.3%. This improved efficiency can enable always-on inference in energy-efficient platforms. |
Tasks | Audio Classification, Data Augmentation |
Published | 2018-11-16 |
URL | http://arxiv.org/abs/1811.06669v1 |
http://arxiv.org/pdf/1811.06669v1.pdf | |
PWC | https://paperswithcode.com/paper/aclnet-efficient-end-to-end-audio |
Repo | |
Framework | |
Probabilistic Graphs for Sensor Data-driven Modelling of Power Systems at Scale
Title | Probabilistic Graphs for Sensor Data-driven Modelling of Power Systems at Scale |
Authors | Francesco Fusco |
Abstract | The growing complexity of the power grid, driven by an increasing share of distributed energy resources and by massive deployment of intelligent internet-connected devices, requires new modelling tools for planning and operation. Physics-based state estimation models currently used for data filtering, prediction and anomaly detection are hard to maintain and adapt to the ever-changing complex dynamics of the power system. A data-driven approach based on probabilistic graphs is proposed, where custom non-linear, localised models of the joint density of subsets of system variables can be combined to model arbitrarily large and complex systems. The graphical model allows domain knowledge to be embedded naturally, in the form of variable dependency structure or local quantitative relationships. A specific instance where neural-network models are used to represent the local joint densities is proposed, although the methodology generalises to other model classes. Accuracy and scalability are evaluated on a large-scale data set representative of the European transmission grid. |
Tasks | Anomaly Detection |
Published | 2018-11-18 |
URL | http://arxiv.org/abs/1811.07267v1 |
http://arxiv.org/pdf/1811.07267v1.pdf | |
PWC | https://paperswithcode.com/paper/probabilistic-graphs-for-sensor-data-driven |
Repo | |
Framework | |
U-SegNet: Fully Convolutional Neural Network based Automated Brain tissue segmentation Tool
Title | U-SegNet: Fully Convolutional Neural Network based Automated Brain tissue segmentation Tool |
Authors | Pulkit Kumar, Pravin Nagar, Chetan Arora, Anubha Gupta |
Abstract | Automated brain tissue segmentation into white matter (WM), gray matter (GM), and cerebro-spinal fluid (CSF) from magnetic resonance images (MRI) is helpful in the diagnosis of neuro-disorders such as epilepsy, Alzheimer’s, multiple sclerosis, etc. However, thin GM structures at the periphery of the cortex and smooth transitions on tissue boundaries, such as between GM and WM or WM and CSF, pose difficulty in building a reliable segmentation tool. This paper proposes a Fully Convolutional Neural Network (FCN) tool that is a hybrid of two widely used deep learning segmentation architectures, SegNet and U-Net, for improved brain tissue segmentation. We propose a skip connection, inspired by U-Net, in the SegNet architecture to incorporate fine multiscale information for better tissue boundary identification. We show that the proposed U-SegNet architecture improves segmentation performance, as measured by average dice ratio, to 89.74% on the widely used IBSR dataset consisting of T-1 weighted MRI volumes of 18 subjects. |
Tasks | |
Published | 2018-06-12 |
URL | http://arxiv.org/abs/1806.04429v1 |
http://arxiv.org/pdf/1806.04429v1.pdf | |
PWC | https://paperswithcode.com/paper/u-segnet-fully-convolutional-neural-network |
Repo | |
Framework | |
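The abstract above reports segmentation quality as an average dice ratio. For reference, the sketch below computes the Dice ratio for one tissue label from two flat label lists, using the standard definition 2|A ∩ B| / (|A| + |B|). The function name, the flat-list input format and the integer label codes are assumptions for illustration and are not taken from the U-SegNet code.

```python
def dice_ratio(predicted, truth, label):
    """Dice ratio for one class label between two equal-length label lists.

    predicted, truth: sequences of per-voxel class labels (e.g. 0=CSF,
    1=GM, 2=WM in this toy setting). Returns 1.0 when the label is absent
    from both, since the two (empty) regions then agree perfectly.
    """
    if len(predicted) != len(truth):
        raise ValueError("label lists must have the same length")
    predicted_size = sum(1 for p in predicted if p == label)
    truth_size = sum(1 for t in truth if t == label)
    overlap = sum(
        1 for p, t in zip(predicted, truth) if p == label and t == label
    )
    denominator = predicted_size + truth_size
    if denominator == 0:
        return 1.0
    return 2.0 * overlap / denominator


# Toy example with four voxels and three classes.
toy_prediction = [1, 1, 2, 0]
toy_truth = [1, 2, 2, 0]
per_class = [dice_ratio(toy_prediction, toy_truth, c) for c in (0, 1, 2)]
average_dice = sum(per_class) / len(per_class)
```

Averaging the per-class values, as in the last line, is one common way to obtain a single score; whether the paper averages over classes, subjects or both is not stated in this listing.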
A Moral Framework for Understanding of Fair ML through Economic Models of Equality of Opportunity
Title | A Moral Framework for Understanding of Fair ML through Economic Models of Equality of Opportunity |
Authors | Hoda Heidari, Michele Loi, Krishna P. Gummadi, Andreas Krause |
Abstract | We map the recently proposed notions of algorithmic fairness to economic models of Equality of Opportunity (EOP), an extensively studied ideal of fairness in political philosophy. We formally show that through our conceptual mapping, many existing definitions of algorithmic fairness, such as predictive value parity and equality of odds, can be interpreted as special cases of EOP. In this respect, our work serves as a unifying moral framework for understanding existing notions of algorithmic fairness. Most importantly, this framework allows us to explicitly spell out the moral assumptions underlying each notion of fairness, and interpret recent fairness impossibility results in a new light. Last but not least, and inspired by luck egalitarian models of EOP, we propose a new family of measures for algorithmic fairness. We illustrate our proposal empirically and show that employing a measure of algorithmic (un)fairness when its underlying moral assumptions are not satisfied can have devastating consequences for the disadvantaged group’s welfare. |
Tasks | |
Published | 2018-09-10 |
URL | http://arxiv.org/abs/1809.03400v2 |
http://arxiv.org/pdf/1809.03400v2.pdf | |
PWC | https://paperswithcode.com/paper/a-moral-framework-for-understanding-of-fair |
Repo | |
Framework | |
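The abstract above names predictive value parity and equality of odds as existing fairness definitions. The sketch below computes the per-group quantities those two definitions compare: equality of odds asks for equal true and false positive rates across groups, and predictive value parity asks for equal positive and negative predictive values. The function names and the toy data are hypothetical, and the code does not implement the paper's EOP-based measures.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or None when the rate is undefined."""
    return numerator / denominator if denominator else None


def group_rates(y_true, y_pred, groups):
    """Per-group confusion-matrix rates for binary labels and predictions.

    Returns {group: {"tpr", "fpr", "ppv", "npv"}}.
    Equality of odds compares tpr and fpr across groups;
    predictive value parity compares ppv and npv across groups.
    """
    counts = {}
    for y, y_hat, g in zip(y_true, y_pred, groups):
        c = counts.setdefault(g, {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
        # "t"/"f" = prediction correct or not, "p"/"n" = predicted class.
        key = ("t" if y == y_hat else "f") + ("p" if y_hat == 1 else "n")
        c[key] += 1
    return {
        g: {
            "tpr": _ratio(c["tp"], c["tp"] + c["fn"]),
            "fpr": _ratio(c["fp"], c["fp"] + c["tn"]),
            "ppv": _ratio(c["tp"], c["tp"] + c["fp"]),
            "npv": _ratio(c["tn"], c["tn"] + c["fn"]),
        }
        for g, c in counts.items()
    }


# Toy data: two groups of four individuals each.
rates = group_rates(
    y_true=[1, 1, 0, 0, 1, 1, 0, 0],
    y_pred=[1, 0, 0, 0, 1, 1, 1, 0],
    groups=["a", "a", "a", "a", "b", "b", "b", "b"],
)
```

In this toy data the two groups differ on all four rates, so the predictor satisfies neither definition.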