October 17, 2019

Paper Group ANR 810

Cavity Filling: Pseudo-Feature Generation for Multi-Class Imbalanced Data Problems in Deep Learning. Subsurface structure analysis using computational interpretation and learning: A visual signal processing perspective. A Multimodal Approach to Predict Social Media Popularity. Did you take the pill? - Detecting Personal Intake of Medicine from Twit …

Cavity Filling: Pseudo-Feature Generation for Multi-Class Imbalanced Data Problems in Deep Learning

Title Cavity Filling: Pseudo-Feature Generation for Multi-Class Imbalanced Data Problems in Deep Learning
Authors Tomohiko Konno, Michiaki Iwazume
Abstract Herein, we generate pseudo-features based on the multivariate probability distributions obtained from the feature maps in layers of trained deep neural networks. Further, we augment the minor-class data based on these generated pseudo-features to overcome the imbalanced data problems. The proposed method, i.e., cavity filling, improves the deep learning capabilities in several problems because all the real-world data are observed to be imbalanced.
Tasks
Published 2018-07-17
URL https://arxiv.org/abs/1807.06538v6
PDF https://arxiv.org/pdf/1807.06538v6.pdf
PWC https://paperswithcode.com/paper/pseudo-feature-generation-for-imbalanced-data
Repo
Framework
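
The pseudo-feature idea in the abstract above can be illustrated with a minimal sketch. It assumes the multivariate distribution is a Gaussian fitted to minor-class feature vectors, which the abstract does not state; the function names (`mean_and_cov`, `cholesky`, `sample_pseudo_features`) and the toy feature values are hypothetical and are not taken from the paper.

```python
import math
import random


def mean_and_cov(rows):
    """Sample mean vector and covariance matrix of a list of feature vectors."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mean, cov


def cholesky(a, jitter=1e-9):
    """Lower-triangular L with L L^T close to a + jitter * I (a symmetric PSD)."""
    d = len(a)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                # Guard against tiny negative values caused by rounding.
                low[i][j] = math.sqrt(max(a[i][i] + jitter - s, jitter))
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def sample_pseudo_features(minor_features, count, rng):
    """Draw `count` pseudo-features from a Gaussian fitted to the minor class.

    Assumption: a Gaussian stands in for the paper's multivariate distribution.
    """
    mean, cov = mean_and_cov(minor_features)
    low = cholesky(cov)
    d = len(mean)
    samples = []
    for _ in range(count):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        samples.append([mean[i] + sum(low[i][k] * z[k] for k in range(i + 1))
                        for i in range(d)])
    return samples


# Toy minor-class features (made up for illustration).
minor = [[1.0, 2.0], [1.2, 1.9], [0.8, 2.3], [1.1, 2.1], [0.9, 1.7], [1.3, 2.4]]
pseudo = sample_pseudo_features(minor, 50, random.Random(0))
```

In a real pipeline the sampled vectors would be fed to the layers above the one the features were taken from, alongside the real minor-class features.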

Subsurface structure analysis using computational interpretation and learning: A visual signal processing perspective

Title Subsurface structure analysis using computational interpretation and learning: A visual signal processing perspective
Authors G. AlRegib, M. Deriche, Z. Long, H. Di, Z. Wang, Y. Alaudah, M. Shafiq, M. Alfarraj
Abstract Understanding Earth’s subsurface structures has been and continues to be an essential component of various applications such as environmental monitoring, carbon sequestration, and oil and gas exploration. By viewing the seismic volumes that are generated through the processing of recorded seismic traces, researchers were able to learn from applying advanced image processing and computer vision algorithms to effectively analyze and understand Earth’s subsurface structures. In this paper, first, we summarize the recent advances in this direction that relied heavily on the fields of image processing and computer vision. Second, we discuss the challenges in seismic interpretation and provide insights and some directions to address such challenges using emerging machine learning algorithms.
Tasks Seismic Interpretation
Published 2018-12-20
URL http://arxiv.org/abs/1812.08756v1
PDF http://arxiv.org/pdf/1812.08756v1.pdf
PWC https://paperswithcode.com/paper/subsurface-structure-analysis-using
Repo
Framework

A Multimodal Approach to Predict Social Media Popularity

Title A Multimodal Approach to Predict Social Media Popularity
Authors Mayank Meghawat, Satyendra Yadav, Debanjan Mahata, Yifang Yin, Rajiv Ratn Shah, Roger Zimmermann
Abstract Multiple modalities represent different aspects by which information is conveyed by a data source. Modern day social media platforms are one of the primary sources of multimodal data, where users use different modes of expression by posting textual as well as multimedia content such as images and videos for sharing information. Multimodal information embedded in such posts could be useful in predicting their popularity. To the best of our knowledge, no such multimodal dataset exists for the prediction of social media photos. In this work, we propose a multimodal dataset consisting of content, context, and social information for popularity prediction. Specifically, we augment the SMP-T1 dataset for social media prediction in ACM Multimedia grand challenge 2017 with image content, titles, descriptions, and tags. Next, in this paper, we propose a multimodal approach which exploits visual features (i.e., content information), textual features (i.e., contextual information), and social features (e.g., average views and group counts) to predict popularity of social media photos in terms of view counts. Experimental results confirm that although our multimodal approach uses half of the training dataset from SMP-T1, it achieves performance comparable to the state of the art.
Tasks
Published 2018-07-16
URL http://arxiv.org/abs/1807.05959v1
PDF http://arxiv.org/pdf/1807.05959v1.pdf
PWC https://paperswithcode.com/paper/a-multimodal-approach-to-predict-social-media
Repo
Framework

Did you take the pill? - Detecting Personal Intake of Medicine from Twitter

Title Did you take the pill? - Detecting Personal Intake of Medicine from Twitter
Authors Debanjan Mahata, Jasper Friedrichs, Rajiv Ratn Shah, Jing Jiang
Abstract Mining social media messages such as tweets, articles, and Facebook posts for health and drug related information has received significant interest in pharmacovigilance research. Social media sites (e.g., Twitter) have been used for monitoring drug abuse, adverse reactions of drug usage and analyzing expression of sentiments related to drugs. Most of these studies are based on aggregated results from a large population rather than specific sets of individuals. In order to conduct studies at an individual level or specific cohorts, identifying posts mentioning intake of medicine by the user is necessary. Towards this objective we develop a classifier for identifying mentions of personal intake of medicine in tweets. We train a stacked ensemble of shallow convolutional neural network (CNN) models on an annotated dataset. We use random search for tuning the hyper-parameters of the CNN models and present an ensemble of best models for the prediction task. Our system produces state-of-the-art results, with a micro-averaged F-score of 0.693. We believe that the developed classifier has direct uses in the areas of psychology, health informatics, pharmacovigilance and affective computing for tracking moods, emotions and sentiments of patients expressing intake of medicine in social media.
Tasks
Published 2018-08-03
URL http://arxiv.org/abs/1808.02082v1
PDF http://arxiv.org/pdf/1808.02082v1.pdf
PWC https://paperswithcode.com/paper/did-you-take-the-pill-detecting-personal
Repo
Framework
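
The micro-averaged F-score reported in the abstract above pools true positives, false positives and false negatives across classes before computing precision and recall. A small sketch of that computation follows; the per-class counts are invented for illustration, and which classes the paper pools is not stated in the abstract.

```python
def micro_f1(per_class_counts):
    """Micro-averaged F1 from a list of (tp, fp, fn) tuples, one per class."""
    tp = sum(c[0] for c in per_class_counts)
    fp = sum(c[1] for c in per_class_counts)
    fn = sum(c[2] for c in per_class_counts)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for two classes: (tp, fp, fn).
score = micro_f1([(8, 2, 4), (2, 2, 0)])
```

Because the counts are pooled first, frequent classes dominate the score, unlike macro-averaging, which weights every class equally.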

Neural Task Graphs: Generalizing to Unseen Tasks from a Single Video Demonstration

Title Neural Task Graphs: Generalizing to Unseen Tasks from a Single Video Demonstration
Authors De-An Huang, Suraj Nair, Danfei Xu, Yuke Zhu, Animesh Garg, Li Fei-Fei, Silvio Savarese, Juan Carlos Niebles
Abstract Our goal is to generate a policy to complete an unseen task given just a single video demonstration of the task in a given domain. We hypothesize that to successfully generalize to unseen complex tasks from a single video demonstration, it is necessary to explicitly incorporate the compositional structure of the tasks into the model. To this end, we propose Neural Task Graph (NTG) Networks, which use a conjugate task graph as the intermediate representation to modularize both the video demonstration and the derived policy. We empirically show NTG achieves inter-task generalization on two complex tasks: Block Stacking in BulletPhysics and Object Collection in AI2-THOR. NTG improves data efficiency with visual input and achieves strong generalization without the need for dense hierarchical supervision. We further show that similar performance trends hold when applied to real-world data. We show that NTG can effectively predict task structure on the JIGSAWS surgical dataset and generalize to unseen tasks.
Tasks
Published 2018-07-10
URL http://arxiv.org/abs/1807.03480v2
PDF http://arxiv.org/pdf/1807.03480v2.pdf
PWC https://paperswithcode.com/paper/neural-task-graphs-generalizing-to-unseen
Repo
Framework

EasiCSDeep: A deep learning model for Cervical Spondylosis Identification using surface electromyography signal

Title EasiCSDeep: A deep learning model for Cervical Spondylosis Identification using surface electromyography signal
Authors Nana Wang, Li Cui, Xi Huang, Yingcong Xiang, Jing Xiao
Abstract Cervical spondylosis (CS) is a common chronic disease that affects up to two-thirds of the population and poses a serious burden on individuals and society. Early identification has significant value in improving the cure rate and reducing costs. However, the pathology is complex, and the mild symptoms increase the difficulty of diagnosis, especially in the early stage. Besides, the time-consuming and costly nature of hospital medical services reduces the attention paid to CS identification. Thus, a convenient, low-cost intelligent CS identification method is urgently needed. In this paper, we present an intelligent method based on deep learning to identify CS, using the surface electromyography (sEMG) signal. Faced with the complexity, high dimensionality and weak usability of the sEMG signal, we proposed and developed a multi-channel EasiCSDeep algorithm based on the convolutional neural network, which consists of feature extraction, spatial relationship representation and a classification algorithm. To the best of our knowledge, EasiCSDeep is the first effort to employ deep learning and sEMG data to identify CS. Compared with the previous state-of-the-art algorithm, our algorithm achieves a significant improvement.
Tasks Cervical Spondylosis Identification
Published 2018-12-12
URL http://arxiv.org/abs/1812.04912v1
PDF http://arxiv.org/pdf/1812.04912v1.pdf
PWC https://paperswithcode.com/paper/easicsdeep-a-deep-learning-model-for-cervical
Repo
Framework

Task Recommendation in Crowdsourcing Based on Learning Preferences and Reliabilities

Title Task Recommendation in Crowdsourcing Based on Learning Preferences and Reliabilities
Authors Qiyu Kang, Wee Peng Tay
Abstract Workers participating in a crowdsourcing platform can have a wide range of abilities and interests. An important problem in crowdsourcing is the task recommendation problem, in which tasks that best match a particular worker’s preferences and reliabilities are recommended to that worker. A task recommendation scheme that assigns tasks more likely to be accepted by a worker who is more likely to complete it reliably results in better performance for the task requester. Without prior information about a worker, his preferences and reliabilities need to be learned over time. In this paper, we propose a multi-armed bandit (MAB) framework to learn a worker’s preferences and his reliabilities for different categories of tasks. However, unlike the classical MAB problem, the reward from the worker’s completion of a task is unobservable. We therefore include the use of gold tasks (i.e., tasks whose solutions are known a priori and which do not produce any rewards) in our task recommendation procedure. Our model could be viewed as a new variant of MAB, in which the random rewards can only be observed at those time steps where gold tasks are used, and the accuracy of estimating the expected reward of recommending a task to a worker depends on the number of gold tasks used. We show that the optimal regret is $O(\sqrt{n})$, where $n$ is the number of tasks recommended to the worker. We develop three task recommendation strategies to determine the number of gold tasks for different task categories, and show that they are order optimal. Simulations verify the efficiency of our approaches.
Tasks
Published 2018-07-27
URL http://arxiv.org/abs/1807.10444v1
PDF http://arxiv.org/pdf/1807.10444v1.pdf
PWC https://paperswithcode.com/paper/task-recommendation-in-crowdsourcing-based-on
Repo
Framework
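
The bandit variant described in the abstract above, where outcomes are observable only on gold tasks, can be made concrete with a toy simulation. This is not one of the paper's three strategies: the schedule below (keep the number of gold tasks near the square root of the step count, cycling through categories) is an assumption chosen only to echo the $O(\sqrt{n})$ regret scale, and all names and probabilities are hypothetical.

```python
import math
import random


def recommend(true_success, n_steps, rng):
    """Simulate task recommendation for one worker.

    true_success[k] is the hidden probability that the worker accepts and
    correctly completes a task of category k. Outcomes are observed only on
    gold-task steps; regular tasks earn reward that the learner never sees.
    """
    k = len(true_success)
    gold_counts = [0] * k
    gold_successes = [0] * k
    gold_used = 0
    hidden_reward = 0  # tracked only to score the simulation

    def estimate(c):
        # Neutral prior of 0.5 until a category has been probed.
        return gold_successes[c] / gold_counts[c] if gold_counts[c] else 0.5

    for t in range(1, n_steps + 1):
        # Assumed schedule: allow ceil(sqrt(t)) gold tasks by step t.
        if gold_used < math.isqrt(t - 1) + 1:
            c = gold_used % k  # probe categories in round-robin order
            outcome = rng.random() < true_success[c]
            gold_counts[c] += 1
            gold_successes[c] += int(outcome)
            gold_used += 1  # gold tasks produce no reward
        else:
            c = max(range(k), key=estimate)
            hidden_reward += int(rng.random() < true_success[c])
    return [estimate(c) for c in range(k)], gold_used, hidden_reward


estimates, gold_used, reward = recommend([0.2, 0.9], 400, random.Random(1))
```

The trade-off is visible in the structure of the loop: every gold task sharpens the estimates but forgoes a reward-earning recommendation.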

3D Shape Reconstruction from a Single 2D Image via 2D-3D Self-Consistency

Title 3D Shape Reconstruction from a Single 2D Image via 2D-3D Self-Consistency
Authors Yi-Lun Liao, Yao-Cheng Yang, Yu-Chiang Frank Wang
Abstract Aiming at inferring 3D shapes from 2D images, 3D shape reconstruction has drawn huge attention from researchers in computer vision and deep learning communities. However, it is not practical to assume that 2D input images and their associated ground truth 3D shapes are always available during training. In this paper, we propose a framework for semi-supervised 3D reconstruction. This is realized by our introduced 2D-3D self-consistency, which aligns the predicted 3D models and the projected 2D foreground segmentation masks. Moreover, our model not only enables recovering 3D shapes with the corresponding 2D masks, but also allows camera pose information to be jointly disentangled and predicted, even though such supervision is never available during training. In the experiments, we qualitatively and quantitatively demonstrate the effectiveness of our model, which performs favorably against state-of-the-art approaches in either supervised or semi-supervised settings.
Tasks 3D Reconstruction, 3D Shape Reconstruction From A Single 2D Image
Published 2018-11-29
URL http://arxiv.org/abs/1811.12016v1
PDF http://arxiv.org/pdf/1811.12016v1.pdf
PWC https://paperswithcode.com/paper/3d-shape-reconstruction-from-a-single-2d
Repo
Framework

Fine-tuning on Clean Data for End-to-End Speech Translation: FBK @ IWSLT 2018

Title Fine-tuning on Clean Data for End-to-End Speech Translation: FBK @ IWSLT 2018
Authors Mattia Antonino Di Gangi, Roberto Dessì, Roldano Cattoni, Matteo Negri, Marco Turchi
Abstract This paper describes FBK’s submission to the end-to-end English-German speech translation task at IWSLT 2018. Our system relies on a state-of-the-art model based on LSTMs and CNNs, where the CNNs are used to reduce the temporal dimension of the audio input, which is in general much higher than machine translation input. Our model was trained only on the audio-to-text parallel data released for the task, and fine-tuned on cleaned subsets of the original training corpus. The addition of weight normalization and label smoothing improved the baseline system by 1.0 BLEU point on our validation set. The final submission also featured checkpoint averaging within a training run and ensemble decoding of models trained during multiple runs. On test data, our best single model obtained a BLEU score of 9.7, while the ensemble obtained a BLEU score of 10.24.
Tasks Machine Translation
Published 2018-10-16
URL http://arxiv.org/abs/1810.07652v1
PDF http://arxiv.org/pdf/1810.07652v1.pdf
PWC https://paperswithcode.com/paper/fine-tuning-on-clean-data-for-end-to-end
Repo
Framework
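
Two of the training techniques named in the abstract above, label smoothing and checkpoint averaging, are simple enough to sketch directly. The version below assumes uniform label smoothing and an unweighted mean over checkpoints stored as flat lists; the smoothing value `epsilon=0.1` and the parameter name `"w"` are illustrative assumptions, not values from the paper.

```python
def smooth_labels(target_index, num_classes, epsilon=0.1):
    """Target distribution: 1 - epsilon on the true class, epsilon spread uniformly."""
    uniform = epsilon / num_classes
    dist = [uniform] * num_classes
    dist[target_index] += 1.0 - epsilon
    return dist


def average_checkpoints(checkpoints):
    """Element-wise mean of parameter dicts that share keys and lengths."""
    n = len(checkpoints)
    return {
        key: [sum(values) / n for values in zip(*(c[key] for c in checkpoints))]
        for key in checkpoints[0]
    }


smoothed = smooth_labels(1, 4, epsilon=0.1)
averaged = average_checkpoints([{"w": [1.0, 2.0]}, {"w": [3.0, 6.0]}])
```

Averaging happens in parameter space within one run, whereas the ensemble decoding mentioned in the abstract combines the predictions of separately trained models.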

Bounds on the Approximation Power of Feedforward Neural Networks

Title Bounds on the Approximation Power of Feedforward Neural Networks
Authors Mohammad Mehrabi, Aslan Tchamkerten, Mansoor I. Yousefi
Abstract The approximation power of general feedforward neural networks with piecewise linear activation functions is investigated. First, lower bounds on the size of a network are established in terms of the approximation error and network depth and width. These bounds improve upon state-of-the-art bounds for certain classes of functions, such as strongly convex functions. Second, an upper bound is established on the difference of two neural networks with identical weights but different activation functions.
Tasks
Published 2018-06-29
URL http://arxiv.org/abs/1806.11416v1
PDF http://arxiv.org/pdf/1806.11416v1.pdf
PWC https://paperswithcode.com/paper/bounds-on-the-approximation-power-of
Repo
Framework

Linear Spectral Estimators and an Application to Phase Retrieval

Title Linear Spectral Estimators and an Application to Phase Retrieval
Authors Ramina Ghods, Andrew S. Lan, Tom Goldstein, Christoph Studer
Abstract Phase retrieval refers to the problem of recovering real- or complex-valued vectors from magnitude measurements. The best-known algorithms for this problem are iterative in nature and rely on so-called spectral initializers that provide accurate initialization vectors. We propose a novel class of estimators suitable for general nonlinear measurement systems, called linear spectral estimators (LSPEs), which can be used to compute accurate initialization vectors for phase retrieval problems. The proposed LSPEs not only provide accurate initialization vectors for noisy phase retrieval systems with structured or random measurement matrices, but also enable the derivation of sharp and nonasymptotic mean-squared error bounds. We demonstrate the efficacy of LSPEs on synthetic and real-world phase retrieval problems, and show that our estimators significantly outperform existing methods for structured measurement systems that arise in practice.
Tasks
Published 2018-06-09
URL http://arxiv.org/abs/1806.03547v1
PDF http://arxiv.org/pdf/1806.03547v1.pdf
PWC https://paperswithcode.com/paper/linear-spectral-estimators-and-an-application
Repo
Framework
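
For context on the spectral initializers mentioned in the abstract above, here is a sketch of the classical baseline for real-valued phase retrieval: take the leading eigenvector of the measurement-weighted matrix $D = \frac{1}{m}\sum_i y_i a_i a_i^T$ with $y_i = (a_i^T x)^2$. This is not the LSPE proposed in the paper. The Gaussian measurement model, the problem sizes and all names are assumptions for illustration.

```python
import math
import random


def spectral_initializer(measurement_rows, squared_magnitudes, iterations=200):
    """Leading eigenvector of D = (1/m) sum_i y_i a_i a_i^T via power iteration."""
    m = len(measurement_rows)
    n = len(measurement_rows[0])
    d = [[0.0] * n for _ in range(n)]
    for a, y in zip(measurement_rows, squared_magnitudes):
        for i in range(n):
            w = y * a[i] / m
            for j in range(n):
                d[i][j] += w * a[j]
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(d[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Synthetic demo: Gaussian measurement vectors, noiseless squared magnitudes.
rng = random.Random(0)
raw = [1.0, -2.0, 0.5, 3.0, 1.0]
scale = math.sqrt(sum(x * x for x in raw))
x_true = [x / scale for x in raw]
rows = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(2000)]
y = [sum(a_i * x_i for a_i, x_i in zip(a, x_true)) ** 2 for a in rows]
v0 = spectral_initializer(rows, y)
# The sign of x cannot be recovered from magnitudes, so compare up to sign.
alignment = abs(sum(p * q for p, q in zip(v0, x_true)))
```

An iterative solver would start from `v0`; the abstract's claim is that LSPEs give better starting points than such baselines for structured measurement matrices.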

Hierarchical Multi Task Learning With CTC

Title Hierarchical Multi Task Learning With CTC
Authors Ramon Sanabria, Florian Metze
Abstract In Automatic Speech Recognition it is still challenging to learn useful intermediate representations when using high-level (or abstract) target units such as words. For that reason, character or phoneme based systems tend to outperform word-based systems when only a few hundred hours of training data are being used. In this paper, we first show how hierarchical multi-task training can encourage the formation of useful intermediate representations. We achieve this by performing Connectionist Temporal Classification at different levels of the network with targets of different granularity. Our model thus performs predictions in multiple scales for the same input. On the standard 300h Switchboard training setup, our hierarchical multi-task architecture exhibits improvements over single-task architectures with the same number of parameters. Our model obtains 14.0% Word Error Rate on the Eval2000 Switchboard subset without any decoder or language model, outperforming the current state-of-the-art on acoustic-to-word models.
Tasks Language Modelling, Multi-Task Learning, Speech Recognition
Published 2018-07-18
URL http://arxiv.org/abs/1807.07104v5
PDF http://arxiv.org/pdf/1807.07104v5.pdf
PWC https://paperswithcode.com/paper/hierarchical-multi-task-learning-with-ctc
Repo
Framework
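
As background for the CTC outputs discussed in the abstract above, the sketch below shows standard greedy (best-path) CTC decoding: pick the top label per frame, merge consecutive repeats, then drop blanks. Whether the paper reads out its word-level output in exactly this way is an assumption; the frame scores and the choice of index 0 as the blank label are illustrative.

```python
def ctc_greedy_decode(frame_scores, blank=0):
    """Best-path CTC decoding: argmax per frame, merge repeats, remove blanks."""
    best_path = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    decoded = []
    previous = None
    for label in best_path:
        # A label is emitted only when it starts a new run and is not blank.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


# Eight frames over three labels (0 is the blank); the scores are made up.
frames = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.10, 0.80, 0.10],
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.10, 0.10, 0.80],
    [0.10, 0.10, 0.80],
    [0.90, 0.05, 0.05],
]
tokens = ctc_greedy_decode(frames)
```

The blank between the two runs of label 1 is what lets CTC emit the same label twice in a row. In a hierarchical setup, each CTC head at a different depth would be decoded against its own vocabulary in the same manner.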

Solving Poisson’s Equation using Deep Learning in Particle Simulation of PN Junction

Title Solving Poisson’s Equation using Deep Learning in Particle Simulation of PN Junction
Authors Zhongyang Zhang, Ling Zhang, Ze Sun, Nicholas Erickson, Ryan From, Jun Fan
Abstract Simulating the dynamic characteristics of a PN junction at the microscopic level requires solving Poisson’s equation at every time step. Solving at every time step is a necessary but time-consuming process when using the traditional finite difference method (FDM). Deep learning is a powerful technique to fit complex functions. In this work, deep learning is utilized to accelerate solving Poisson’s equation in a PN junction. The role of the boundary condition is emphasized in the loss function to ensure a better fitting. The resulting I-V curve for the PN junction, using the deep learning solver presented in this work, shows a perfect match to the I-V curve obtained using the finite difference method, with the advantage of being 10 times faster at every time step.
Tasks
Published 2018-10-24
URL http://arxiv.org/abs/1810.10192v2
PDF http://arxiv.org/pdf/1810.10192v2.pdf
PWC https://paperswithcode.com/paper/solving-poissons-equation-using-deep-learning
Repo
Framework
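
The finite difference baseline that the abstract above compares against can be sketched in one dimension. The example solves $-\phi'' = s$ with Dirichlet boundary values, where $s$ stands for charge density divided by permittivity in normalized units. The 1D setting, the uniform grid and the function name are assumptions; the paper's actual simulation geometry is not given in the abstract.

```python
def solve_poisson_1d(source, left, right, spacing):
    """Solve -phi'' = source at interior grid points with Dirichlet boundaries.

    Discretization: -phi[i-1] + 2*phi[i] - phi[i+1] = spacing**2 * source[i],
    solved with the Thomas algorithm for tridiagonal systems.
    """
    n = len(source)
    rhs = [spacing * spacing * s for s in source]
    rhs[0] += left    # known boundary value moves to the right-hand side
    rhs[-1] += right
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = -1.0 / 2.0
    d_prime[0] = rhs[0] / 2.0
    for i in range(1, n):
        denom = 2.0 + c_prime[i - 1]
        c_prime[i] = -1.0 / denom
        d_prime[i] = (rhs[i] + d_prime[i - 1]) / denom
    phi = [0.0] * n
    phi[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        phi[i] = d_prime[i] - c_prime[i] * phi[i + 1]
    return phi


# Uniform source on (0, 1) with grounded ends: exact solution x * (1 - x) / 2.
h = 0.1
potential = solve_poisson_1d([1.0] * 9, left=0.0, right=0.0, spacing=h)
```

A learned solver of the kind the abstract describes would replace this linear solve at each time step, with the boundary values entering through the loss function instead of the right-hand side.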

Interpretable deep learning for guided structure-property explorations in photovoltaics

Title Interpretable deep learning for guided structure-property explorations in photovoltaics
Authors Balaji Sesha Sarath Pokuri, Sambuddha Ghosal, Apurva Kokate, Baskar Ganapathysubramanian, Soumik Sarkar
Abstract The performance of an organic photovoltaic device is intricately connected to its active layer morphology. This connection between the active layer and device performance is very expensive to evaluate, either experimentally or computationally. Hence, designing morphologies to achieve higher performances is non-trivial and often intractable. To solve this, we first introduce a deep convolutional neural network (CNN) architecture that can serve as a fast and robust surrogate for the complex structure-property map. Several tests were performed to gain trust in this trained model. Then, we utilize this fast framework to perform robust microstructural design to enhance device performance.
Tasks
Published 2018-11-14
URL http://arxiv.org/abs/1811.06067v3
PDF http://arxiv.org/pdf/1811.06067v3.pdf
PWC https://paperswithcode.com/paper/interpretable-deep-learning-for-guided
Repo
Framework

Arena Model: Inference About Competitions

Title Arena Model: Inference About Competitions
Authors Chenhe Zhang, Peiyuan Sun
Abstract The authors propose a parametric model called the arena model for prediction in paired competitions, i.e. paired comparisons with eliminations and bifurcations. The arena model has a number of appealing advantages. First, it predicts the results of competitions without rating many individuals. Second, it takes full advantage of the structure of competitions. Third, the model provides an easy method to quantify the uncertainty in competitions. Fourth, some of our methods can be directly generalized for comparisons among three or more individuals. Furthermore, the authors identify an invariant Bayes estimator with regard to the prior distribution and prove the consistency of the estimations of uncertainty. Currently, the arena model is not effective in tracking the change of strengths of individuals, but its basic framework provides a solid foundation for future study of such cases.
Tasks
Published 2018-11-25
URL http://arxiv.org/abs/1811.11019v1
PDF http://arxiv.org/pdf/1811.11019v1.pdf
PWC https://paperswithcode.com/paper/arena-model-inference-about-competitions
Repo
Framework