Paper Group ANR 341
Brain-Inspired Hardware for Artificial Intelligence: Accelerated Learning in a Physical-Model Spiking Neural Network
Title | Brain-Inspired Hardware for Artificial Intelligence: Accelerated Learning in a Physical-Model Spiking Neural Network |
Authors | Timo C. Wunderlich, Akos F. Kungl, Eric Müller, Johannes Schemmel, Mihai Petrovici |
Abstract | Future developments in artificial intelligence will profit from the existence of novel, non-traditional substrates for brain-inspired computing. Neuromorphic computers aim to provide such a substrate that reproduces the brain’s capabilities in terms of adaptive, low-power information processing. We present results from a prototype chip of the BrainScaleS-2 mixed-signal neuromorphic system that adopts a physical-model approach with a 1000-fold acceleration of spiking neural network dynamics relative to biological real time. Using the embedded plasticity processor, we both simulate the Pong arcade video game and implement a local plasticity rule that enables reinforcement learning, allowing the on-chip neural network to learn to play the game. The experiment demonstrates key aspects of the employed approach, such as accelerated and flexible learning, high energy efficiency and resilience to noise. |
Tasks | |
Published | 2019-09-24 |
URL | https://arxiv.org/abs/1909.11145v2 |
https://arxiv.org/pdf/1909.11145v2.pdf | |
PWC | https://paperswithcode.com/paper/brain-inspired-hardware-for-artificial |
Repo | |
Framework | |
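The abstract above mentions a local plasticity rule, executed on the embedded plasticity processor, that enables reinforcement learning on Pong. The BrainScaleS-2 rule itself is not given here, so the following NumPy sketch only illustrates the generic reward-modulated plasticity idea (eligibility traces gated by a reward signal); the sizes, rates, spike statistics, and reward are placeholder assumptions, not the chip implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pre, n_post = 32, 2
w = rng.normal(0.0, 0.1, size=(n_pre, n_post))   # synaptic weights
eligibility = np.zeros_like(w)
tau_e, lr = 20.0, 0.05                           # trace time constant (steps), learning rate

for step in range(200):
    pre = rng.random(n_pre) < 0.05               # random presynaptic spikes (stand-in activity)
    post = rng.random(n_post) < 0.05             # random postsynaptic spikes (stand-in activity)
    # Correlation-driven eligibility trace that decays over time.
    eligibility = eligibility * np.exp(-1.0 / tau_e) + np.outer(pre, post)
    reward = rng.choice([-1.0, 1.0])             # stand-in for the game's reward signal
    w += lr * reward * eligibility               # reward-modulated weight update
```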
A Deep Spatio-Temporal Fuzzy Neural Network for Passenger Demand Prediction
Title | A Deep Spatio-Temporal Fuzzy Neural Network for Passenger Demand Prediction |
Authors | Xiaoyuan Liang, Guiling Wang, Martin Renqiang Min, Yi Qi, Zhu Han |
Abstract | In spite of its importance, passenger demand prediction is a highly challenging problem, because the demand is simultaneously influenced by the complex interactions among many spatial and temporal factors and other external factors such as weather. To address this problem, we propose a Spatio-TEmporal Fuzzy neural Network (STEF-Net) to accurately predict passenger demands incorporating the complex interactions of all known important factors. We design an end-to-end learning framework with different neural networks modeling different factors. Specifically, we propose to capture spatio-temporal feature interactions via a convolutional long short-term memory network and model external factors via a fuzzy neural network that handles data uncertainty significantly better than deterministic methods. To keep the temporal relations when fusing two networks and emphasize discriminative spatio-temporal feature interactions, we employ a novel feature fusion method with a convolution operation and an attention layer. As far as we know, our work is the first to fuse a deep recurrent neural network and a fuzzy neural network to model complex spatial-temporal feature interactions with additional uncertain input features for predictive learning. Experiments on a large-scale real-world dataset show that our model achieves more than 10% improvement over the state-of-the-art approaches. |
Tasks | |
Published | 2019-05-13 |
URL | https://arxiv.org/abs/1905.05614v1 |
https://arxiv.org/pdf/1905.05614v1.pdf | |
PWC | https://paperswithcode.com/paper/a-deep-spatio-temporal-fuzzy-neural-network |
Repo | |
Framework | |
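STEF-Net models external factors such as weather with a fuzzy neural network. As a minimal NumPy sketch of the fuzzy-membership idea such a network builds on, the snippet below turns uncertain external inputs into Gaussian membership degrees and product-AND rule firing strengths; the feature names, number of fuzzy sets, and normalisation are illustrative assumptions, not the STEF-Net architecture.

```python
import numpy as np

def fuzzy_membership(x, centers, widths):
    """Gaussian membership degree of each input feature w.r.t. each fuzzy set."""
    # x: (n_features,), centers/widths: (n_features, n_sets)
    return np.exp(-((x[:, None] - centers) ** 2) / (widths ** 2))

rng = np.random.default_rng(0)
external = rng.normal(size=4)            # e.g. temperature, wind, humidity, holiday flag (hypothetical)
centers = rng.normal(size=(4, 3))        # 3 fuzzy sets per feature
widths = np.full((4, 3), 1.0)

membership = fuzzy_membership(external, centers, widths)
firing = membership.prod(axis=0)         # AND-style rule firing strengths
firing /= firing.sum()                   # normalised strengths fed to the rest of the network
```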
Pillar in Pillar: Multi-Scale and Dynamic Feature Extraction for 3D Object Detection in Point Clouds
Title | Pillar in Pillar: Multi-Scale and Dynamic Feature Extraction for 3D Object Detection in Point Clouds |
Authors | Yonglin Tian, Lichao Huang, Xuesong Li, Yuan Li, Zilei Wang, Fei-Yue Wang |
Abstract | Sparsity and varied density are two of the main obstacles for 3D detection networks with point clouds. In this paper, we present a multi-scale voxelization method and a decomposable dynamic convolution to solve them. We consider the misalignment problem between voxel representations with different scales and present a center-aligned voxelization strategy. Instead of separating points into individual groups, we use an overlapped partition mechanism to avoid the perception deficiency of edge points in each voxel. Based on this multi-scale voxelization, we are able to build an effective fusion network with a single top-down forward pass. To handle the variation of density in point cloud data, we propose a decomposable dynamic convolutional layer that considers the shared and dynamic components when applying convolutional filters at different positions of feature maps. By modeling bases in the kernel space, the number of parameters for generating dynamic filters is greatly reduced. With a self-learning network, we can apply dynamic convolutions to input features and deal with the variation in the feature space. We conduct experiments with our PiPNet on the KITTI dataset and achieve better results than other voxelization-based methods on the 3D detection task. |
Tasks | 3D Object Detection, Object Detection |
Published | 2019-12-10 |
URL | https://arxiv.org/abs/1912.04775v2 |
https://arxiv.org/pdf/1912.04775v2.pdf | |
PWC | https://paperswithcode.com/paper/pillar-in-pillar-multi-scale-and-dynamic |
Repo | |
Framework | |
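The decomposable dynamic convolution described above generates position-dependent filters as combinations of a small set of shared kernel bases, which is what keeps the number of parameters low. The NumPy sketch below shows that decomposition at a single feature-map position; the linear "self-learning" coefficient generator, the shapes, and the single-position application are simplifications, not the PiPNet layer.

```python
import numpy as np

rng = np.random.default_rng(0)
c_in, c_out, k, n_bases = 8, 16, 3, 4
bases = rng.normal(size=(n_bases, c_out, c_in, k, k)) * 0.1   # shared kernel bases
coeff_net = rng.normal(size=(c_in, n_bases)) * 0.1            # stand-in coefficient generator

def dynamic_kernel(local_feature):
    """Build a position-specific kernel from shared bases and input-dependent coefficients."""
    alpha = local_feature @ coeff_net               # (n_bases,) coefficients for this position
    return np.tensordot(alpha, bases, axes=1)       # (c_out, c_in, k, k) dynamic filter

feat = rng.normal(size=(c_in, 8, 8))                # one feature map
patch = feat[:, 2:5, 3:6]                           # 3x3 receptive field at one position
kernel = dynamic_kernel(feat[:, 3, 4])              # coefficients from the centre feature vector
out_vec = np.tensordot(kernel, patch, axes=([1, 2, 3], [0, 1, 2]))   # (c_out,) response there
```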
Recommender Systems Notation: Proposed Common Notation for Teaching and Research
Title | Recommender Systems Notation: Proposed Common Notation for Teaching and Research |
Authors | Michael D. Ekstrand, Joseph A. Konstan |
Abstract | As the field of recommender systems has developed, authors have used a myriad of notations for describing the mathematical workings of recommendation algorithms. These notations appear in research papers, books, lecture notes, blog posts, and software documentation. The disciplinary diversity of the field has not contributed to consistency in notation; scholars whose home base is in information retrieval have different habits and expectations than those in machine learning or human-computer interaction. In the course of years of teaching and research on recommender systems, we have seen the value in adopting a consistent notation across our work. This has been particularly highlighted in our development of the Recommender Systems MOOC on Coursera (Konstan et al. 2015), as we need to explain a wide variety of algorithms and our learners are not well-served by changing notation between algorithms. In this paper, we describe the notation we have adopted in our work, along with its justification and some discussion of considered alternatives. We present this in hope that it will be useful to others writing and teaching about recommender systems. This notation has served us well for some time now, in research, online education, and traditional classroom instruction. We feel it is ready for broad use. |
Tasks | Information Retrieval, Recommendation Systems |
Published | 2019-02-04 |
URL | http://arxiv.org/abs/1902.01348v1 |
http://arxiv.org/pdf/1902.01348v1.pdf | |
PWC | https://paperswithcode.com/paper/recommender-systems-notation-proposed-common |
Repo | |
Framework | |
An Emotional Analysis of False Information in Social Media and News Articles
Title | An Emotional Analysis of False Information in Social Media and News Articles |
Authors | Bilal Ghanem, Paolo Rosso, Francisco Rangel |
Abstract | Fake news is risky since it has been created to manipulate the readers’ opinions and beliefs. In this work, we compared the language of false news to that of real news from an emotional perspective, considering a set of false information types (propaganda, hoax, clickbait, and satire) from social media and online news article sources. Our experiments showed that false information has different emotional patterns in each of its types, and emotions play a key role in deceiving the reader. Based on that, we proposed an LSTM neural network model that is emotionally-infused to detect false news. |
Tasks | |
Published | 2019-08-26 |
URL | https://arxiv.org/abs/1908.09951v1 |
https://arxiv.org/pdf/1908.09951v1.pdf | |
PWC | https://paperswithcode.com/paper/an-emotional-analysis-of-false-information-in |
Repo | |
Framework | |
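As a minimal PyTorch sketch of what an "emotionally-infused" LSTM classifier could look like, the snippet below concatenates emotion-lexicon features with an LSTM encoding of the text before the output layer. The dimensions, the source of the emotion features, and the fusion point are assumptions, not the authors' exact model.

```python
import torch
import torch.nn as nn

class EmotionInfusedLSTM(nn.Module):
    """Toy classifier fusing an LSTM text encoding with per-document emotion features."""
    def __init__(self, vocab_size=5000, emb_dim=100, hidden=64, n_emotions=8, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden + n_emotions, n_classes)

    def forward(self, tokens, emotion_feats):
        # tokens: (batch, seq_len) int64, emotion_feats: (batch, n_emotions) float
        _, (h, _) = self.lstm(self.embed(tokens))
        fused = torch.cat([h[-1], emotion_feats], dim=1)   # concatenate last hidden state + emotions
        return self.out(fused)

model = EmotionInfusedLSTM()
logits = model(torch.randint(0, 5000, (4, 30)), torch.rand(4, 8))
```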
An Embarrassingly Simple Baseline for eXtreme Multi-label Prediction
Title | An Embarrassingly Simple Baseline for eXtreme Multi-label Prediction |
Authors | Yashaswi Verma |
Abstract | The goal of eXtreme Multi-label Learning (XML) is to design and learn a model that can automatically annotate a given data point with the most relevant subset of labels from an extremely large label set. Recently, many techniques have been proposed for XML that achieve reasonable performance on benchmark datasets. Motivated by the complexities of these methods and their subsequent training requirements, in this paper we propose a simple baseline technique for this task. Precisely, we present a global feature embedding technique for XML that can easily scale to very large datasets containing millions of data points in a very high-dimensional feature space, irrespective of the number of samples and labels. Next we show how an ensemble of such global embeddings can be used to achieve a further boost in prediction accuracy with only a linear increase in training and prediction time. During testing, we assign the labels using a weighted k-nearest neighbour classifier in the embedding space. Experiments reveal that though conceptually simple, this technique achieves quite competitive results, and has a training time of less than one minute using a single CPU core with 15.6 GB RAM even for large-scale datasets such as Amazon-3M. |
Tasks | Multi-Label Learning |
Published | 2019-12-17 |
URL | https://arxiv.org/abs/1912.08140v1 |
https://arxiv.org/pdf/1912.08140v1.pdf | |
PWC | https://paperswithcode.com/paper/an-embarrassingly-simple-baseline-for-extreme |
Repo | |
Framework | |
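The prediction step described above (a weighted k-nearest-neighbour classifier in the embedding space) can be sketched as follows; the cosine weighting, the synthetic data, and the embedding itself are placeholders, not the paper's global feature embedding.

```python
import numpy as np

def knn_multilabel_predict(query_emb, train_emb, train_labels, k=5):
    """Score labels by similarity-weighted votes of the k nearest embedded training points."""
    sims = train_emb @ query_emb / (
        np.linalg.norm(train_emb, axis=1) * np.linalg.norm(query_emb) + 1e-12)
    nn_idx = np.argsort(-sims)[:k]
    weights = sims[nn_idx].clip(min=0.0)
    return weights @ train_labels[nn_idx]          # (n_labels,) label scores

rng = np.random.default_rng(0)
train_emb = rng.normal(size=(100, 16))                          # embedded training points
train_labels = (rng.random((100, 50)) < 0.05).astype(float)     # sparse multi-label matrix
scores = knn_multilabel_predict(rng.normal(size=16), train_emb, train_labels, k=5)
top_labels = np.argsort(-scores)[:5]
```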
A Multi-Task Gradient Descent Method for Multi-Label Learning
Title | A Multi-Task Gradient Descent Method for Multi-Label Learning |
Authors | Lu Bai, Yew-Soon Ong, Tiantian He, Abhishek Gupta |
Abstract | Multi-label learning studies the problem where an instance is associated with a set of labels. By treating the single-label learning problem as one task, the multi-label learning problem can be cast as solving multiple related tasks simultaneously. In this paper, we propose a novel Multi-task Gradient Descent (MGD) algorithm to solve a group of related tasks simultaneously. In the proposed algorithm, each task minimizes its individual cost function using reformative gradient descent, where the relations among the tasks are facilitated through effectively transferring model parameter values across multiple tasks. Theoretical analysis shows that the proposed algorithm is convergent with a proper transfer mechanism. Compared with existing approaches, MGD is easy to implement, places fewer requirements on the training model, can achieve seamless asymmetric transformation such that negative transfer is mitigated, and can benefit from parallel computing when the number of tasks is large. The competitive experimental results on multi-label learning datasets validate the effectiveness of the proposed algorithm. |
Tasks | Multi-Label Learning |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.07693v2 |
https://arxiv.org/pdf/1911.07693v2.pdf | |
PWC | https://paperswithcode.com/paper/a-multi-task-gradient-descent-method-for |
Repo | |
Framework | |
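A toy NumPy sketch of the gradient-descent-with-parameter-transfer idea: each task takes its own gradient step and is additionally pulled toward the other tasks' parameters. The quadratic per-task cost and the constant transfer coefficients are illustrative assumptions, not the paper's reformative gradient descent or its convergence conditions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_tasks, dim, lr = 3, 10, 0.1
w = rng.normal(size=(n_tasks, dim))              # one parameter vector per task
transfer = np.full((n_tasks, n_tasks), 0.05)     # transfer coefficients between tasks
np.fill_diagonal(transfer, 0.0)

def grad(i, wi):
    """Placeholder per-task gradient (quadratic cost centred at the task index)."""
    return wi - i

for _ in range(100):
    new_w = w.copy()
    for i in range(n_tasks):
        pull = sum(transfer[i, j] * (w[j] - w[i]) for j in range(n_tasks))
        new_w[i] = w[i] - lr * grad(i, w[i]) + pull   # own gradient step + parameter transfer
    w = new_w
```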
Tabulated MLP for Fast Point Feature Embedding
Title | Tabulated MLP for Fast Point Feature Embedding |
Authors | Yusuke Sekikawa, Teppei Suzuki |
Abstract | Aiming at a drastic speedup for point-data embeddings at test time, we propose a new framework that pairs a multi-layer perceptron (MLP) with a look-up table (LUT) to transform point-coordinate inputs into high-dimensional features. When compared with PointNet’s feature embedding part realized by an MLP that requires millions of dot products, ours at test time requires no such layers of matrix-vector products but only a look-up of the nearest entries followed by interpolation, from the tabulated MLP defined over discrete inputs on a 3D lattice. We call this framework “LUTI-MLP: LUT Interpolation MLP”; it provides a way to train, end to end, a tabulated MLP coupled to a LUT in a specific manner without the need for any approximation at test time. LUTI-MLP also provides a significant speedup for the Jacobian computation of the embedding function w.r.t. the global pose coordinate on the Lie algebra $\mathfrak{se}(3)$ at test time, which could be used for point-set registration problems. After an extensive architectural analysis using the ModelNet40 dataset, we confirmed that our LUTI-MLP, even with a small table ($8\times 8\times 8$), yields performance comparable to that of the MLP while achieving significant speedups: $80\times$ for embedding, $12\times$ for the approximate Jacobian, and $860\times$ for the canonical Jacobian. |
Tasks | |
Published | 2019-11-23 |
URL | https://arxiv.org/abs/1912.00790v1 |
https://arxiv.org/pdf/1912.00790v1.pdf | |
PWC | https://paperswithcode.com/paper/tabulated-mlp-for-fast-point-feature |
Repo | |
Framework | |
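The test-time path described above replaces the MLP with a look-up plus interpolation over a 3D lattice. Below is a minimal NumPy sketch of trilinear interpolation over a randomly filled 8×8×8 table; in the paper the table would hold the outputs of the trained MLP, and the end-to-end coupling during training is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)
D, F = 8, 16                                  # lattice resolution and feature dimension
lut = rng.normal(size=(D, D, D, F))           # stand-in for tabulated MLP outputs on [0, 1]^3

def embed(point):
    """Trilinear interpolation of the LUT at a 3D point in [0, 1]^3 (no MLP at test time)."""
    g = np.clip(point, 0.0, 1.0) * (D - 1)
    i0 = np.floor(g).astype(int)
    i1 = np.minimum(i0 + 1, D - 1)
    t = g - i0
    out = np.zeros(F)
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                wgt = ((t[0] if dx else 1 - t[0]) *
                       (t[1] if dy else 1 - t[1]) *
                       (t[2] if dz else 1 - t[2]))
                out += wgt * lut[(i1[0] if dx else i0[0]),
                                 (i1[1] if dy else i0[1]),
                                 (i1[2] if dz else i0[2])]
    return out

feature = embed(np.array([0.3, 0.7, 0.5]))
```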
Attention Forcing for Sequence-to-sequence Model Training
Title | Attention Forcing for Sequence-to-sequence Model Training |
Authors | Qingyun Dou, Yiting Lu, Joshua Efiong, Mark J. F. Gales |
Abstract | Auto-regressive sequence-to-sequence models with attention mechanism have achieved state-of-the-art performance in many tasks such as machine translation and speech synthesis. These models can be difficult to train. The standard approach, teacher forcing, guides a model with reference output history during training. The problem is that the model is unlikely to recover from its mistakes during inference, where the reference output is replaced by generated output. Several approaches deal with this problem, largely by guiding the model with generated output history. To make training stable, these approaches often require a heuristic schedule or an auxiliary classifier. This paper introduces attention forcing, which guides the model with generated output history and reference attention. This approach can train the model to recover from its mistakes, in a stable fashion, without the need for a schedule or a classifier. In addition, it allows the model to generate output sequences aligned with the references, which can be important for cascaded systems like many speech synthesis systems. Experiments on speech synthesis show that attention forcing yields significant performance gain. Experiments on machine translation show that for tasks where various re-orderings of the output are valid, guiding the model with generated output history is challenging, while guiding the model with reference attention is beneficial. |
Tasks | Machine Translation, Speech Synthesis |
Published | 2019-09-26 |
URL | https://arxiv.org/abs/1909.12289v2 |
https://arxiv.org/pdf/1909.12289v2.pdf | |
PWC | https://paperswithcode.com/paper/attention-forcing-for-sequence-to-sequence |
Repo | |
Framework | |
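A forward-only NumPy sketch of the loss composition suggested by the abstract: the decoder is fed its own generated output history, while an extra divergence term pulls its attention toward a reference attention, alongside the usual cross-entropy on the reference token. The toy attention computation, the KL form, and the way the reference attention is obtained are stand-in assumptions, not the paper's training procedure.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
enc = rng.normal(size=(6, 8))                       # 6 encoder states, dim 8
dec_state = rng.normal(size=8)                      # decoder state after generated history

gen_attention = softmax(enc @ dec_state)            # attention in the free-running pass
ref_attention = softmax(enc @ (dec_state + 0.1))    # stand-in for the reference (teacher-forced) attention

# Attention term: pull generated attention toward the reference.
kl = np.sum(ref_attention * np.log(ref_attention / gen_attention))

# Output term: cross-entropy against the reference token, as in teacher forcing.
probs = softmax(rng.normal(size=20))
ref_token = 3
ce = -np.log(probs[ref_token])

loss = ce + kl                                      # per-step attention-forcing training loss
```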
Disentangling Speech and Non-Speech Components for Building Robust Acoustic Models from Found Data
Title | Disentangling Speech and Non-Speech Components for Building Robust Acoustic Models from Found Data |
Authors | Nishant Gurunath, Sai Krishna Rallabandi, Alan Black |
Abstract | In order to build language technologies for the majority of languages, it is important to leverage the resources available in the public domain on the internet - commonly referred to as ‘Found Data’. However, such data is characterized by the presence of non-standard, non-trivial variations. For instance, speech resources found on the internet have non-speech content, such as music. Therefore, speech recognition and speech synthesis models need to be robust to such variations. In this work, we present an analysis to show that it is important to disentangle the latent causal factors of variation in the original data to accomplish these tasks. Based on this, we present approaches to disentangle such variations from the data using Latent Stochastic Models. Specifically, we present a method to split the latent prior space into continuous representations of dominant speech modes present in the magnitude spectra of audio signals. We propose a completely unsupervised approach using multinode latent space variational autoencoders (VAE). We show that the constraints on the latent space of a VAE can in fact be used to separate speech and music, independent of the language of the speech. This paper also analytically presents the requirement on the number of latent variables for the task based on the distribution of the speech data. |
Tasks | Speech Recognition, Speech Synthesis |
Published | 2019-09-25 |
URL | https://arxiv.org/abs/1909.11727v1 |
https://arxiv.org/pdf/1909.11727v1.pdf | |
PWC | https://paperswithcode.com/paper/disentangling-speech-and-non-speech |
Repo | |
Framework | |
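The abstract above describes splitting the latent prior space of a VAE over magnitude spectra. The PyTorch sketch below only shows a plain VAE whose latent vector is partitioned into groups, with the standard reconstruction and KL terms; the spectrum dimension, group sizes, and the absence of the paper's actual latent-space constraints are all simplifying assumptions.

```python
import torch
import torch.nn as nn

class SplitLatentVAE(nn.Module):
    """Toy VAE whose latent space is split into groups (e.g. speech vs non-speech modes)."""
    def __init__(self, n_freq=257, z_per_group=8, n_groups=2):
        super().__init__()
        self.n_groups, self.z_per_group = n_groups, z_per_group
        self.z_dim = z_per_group * n_groups
        self.enc = nn.Sequential(nn.Linear(n_freq, 128), nn.ReLU())
        self.mu = nn.Linear(128, self.z_dim)
        self.logvar = nn.Linear(128, self.z_dim)
        self.dec = nn.Sequential(nn.Linear(self.z_dim, 128), nn.ReLU(), nn.Linear(128, n_freq))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterisation
        # Partition the latent vector into per-group blocks; constraints on these blocks
        # are what would push speech and non-speech content into separate groups.
        groups = z.view(-1, self.n_groups, self.z_per_group)
        return self.dec(z), mu, logvar, groups

x = torch.rand(4, 257)                     # stand-in magnitude-spectrum frames
recon, mu, logvar, groups = SplitLatentVAE()(x)
kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp()) / x.shape[0]
loss = nn.functional.mse_loss(recon, x) + kl
```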
User-based collaborative filtering approach for content recommendation in OpenCourseWare platforms
Title | User-based collaborative filtering approach for content recommendation in OpenCourseWare platforms |
Authors | Nikola Tomasevic, Dejan Paunovic, Sanja Vranes |
Abstract | A content recommender system or a recommendation system represents a subclass of information filtering systems which seeks to predict the user preferences, i.e. the content that would most likely be positively “rated” by the user. Nowadays, the recommender systems of OpenCourseWare (OCW) platforms typically generate a list of recommendations in one of two ways, i.e. through content-based filtering or user-based collaborative filtering (CF). In this paper, the conceptual idea of a content recommendation module is provided, which is capable of proposing related decks (presentations, educational material, etc.) to the user, taking into account past user activities, preferences, type and content similarity, etc. It particularly analyses suitable techniques for implementation of the user-based CF approach and user-related features that are relevant for content evaluation. The proposed approach also envisages a hybrid recommendation system as a combination of user-based and content-based approaches in order to provide a holistic and efficient solution for content recommendation. Finally, for evaluation and testing purposes, a designated content recommendation module was implemented as part of the SlideWiki authoring OCW platform. |
Tasks | Recommendation Systems |
Published | 2019-02-27 |
URL | http://arxiv.org/abs/1902.10376v1 |
http://arxiv.org/pdf/1902.10376v1.pdf | |
PWC | https://paperswithcode.com/paper/user-based-collaborative-filtering-approach |
Repo | |
Framework | |
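A compact NumPy sketch of plain user-based collaborative filtering (cosine similarity between users, similarity-weighted average of neighbour ratings). The random rating matrix and neighbourhood size are placeholders; the SlideWiki module described above additionally mixes in content-based and user-activity signals.

```python
import numpy as np

rng = np.random.default_rng(0)
ratings = rng.integers(0, 6, size=(20, 30)).astype(float)   # users x items, 0 = not rated

def predict(user, item, k=5):
    """Predict a rating as the similarity-weighted mean of the k most similar raters of the item."""
    sims = np.array([
        np.dot(ratings[user], ratings[u]) /
        (np.linalg.norm(ratings[user]) * np.linalg.norm(ratings[u]) + 1e-12)
        for u in range(ratings.shape[0])])
    sims[user] = -1.0                                         # exclude the target user
    neighbours = [u for u in np.argsort(-sims) if ratings[u, item] > 0][:k]
    if not neighbours:
        return ratings[ratings > 0].mean()                    # fall back to the global mean
    w = sims[neighbours]
    return float(w @ ratings[neighbours, item] / (w.sum() + 1e-12))

print(predict(user=0, item=3))
```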
Field Label Prediction for Autofill in Web Browsers
Title | Field Label Prediction for Autofill in Web Browsers |
Authors | Joy Bose |
Abstract | Automatic form fill is an important productivity related feature present in major web browsers, which predicts the field labels of a web form and automatically fills values in a new form based on the values previously filled for the same field in other forms. This feature increases the convenience and efficiency of users who have to fill similar information in fields in multiple forms. In this paper we describe a machine learning solution for predicting the form field labels, implemented as a web service using Azure ML Studio. |
Tasks | |
Published | 2019-12-17 |
URL | https://arxiv.org/abs/1912.08809v1 |
https://arxiv.org/pdf/1912.08809v1.pdf | |
PWC | https://paperswithcode.com/paper/field-label-prediction-for-autofill-in-web |
Repo | |
Framework | |
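The paper implements its field-label predictor as an Azure ML Studio web service; the scikit-learn pipeline below is only a generic stand-in showing the same kind of text-to-label classification on form-field attribute strings. The training examples, labels, and feature choices are hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training data: form-field attribute strings and their canonical labels.
fields = ["first_name fname given name", "email e-mail address",
          "zip postal code", "phone tel mobile"]
labels = ["first_name", "email", "postal_code", "phone"]

# Character n-grams cope with the abbreviations and underscores common in field names.
clf = make_pipeline(TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
                    LogisticRegression())
clf.fit(fields, labels)
print(clf.predict(["enter your e mail"]))
```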
Boosting Resolution and Recovering Texture of micro-CT Images with Deep Learning
Title | Boosting Resolution and Recovering Texture of micro-CT Images with Deep Learning |
Authors | Ying Da Wang, Ryan T. Armstrong, Peyman Mostaghimi |
Abstract | Digital Rock Imaging is constrained by detector hardware, and a trade-off between the image field of view (FOV) and the image resolution must be made. This can be compensated for with super resolution (SR) techniques that take a wide-FOV, low resolution (LR) image and super resolve a high resolution (HR), high-FOV image. The Enhanced Deep Super Resolution Generative Adversarial Network (EDSRGAN) is trained on the Deep Learning Digital Rock Super Resolution Dataset, a diverse compilation of 12,000 raw and processed uCT images. The network achieves a 50% to 70% reduction in relative error over bicubic interpolation. GAN performance in recovering texture shows superior visual similarity compared to SRCNN and other methods. Difference maps indicate that the SRCNN section of the SRGAN network recovers large-scale edge features (grain boundaries) while the GAN network regenerates perceptually indistinguishable high-frequency texture. Network performance is generalised with augmentation, showing high adaptability to noise and blur. HR images are fed into the network, generating HR-SR images, to extrapolate network performance to sub-resolution features present in the HR images themselves. Results show that under-resolution features such as dissolved minerals and thin fractures are regenerated despite the network operating outside of its trained specifications. Comparison with Scanning Electron Microscope images shows that details are consistent with the underlying geometry of the sample. Recovery of textures benefits the characterisation of digital rocks with a high proportion of under-resolution micro-porous features, such as carbonate and coal samples. Images that are normally constrained by the mineralogy of the rock (coal), by fast transient imaging (waterflooding), or by the energy of the source (microporosity), can be super resolved accurately for further analysis downstream. |
Tasks | Super-Resolution |
Published | 2019-07-15 |
URL | https://arxiv.org/abs/1907.07131v3 |
https://arxiv.org/pdf/1907.07131v3.pdf | |
PWC | https://paperswithcode.com/paper/boosting-resolution-and-recovering-texture-of |
Repo | |
Framework | |
Revenue allocation in Formula One: a pairwise comparison approach
Title | Revenue allocation in Formula One: a pairwise comparison approach |
Authors | Dóra Gréta Petróczy, László Csató |
Abstract | A model is proposed to allocate Formula One World Championship prize money among the constructors. The methodology is based on pairwise comparison matrices, allows for the use of any weighting method, and makes it possible to tune the level of inequality. We introduce an axiom called scale invariance, which requires the ranking of the teams to be independent of the parameter controlling inequality. The eigenvector method is revealed to violate this condition in our dataset, while the row geometric mean method always satisfies it. The revenue allocation is not influenced by the arbitrary valuation given to the race prizes in the official points scoring system of Formula One and takes the intensity of pairwise preferences into account, contrary to the standard Condorcet method. Our suggestion can be used to share revenues among groups when group members are ranked several times. |
Tasks | |
Published | 2019-09-25 |
URL | https://arxiv.org/abs/1909.12931v2 |
https://arxiv.org/pdf/1909.12931v2.pdf | |
PWC | https://paperswithcode.com/paper/a-revenue-allocation-scheme-based-on-pairwise |
Repo | |
Framework | |
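The row geometric mean method that the abstract singles out is easy to state: take the geometric mean of each row of the pairwise comparison matrix and normalise. The matrix below is a made-up 4-team example, not the paper's Formula One data, and the parameter that tunes inequality is omitted.

```python
import numpy as np

# Hypothetical pairwise comparison matrix A, where A[i, j] > 1 means team i outperformed team j.
A = np.array([
    [1.0, 2.0, 4.0, 6.0],
    [1/2, 1.0, 3.0, 4.0],
    [1/4, 1/3, 1.0, 2.0],
    [1/6, 1/4, 1/2, 1.0],
])

gm = np.prod(A, axis=1) ** (1.0 / A.shape[0])   # row geometric means
shares = gm / gm.sum()                          # normalised revenue shares
print(shares)
```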
Automatic Design of Artificial Neural Networks for Gamma-Ray Detection
Title | Automatic Design of Artificial Neural Networks for Gamma-Ray Detection |
Authors | Filipe Assunção, João Correia, Rúben Conceição, Mário Pimenta, Bernardo Tomé, Nuno Lourenço, Penousal Machado |
Abstract | The goal of this work is to investigate the possibility of improving current gamma/hadron discrimination based on their shower patterns recorded on the ground. To this end we propose the use of Convolutional Neural Networks (CNNs) for their ability to distinguish patterns based on automatically designed features. In order to promote the creation of CNNs that properly uncover the hidden patterns in the data, and at the same time avoid the burden of hand-crafting the topology and learning hyper-parameters, we resort to NeuroEvolution; in particular, we use Fast-DENSER++, a variant of Deep Evolutionary Network Structured Representation. The results show that the best CNN generated by Fast-DENSER++ improves by a factor of 2 when compared with the results reported by classic statistical approaches. Additionally, we experiment with ensembling the 10 best generated CNNs, one from each of the evolutionary runs; the ensemble leads to an improvement by a factor of 2.3. These results show that it is possible to improve the gamma/hadron discrimination based on CNNs that are automatically generated and are trained with instances of the ground impact patterns. |
Tasks | |
Published | 2019-05-09 |
URL | https://arxiv.org/abs/1905.03532v1 |
https://arxiv.org/pdf/1905.03532v1.pdf | |
PWC | https://paperswithcode.com/paper/190503532 |
Repo | |
Framework | |
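Fast-DENSER++ evolves full CNN topologies and hyper-parameters from a grammar-based representation; as a stand-in, the sketch below shows only the bare skeleton of such an evolutionary search over a toy two-gene genome with a placeholder fitness, to illustrate the select-and-mutate loop rather than the actual method.

```python
import random

def fitness(genome):
    """Placeholder fitness: stands in for training the CNN encoded by the genome and scoring it."""
    n_layers, n_filters = genome
    return -(n_layers - 4) ** 2 - (n_filters - 32) ** 2 / 64.0

def mutate(genome):
    n_layers, n_filters = genome
    return (max(1, n_layers + random.choice([-1, 0, 1])),
            max(8, n_filters + random.choice([-8, 0, 8])))

random.seed(0)
population = [(random.randint(1, 8), random.choice([8, 16, 32, 64])) for _ in range(10)]
for generation in range(20):
    population.sort(key=fitness, reverse=True)
    parents = population[:3]                                   # keep the fittest candidates
    population = parents + [mutate(random.choice(parents)) for _ in range(7)]
best = max(population, key=fitness)
```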