Paper Group ANR 185
A Survey of Domain Adaptation for Neural Machine Translation
Title | A Survey of Domain Adaptation for Neural Machine Translation |
Authors | Chenhui Chu, Rui Wang |
Abstract | Neural machine translation (NMT) is a deep learning based approach for machine translation, which yields state-of-the-art translation performance in scenarios where large-scale parallel corpora are available. Although high-quality, domain-specific translation is crucial in the real world, domain-specific corpora are usually scarce or nonexistent, and thus vanilla NMT performs poorly in such scenarios. Domain adaptation, which leverages both out-of-domain parallel corpora and monolingual corpora for in-domain translation, is therefore very important for domain-specific translation. In this paper, we give a comprehensive survey of the state-of-the-art domain adaptation techniques for NMT. |
Tasks | Domain Adaptation, Machine Translation |
Published | 2018-06-01 |
URL | http://arxiv.org/abs/1806.00258v1 |
http://arxiv.org/pdf/1806.00258v1.pdf | |
PWC | https://paperswithcode.com/paper/a-survey-of-domain-adaptation-for-neural |
Repo | |
Framework | |
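One family of techniques such surveys cover is data-centric adaptation, e.g. continuing training on in-domain data that is oversampled and mixed with out-of-domain data. A minimal sketch of the data side of that recipe (the oversampling ratio, batch size, and toy corpora below are illustrative assumptions, not values from the survey):

```python
import random

def mixed_finetune_batches(in_domain, out_domain, oversample=4, batch_size=2, seed=0):
    """Oversample the scarce in-domain corpus, mix it with the larger
    out-of-domain corpus, and shuffle the pool into training batches."""
    rng = random.Random(seed)
    pool = in_domain * oversample + out_domain
    rng.shuffle(pool)
    return [pool[i:i + batch_size] for i in range(0, len(pool), batch_size)]

in_dom = [("med src", "med tgt")]                                    # tiny in-domain corpus
out_dom = [("news src %d" % i, "news tgt %d" % i) for i in range(4)]  # larger out-of-domain corpus
batches = mixed_finetune_batches(in_dom, out_dom)
```

With one in-domain pair oversampled 4x and four out-of-domain pairs, the stream contains eight pairs split into four batches, so the scarce domain contributes half of the training signal.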
A Neural Network Aided Approach for LDPC Coded DCO-OFDM with Clipping Distortion
Title | A Neural Network Aided Approach for LDPC Coded DCO-OFDM with Clipping Distortion |
Authors | Yuan He, Ming Jiang, Chunming Zhao |
Abstract | In this paper, a neural network-aided bit-interleaved coded modulation (NN-BICM) receiver is designed to mitigate the nonlinear clipping distortion in LDPC coded direct current-biased optical orthogonal frequency division multiplexing (DCO-OFDM) systems. Taking cross-entropy as the loss function, a feed-forward network is trained by the backpropagation algorithm to output conditional probabilities through the softmax activation function, thereby assisting a modified log-likelihood ratio (LLR) computation. To reduce complexity, this feed-forward network simplifies the input layer to a single symbol and the corresponding Gaussian variance instead of modelling the inter-carrier interference between multiple subcarriers. On the basis of the neural network-aided BICM with Gray labelling, we propose a novel stacked network architecture for bit-interleaved coded modulation with iterative decoding (NN-BICM-ID). Its performance is improved further by computing the conditional probability with the aid of a priori probabilities derived from the extrinsic LLRs of the LDPC decoder at the last iteration, at the expense of training a separate neural network detector for each iteration. Using the optimal DC bias as the midpoint of the dynamic region, simulation results demonstrate that both the NN-BICM and NN-BICM-ID schemes achieve noticeable performance gains over their counterparts, with NN-BICM-ID clearly outperforming NN-BICM under various modulation and coding schemes. |
Tasks | |
Published | 2018-09-04 |
URL | http://arxiv.org/abs/1809.01022v1 |
http://arxiv.org/pdf/1809.01022v1.pdf | |
PWC | https://paperswithcode.com/paper/a-neural-network-aided-approach-for-ldpc |
Repo | |
Framework | |
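The bit-LLR step at the heart of such a receiver can be sketched as follows, treating the network's softmax output as a per-symbol posterior. The 4-point Gray labelling and the probability values are illustrative assumptions, not taken from the paper:

```python
import math

# illustrative Gray labelling for a 4-point constellation: symbol index -> bit pair
GRAY = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}

def bit_llrs(sym_probs):
    """Convert per-symbol posteriors (e.g. a softmax output) into per-bit
    LLRs: LLR_i = log( P(b_i = 0 | y) / P(b_i = 1 | y) )."""
    llrs = []
    for i in range(2):  # two bits per symbol
        p0 = sum(p for s, p in enumerate(sym_probs) if GRAY[s][i] == 0)
        p1 = sum(p for s, p in enumerate(sym_probs) if GRAY[s][i] == 1)
        llrs.append(math.log(p0 / p1))
    return llrs

llrs = bit_llrs([0.7, 0.1, 0.1, 0.1])  # posterior confident that symbol 0 was sent
```

Since symbol 0 carries bits (0, 0), both LLRs come out strongly positive, which is what the LDPC decoder would then consume.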
Regularizing Deep Networks by Modeling and Predicting Label Structure
Title | Regularizing Deep Networks by Modeling and Predicting Label Structure |
Authors | Mohammadreza Mostajabi, Michael Maire, Gregory Shakhnarovich |
Abstract | We construct custom regularization functions for use in supervised training of deep neural networks. Our technique is applicable when the ground-truth labels themselves exhibit internal structure; we derive a regularizer by learning an autoencoder over the set of annotations. Training thereby becomes a two-phase procedure. The first phase models labels with an autoencoder. The second phase trains the actual network of interest by attaching an auxiliary branch that must predict output via a hidden layer of the autoencoder. After training, we discard this auxiliary branch. We experiment in the context of semantic segmentation, demonstrating this regularization strategy leads to consistent accuracy boosts over baselines, both when training from scratch, or in combination with ImageNet pretraining. Gains are also consistent over different choices of convolutional network architecture. As our regularizer is discarded after training, our method has zero cost at test time; the performance improvements are essentially free. We are simply able to learn better network weights by building an abstract model of the label space, and then training the network to understand this abstraction alongside the original task. |
Tasks | Semantic Segmentation |
Published | 2018-04-05 |
URL | http://arxiv.org/abs/1804.02009v1 |
http://arxiv.org/pdf/1804.02009v1.pdf | |
PWC | https://paperswithcode.com/paper/regularizing-deep-networks-by-modeling-and |
Repo | |
Framework | |
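The two-phase structure above boils down to adding an auxiliary regression term to the task loss during phase two. A rough sketch of that combination (the 0.1 weight and the toy vectors are placeholders, not values from the paper):

```python
def regularized_loss(task_loss, aux_pred, label_code, weight=0.1):
    """Total training loss: the usual task loss plus an auxiliary term that
    forces a side branch to regress the frozen label-autoencoder's hidden
    code of the ground-truth annotation."""
    aux_loss = sum((a - c) ** 2 for a, c in zip(aux_pred, label_code))
    return task_loss + weight * aux_loss

loss = regularized_loss(1.0, [0.5, 1.5], [1.0, 1.0])
```

At test time the auxiliary branch is discarded, so only the task head remains and the regularizer costs nothing at inference.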
InceptB: A CNN Based Classification Approach for Recognizing Traditional Bengali Games
Title | InceptB: A CNN Based Classification Approach for Recognizing Traditional Bengali Games |
Authors | Mohammad Shakirul Islam, Ferdouse Ahmed Foysal, Nafis Neehal, Enamul Karim, Syed Akhter Hossain |
Abstract | Sports activities are an integral part of our day-to-day life. Introducing autonomous decision making and predictive models to recognize and analyze different sports events and activities has become an emerging trend in the computer vision arena. Despite the advances and vivid applications of artificial intelligence and computer vision in recognizing popular Western games, very little effort has been devoted to applying computer vision to traditional Bangladeshi games. In this paper, we describe a novel deep learning based approach for recognizing traditional Bengali games. We retrain the final layer of the renowned Inception V3 architecture developed by Google for our classification approach. Our approach shows promising results, with an average accuracy of approximately 80% in correctly recognizing among 5 traditional Bangladeshi sports events. |
Tasks | Decision Making |
Published | 2018-05-03 |
URL | http://arxiv.org/abs/1805.01442v2 |
http://arxiv.org/pdf/1805.01442v2.pdf | |
PWC | https://paperswithcode.com/paper/inceptb-a-cnn-based-classification-approach |
Repo | |
Framework | |
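Retraining only the final layer amounts to fitting a softmax classifier on frozen backbone features. A self-contained sketch of that idea on toy 2-D "features" (the data, learning rate, and epoch count are illustrative; the paper uses Inception V3 features, not these):

```python
import math

def train_last_layer(feats, labels, classes=2, lr=0.5, epochs=200):
    """Fit a softmax output layer by SGD on frozen feature vectors,
    mimicking last-layer retraining of a pretrained backbone."""
    dim = len(feats[0])
    W = [[0.0] * dim for _ in range(classes)]
    for _ in range(epochs):
        for x, y in zip(feats, labels):
            logits = [sum(wi * xi for wi, xi in zip(w, x)) for w in W]
            m = max(logits)
            exps = [math.exp(l - m) for l in logits]   # stable softmax
            Z = sum(exps)
            for c in range(classes):
                grad = exps[c] / Z - (1.0 if c == y else 0.0)
                for j in range(dim):
                    W[c][j] -= lr * grad * x[j]
    return W

def predict(W, x):
    logits = [sum(wi * xi for wi, xi in zip(w, x)) for w in W]
    return logits.index(max(logits))

feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]  # stand-in backbone outputs
labels = [0, 0, 1, 1]
W = train_last_layer(feats, labels)
```

Because the backbone stays frozen, only the small weight matrix `W` is learned, which is why this transfer recipe works with few labelled examples.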
Hallucinating Agnostic Images to Generalize Across Domains
Title | Hallucinating Agnostic Images to Generalize Across Domains |
Authors | Fabio M. Carlucci, Paolo Russo, Tatiana Tommasi, Barbara Caputo |
Abstract | The ability to generalize across visual domains is crucial for the robustness of artificial recognition systems. Although many training sources may be available in real contexts, access to even unlabeled target samples cannot be taken for granted, which makes standard unsupervised domain adaptation methods inapplicable in the wild. In this work we investigate how to exploit multiple sources by hallucinating a deep visual domain composed of images, possibly unrealistic, able to maintain categorical knowledge while discarding specific source styles. The produced agnostic images are the result of a deep architecture that applies pixel adaptation to the original source data, guided by two adversarial domain classifier branches at the image and feature levels. Our approach is conceived to learn only from source data, but it seamlessly extends to the use of unlabeled target samples. Remarkable results for both multi-source domain adaptation and domain generalization support the power of hallucinating agnostic images in this framework. |
Tasks | Domain Adaptation, Domain Generalization, Unsupervised Domain Adaptation |
Published | 2018-08-03 |
URL | https://arxiv.org/abs/1808.01102v2 |
https://arxiv.org/pdf/1808.01102v2.pdf | |
PWC | https://paperswithcode.com/paper/agnostic-domain-generalization |
Repo | |
Framework | |
Neural Dynamic Programming for Musical Self Similarity
Title | Neural Dynamic Programming for Musical Self Similarity |
Authors | Christian J. Walder, Dongwoo Kim |
Abstract | We present a neural sequence model designed specifically for symbolic music. The model is based on a learned edit distance mechanism which generalises a classic recursion from computer science, leading to a neural dynamic program. Repeated motifs are detected by learning the transformations between them. We represent the arising computational dependencies using a novel data structure, the edit tree; this perspective suggests natural approximations which afford the scaling up of our otherwise cubic-time algorithm. We demonstrate our model on real and synthetic data; in all cases it outperforms a strong stacked long short-term memory benchmark. |
Tasks | |
Published | 2018-02-09 |
URL | http://arxiv.org/abs/1802.03144v3 |
http://arxiv.org/pdf/1802.03144v3.pdf | |
PWC | https://paperswithcode.com/paper/neural-dynamic-programming-for-musical-self |
Repo | |
Framework | |
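The classic recursion the abstract refers to is the dynamic-programming edit distance; the paper's contribution is to replace its fixed unit costs with learned transformations. The underlying recursion, for reference:

```python
def edit_distance(a, b):
    """Classic DP edit distance: dp[i][j] is the minimum number of insert,
    delete, or substitute operations turning a[:i] into b[:j]."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i                                # delete everything
    for j in range(n + 1):
        dp[0][j] = j                                # insert everything
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # delete a[i-1]
                           dp[i][j - 1] + 1,        # insert b[j-1]
                           dp[i - 1][j - 1] + cost) # match / substitute
    return dp[m][n]
```

The table fill is O(mn); the "otherwise cubic time" in the abstract comes from running such a comparison against many candidate motifs, which is what the edit-tree approximations cut down.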
Gold Seeker: Information Gain from Policy Distributions for Goal-oriented Vision-and-Language Reasoning
Title | Gold Seeker: Information Gain from Policy Distributions for Goal-oriented Vision-and-Language Reasoning |
Authors | Ehsan Abbasnejad, Iman Abbasnejad, Qi Wu, Javen Shi, Anton van den Hengel |
Abstract | As Computer Vision moves from a passive analysis of pixels to active analysis of semantics, the breadth of information algorithms need to reason over has expanded significantly. One of the key challenges in this vein is the ability to identify the information required to make a decision, and select an action that will recover it. We propose a reinforcement-learning approach that maintains a distribution over its internal information, thus explicitly representing the ambiguity in what it knows, and needs to know, towards achieving its goal. Potential actions are then generated according to this distribution. For each potential action a distribution of the expected outcomes is calculated, and the value of the potential information gain assessed. The action taken is that which maximizes the potential information gain. We demonstrate this approach applied to two vision-and-language problems that have attracted significant recent interest, visual dialog and visual query generation. In both cases, the method actively selects actions that will best reduce its internal uncertainty and outperforms its competitors in achieving the goal of the challenge. |
Tasks | Visual Dialog |
Published | 2018-12-16 |
URL | https://arxiv.org/abs/1812.06398v3 |
https://arxiv.org/pdf/1812.06398v3.pdf | |
PWC | https://paperswithcode.com/paper/an-active-information-seeking-model-for-goal |
Repo | |
Framework | |
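The action-selection rule described above, maximizing expected information gain, can be sketched with a toy two-hypothesis belief. Everything below (the action names, outcome models, and probabilities) is a made-up illustration of the criterion, not the paper's model:

```python
import math

def entropy(p):
    """Shannon entropy (nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)

def expected_info_gain(prior, outcome_posteriors, outcome_probs):
    """Expected entropy reduction of the internal belief after acting:
    H(prior) - E_outcome[ H(posterior | outcome) ]."""
    exp_post = sum(q * entropy(post)
                   for q, post in zip(outcome_probs, outcome_posteriors))
    return entropy(prior) - exp_post

def best_action(prior, actions):
    """actions: name -> (list of posteriors per outcome, outcome probs)."""
    return max(actions, key=lambda a: expected_info_gain(prior, *actions[a]))

prior = [0.5, 0.5]
actions = {
    "informative":   ([[0.9, 0.1], [0.1, 0.9]], [0.5, 0.5]),  # sharpens the belief
    "uninformative": ([[0.5, 0.5], [0.5, 0.5]], [0.5, 0.5]),  # leaves it unchanged
}
choice = best_action(prior, actions)
```

The uninformative action leaves the posterior equal to the prior, so its gain is exactly zero, and the agent picks the question that actually reduces its uncertainty.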
Non-asymptotic bounds for percentiles of independent non-identical random variables
Title | Non-asymptotic bounds for percentiles of independent non-identical random variables |
Authors | Dong Xia |
Abstract | This note displays an interesting phenomenon for percentiles of independent but non-identical random variables. Let $X_1,\cdots,X_n$ be independent random variables obeying non-identical continuous distributions and $X^{(1)}\geq \cdots\geq X^{(n)}$ be the corresponding order statistics. For any $p\in(0,1)$, we investigate the $100(1-p)$%-th percentile $X^{(pn)}$ and prove non-asymptotic bounds for $X^{(pn)}$. In particular, for a wide class of distributions, we discover an intriguing connection between their median and the harmonic mean of the associated standard deviations. For example, if $X_k\sim\mathcal{N}(0,\sigma_k^2)$ for $k=1,\cdots,n$ and $p=\frac{1}{2}$, we show that $\big\lvert{\rm Med}\big(X_1,\cdots,X_n\big)\big\rvert = O_P\Big(n^{1/2}\cdot\big(\sum_{k=1}^n\sigma_k^{-1}\big)^{-1}\Big)$ as long as $\{\sigma_k\}_{k=1}^n$ satisfy a certain mild non-dispersion property. |
Tasks | |
Published | 2018-08-24 |
URL | http://arxiv.org/abs/1808.07997v2 |
http://arxiv.org/pdf/1808.07997v2.pdf | |
PWC | https://paperswithcode.com/paper/non-asymptotic-bounds-for-percentiles-of |
Repo | |
Framework | |
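The claimed $O_P$ scale is easy to probe numerically. The sketch below computes the bound $n^{1/2}\cdot(\sum_k \sigma_k^{-1})^{-1}$ and checks a simulated median of heteroscedastic Gaussians against it; the particular $\sigma_k$ sequence and seed are illustrative choices, not from the note:

```python
import random
import statistics

def median_scale_bound(sigmas):
    """n^{1/2} * (sum_k 1/sigma_k)^{-1}: the scale of the median claimed
    in the abstract.  Note (sum 1/sigma_k)/n is the harmonic-mean link."""
    n = len(sigmas)
    return n ** 0.5 / sum(1.0 / s for s in sigmas)

rng = random.Random(42)
sigmas = [0.5 + 0.1 * k for k in range(101)]          # mildly non-dispersed spreads
med = statistics.median(rng.gauss(0.0, s) for s in sigmas)
scale = median_scale_bound(sigmas)
```

A sanity check: with identical spreads $\sigma_k=\sigma$ the bound collapses to $\sigma/\sqrt{n}$, the classical rate for the sample median of i.i.d. Gaussians.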
Faster Support Vector Machines
Title | Faster Support Vector Machines |
Authors | Sebastian Schlag, Matthias Schmitt, Christian Schulz |
Abstract | The time complexity of support vector machines (SVMs) prohibits training on huge data sets with millions of data points. Recently, multilevel approaches to train SVMs have been developed to allow for time-efficient training on huge data sets. While regular SVMs perform the entire training in one time-consuming optimization step, multilevel SVMs first build a hierarchy of problems decreasing in size that resemble the original problem, and then train an SVM model for each hierarchy level, benefiting from the solved models of previous levels. We present a faster multilevel support vector machine that uses a label propagation algorithm to construct the problem hierarchy. Extensive experiments indicate that our approach is up to orders of magnitude faster than the previous fastest algorithm while having comparable classification quality. For example, one of our sequential solvers is already, on average, a factor of 15 faster than the parallel ThunderSVM algorithm, while having similar classification quality. |
Tasks | |
Published | 2018-08-20 |
URL | https://arxiv.org/abs/1808.06394v3 |
https://arxiv.org/pdf/1808.06394v3.pdf | |
PWC | https://paperswithcode.com/paper/faster-support-vector-machines |
Repo | |
Framework | |
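The label propagation step used to build the problem hierarchy can be sketched as follows: each node repeatedly adopts the majority label among its neighbours until no node changes, and the resulting clusters become the coarser problem. The tie-breaking rule and toy graph are illustrative assumptions, not the paper's exact algorithm:

```python
def label_propagation(adj, labels, rounds=10):
    """Cluster nodes for one coarsening level: each node adopts the
    majority label among its neighbours (ties broken toward the smallest
    label), keeping its own label when no strict majority beats it."""
    labels = dict(labels)
    for _ in range(rounds):
        changed = False
        for v in adj:
            counts = {}
            for u in adj[v]:
                counts[labels[u]] = counts.get(labels[u], 0) + 1
            best = max(counts, key=lambda l: (counts[l], -l))
            if counts[best] > counts.get(labels[v], 0):
                labels[v] = best
                changed = True
        if not changed:
            break
    return labels

# two disconnected triangles; each collapses onto a single cluster label
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
labels = label_propagation(adj, {v: v for v in adj})
```

Each cluster then becomes one point in the next-coarser SVM problem, which is how the hierarchy shrinks the training set level by level.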
Analysis of Bag-of-n-grams Representation’s Properties Based on Textual Reconstruction
Title | Analysis of Bag-of-n-grams Representation’s Properties Based on Textual Reconstruction |
Authors | Qi Huang, Zhanghao Chen, Zijie Lu, Yuan Ye |
Abstract | Despite its simplicity, the bag-of-n-grams sentence representation has been found to excel in some NLP tasks. However, it has not received much attention in recent years, and further analysis of its properties is necessary. We propose a framework to investigate the amount and type of information captured in a general-purpose bag-of-n-grams sentence representation. We first use sentence reconstruction as a tool to obtain a bag-of-n-grams representation that contains general information about the sentence. We then run prediction tasks (sentence length, word content, phrase content and word order) using the obtained representation to look into the specific type of information captured in the representation. Our analysis demonstrates that the bag-of-n-grams representation does contain sentence-structure-level information. However, incorporating n-grams of higher order n empirically helps little with encoding more information in general, except for phrase content information. |
Tasks | |
Published | 2018-09-18 |
URL | http://arxiv.org/abs/1809.06502v1 |
http://arxiv.org/pdf/1809.06502v1.pdf | |
PWC | https://paperswithcode.com/paper/analysis-of-bag-of-n-grams-representations |
Repo | |
Framework | |
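A bag-of-n-grams representation and two of the probing tasks mentioned above (sentence length and word content) are straightforward to reproduce on a toy sentence; this is a generic sketch of the representation itself, not the paper's learned reconstruction pipeline:

```python
from collections import Counter

def bag_of_ngrams(tokens, n_max=2):
    """Bag-of-n-grams: counts of every contiguous n-gram, n = 1..n_max.
    Order within the bag is discarded; counts are kept."""
    bag = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            bag[tuple(tokens[i:i + n])] += 1
    return bag

sent = "the cat sat on the mat".split()
bag = bag_of_ngrams(sent)

# probing tasks: sentence length and word content are recoverable
length = sum(c for g, c in bag.items() if len(g) == 1)
has_cat = ("cat",) in bag
```

The bigram entries are what carry the partial word-order signal the abstract refers to: `("the", "cat")` is in the bag while `("cat", "the")` is not, even though the unigram bag alone cannot tell the two orders apart.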
Neural Network Detection of Data Sequences in Communication Systems
Title | Neural Network Detection of Data Sequences in Communication Systems |
Authors | Nariman Farsad, Andrea Goldsmith |
Abstract | We consider detection based on deep learning, and show it is possible to train detectors that perform well without any knowledge of the underlying channel models. Moreover, when the channel model is known, we demonstrate that it is possible to train detectors that do not require channel state information (CSI). In particular, a technique we call the sliding bidirectional recurrent neural network (SBRNN) is proposed for detection where, after training, the detector estimates the data in real time as the signal stream arrives at the receiver. We evaluate this algorithm, as well as other neural network (NN) architectures, using the Poisson channel model, which is applicable to both optical and molecular communication systems. We also evaluate the performance of this detection method applied to data sent over a molecular communication platform, where the channel is difficult to model analytically. We show that the SBRNN is computationally efficient and can perform detection under various channel conditions without knowing the underlying channel model. We also demonstrate that the bit error rate (BER) performance of the proposed SBRNN detector is better than that of a Viterbi detector with imperfect CSI, as well as that of other previously proposed NN detectors. Finally, we show that the SBRNN can perform well in rapidly changing channels, where the coherence time is on the order of a single symbol duration. |
Tasks | |
Published | 2018-01-31 |
URL | http://arxiv.org/abs/1802.02046v3 |
http://arxiv.org/pdf/1802.02046v3.pdf | |
PWC | https://paperswithcode.com/paper/neural-network-detection-of-data-sequences-in |
Repo | |
Framework | |
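The "sliding" part of the SBRNN can be sketched independently of the network itself: run a fixed-length detector over every window of the stream and average the per-symbol estimates from all windows covering each symbol. The thresholding "detector" below is a stand-in for the trained BRNN; the stream values and window length are illustrative:

```python
def sliding_detect(stream, window, detect):
    """Estimate each symbol by averaging the estimates from every
    length-`window` segment that covers it, in the spirit of the SBRNN's
    real-time sliding detection."""
    n = len(stream)
    est, cov = [0.0] * n, [0] * n
    for start in range(n - window + 1):
        for k, p in enumerate(detect(stream[start:start + window])):
            est[start + k] += p
            cov[start + k] += 1
    return [e / c for e, c in zip(est, cov)]

# stand-in "detector": thresholds each received sample into P(bit = 1)
threshold = lambda seg: [1.0 if x > 0.5 else 0.0 for x in seg]
probs = sliding_detect([0.9, 0.1, 0.8, 0.2, 0.7], 3, threshold)
```

Because a decision for symbol k can be emitted as soon as the first window covering it has been processed (and refined as later windows arrive), the scheme supports the streaming operation the abstract describes.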
Are All Experts Equally Good? A Study of Analyst Earnings Estimates
Title | Are All Experts Equally Good? A Study of Analyst Earnings Estimates |
Authors | Amir Ban, Yishay Mansour |
Abstract | We investigate whether experts possess differential expertise when making predictions. We note that this would make it possible to aggregate multiple predictions into a result that is more accurate than their consensus average, and that the improvement prospects grow with the amount of differentiation. Turning this argument on its head, we show how differentiation can be measured by how much weighted aggregation improves on simple averaging. Taking stock-market analysts as experts in their domain, we do a retrospective study using historical quarterly earnings forecasts and actual results for large publicly traded companies. We use it to shed new light on the Sinha et al. (1997) result, showing that analysts indeed possess individual expertise, but that their differentiation is modest. On the other hand, they have significant individual bias. Together, these enable a 20%-30% accuracy improvement over consensus average. |
Tasks | |
Published | 2018-05-13 |
URL | http://arxiv.org/abs/1806.06654v1 |
http://arxiv.org/pdf/1806.06654v1.pdf | |
PWC | https://paperswithcode.com/paper/are-all-experts-equally-good-a-study-of |
Repo | |
Framework | |
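The core idea, that weighted aggregation beats the consensus average exactly when experts are differentiated, can be sketched with inverse-error weights. The weighting scheme, forecasts, and error figures below are a toy illustration, not the estimator or data from the study:

```python
def weighted_forecast(forecasts, past_abs_errors):
    """Aggregate expert forecasts with weights inversely proportional to
    each expert's historical mean absolute error; the gap between this
    and the plain average reflects how differentiated the experts are."""
    weights = [1.0 / e for e in past_abs_errors]
    total = sum(weights)
    return sum(w * f for w, f in zip(weights, forecasts)) / total

forecasts = [1.00, 1.10, 1.40]       # per-analyst earnings estimates
errors = [0.05, 0.10, 0.50]          # historical mean absolute errors
wavg = weighted_forecast(forecasts, errors)
consensus = sum(forecasts) / len(forecasts)
```

Here the historically noisy third analyst is down-weighted, pulling the aggregate toward the accurate analysts; if all error histories were equal, `wavg` would coincide with `consensus` and weighting would buy nothing.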
Multi-criteria Evolution of Neural Network Topologies: Balancing Experience and Performance in Autonomous Systems
Title | Multi-criteria Evolution of Neural Network Topologies: Balancing Experience and Performance in Autonomous Systems |
Authors | Sharat Chidambaran, Amir Behjat, Souma Chowdhury |
Abstract | The majority of Artificial Neural Network (ANN) implementations in autonomous systems use a fixed, user-prescribed network topology, leading to sub-optimal performance and low portability. The existing neuro-evolution of augmenting topologies (NEAT) paradigm offers a powerful alternative by allowing the network topology and the connection weights to be simultaneously optimized through an evolutionary process. However, most NEAT implementations allow the consideration of only a single objective. There also persists the question of how to tractably introduce topological diversification that mitigates overfitting to training scenarios. To address these gaps, this paper develops a multi-objective neuro-evolution algorithm. While adopting the basic elements of NEAT, important modifications are made to the selection, speciation, and mutation processes. Against the backdrop of small-robot path-planning applications, an experience-gain criterion is derived to encapsulate the amount of diverse local environment encountered by the system. This criterion facilitates the evolution of genes that support exploration, thereby seeking to generalize from a smaller set of mission scenarios than is possible with performance maximization alone. The effectiveness of the single-objective (optimizing performance) and multi-objective (optimizing performance and experience gain) neuro-evolution approaches is evaluated on two different small-robot cases, with the ANNs obtained by multi-objective optimization observed to provide superior performance in unseen scenarios. |
Tasks | |
Published | 2018-07-20 |
URL | http://arxiv.org/abs/1807.07979v1 |
http://arxiv.org/pdf/1807.07979v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-criteria-evolution-of-neural-network |
Repo | |
Framework | |
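Selecting genomes under the two objectives (performance and experience gain) reduces to keeping the Pareto non-dominated set of the population. A minimal sketch of that selection step; the objective tuples are illustrative, and the paper's actual selection and speciation machinery is richer than this:

```python
def dominates(a, b):
    """a Pareto-dominates b when it is no worse on every objective and
    strictly better on at least one (both objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(population):
    """Non-dominated genomes under (performance, experience-gain)."""
    return [p for p in population
            if not any(dominates(q, p) for q in population if q is not p)]

# (performance, experience_gain) per genome
pop = [(0.9, 0.2), (0.6, 0.8), (0.5, 0.5), (0.4, 0.1)]
front = pareto_front(pop)
```

The front keeps both the high-performance specialist and the high-experience explorer, which is precisely how the multi-objective variant preserves the exploratory genes that a pure performance ranking would discard.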
Gaussian and exponential lateral connectivity on distributed spiking neural network simulation
Title | Gaussian and exponential lateral connectivity on distributed spiking neural network simulation |
Authors | Elena Pastorelli, Pier Stanislao Paolucci, Francesco Simula, Andrea Biagioni, Fabrizio Capuani, Paolo Cretaro, Giulia De Bonis, Francesca Lo Cicero, Alessandro Lonardo, Michele Martinelli, Luca Pontisso, Piero Vicini, Roberto Ammendola |
Abstract | We measured the impact of long-range exponentially decaying intra-areal lateral connectivity on the scaling and memory occupation of a distributed spiking neural network simulator, compared to that of short-range Gaussian decays. While previous studies adopted short-range connectivity, recent experimental neuroscience studies are pointing out the role of longer-range intra-areal connectivity, with implications for neural simulation platforms. Two-dimensional grids of cortical columns composed of up to 11M point-like spiking neurons with spike-frequency adaptation were connected by up to 30G synapses using short- and long-range connectivity models. The MPI processes composing the distributed simulator were run on up to 1024 hardware cores, hosted on a 64-node server platform. The hardware platform was a cluster of IBM NX360 M5 16-core compute nodes, each containing two Intel Xeon Haswell 8-core E5-2630 v3 processors with a clock of 2.40 GHz, interconnected through an InfiniBand network equipped with 4x QDR switches. |
Tasks | |
Published | 2018-03-23 |
URL | http://arxiv.org/abs/1803.08833v2 |
http://arxiv.org/pdf/1803.08833v2.pdf | |
PWC | https://paperswithcode.com/paper/gaussian-and-exponential-lateral-connectivity |
Repo | |
Framework | |
Deep Learning Based Natural Language Processing for End to End Speech Translation
Title | Deep Learning Based Natural Language Processing for End to End Speech Translation |
Authors | Sarvesh Patil |
Abstract | Deep learning methods employ multiple processing layers to learn hierarchical representations of data. They have already been deployed in a vast number of applications and have produced state-of-the-art results. Recently, with the growth in the processing power of computers enabling high-dimensional tensor calculations, Natural Language Processing (NLP) applications have been given a significant boost in efficiency as well as accuracy. In this paper, we take a look at various signal processing techniques and their application to produce a speech-to-text system using deep recurrent neural networks. |
Tasks | |
Published | 2018-08-09 |
URL | http://arxiv.org/abs/1808.04459v1 |
http://arxiv.org/pdf/1808.04459v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-based-natural-language |
Repo | |
Framework | |