Extreme learning machines are feedforward neural networks for classification, regression, clustering, sparse approximation, compression and feature learning with a single layer or multiple layers of hidden nodes, where the parameters of the hidden nodes (not just the weights connecting inputs to hidden nodes) need not be tuned. These hidden nodes can be randomly assigned and never updated (i.e., they are random projections with nonlinear transforms), or can be inherited from their ancestors without being changed. In most cases, the output weights of the hidden nodes are learned in a single step, which essentially amounts to fitting a linear model. The name 'extreme learning machine' (ELM) was given to such models by their main inventor, Guang-Bin Huang. According to their creators, these models are able to produce good generalization performance and learn thousands of times faster than networks trained using backpropagation. The literature also reports that these models can outperform support vector machines (SVMs), and that SVMs provide suboptimal solutions in both classification and regression applications.
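The two-step recipe described above (random, fixed hidden nodes followed by a single linear solve for the output weights) can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not a reference implementation; the choice of tanh activation, 50 hidden nodes, and the toy sine-regression task are all assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = sin(x) plus noise (illustrative only).
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.05 * rng.normal(size=200)

L = 50  # number of hidden nodes (an arbitrary choice for this sketch)

# Step 1: hidden-node parameters are drawn randomly and never updated.
a = rng.normal(size=(X.shape[1], L))  # input weights
b = rng.normal(size=L)                # biases

# Nonlinear random projection of the inputs.
H = np.tanh(X @ a + b)

# Step 2: output weights are learned in a single step,
# i.e. an ordinary linear least-squares fit.
beta, *_ = np.linalg.lstsq(H, y, rcond=None)

pred = H @ beta
mse = float(np.mean((pred - y) ** 2))
print(mse)
```

Because only the linear output layer is fitted, training reduces to one least-squares solve, which is the source of the speed advantage claimed over backpropagation.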
From 2001 to 2010, ELM research mainly focused on a unified learning framework for 'generalized' single-hidden-layer feedforward neural networks (SLFNs), including but not limited to sigmoid networks, RBF networks, threshold networks, trigonometric networks, fuzzy inference systems, Fourier series, Laplacian transforms, wavelet networks, etc. One significant achievement of that period was the theoretical proof of the universal approximation and classification capabilities of ELM. From 2010 to 2015, ELM research extended to a unified learning framework for kernel learning, SVMs, and a few typical feature learning methods such as Principal Component Analysis (PCA) and Non-negative Matrix Factorization (NMF). It was shown that SVMs actually provide suboptimal solutions compared to ELM, and that ELM can provide a whitebox kernel mapping, implemented by the ELM random feature mapping, instead of the blackbox kernel used in SVMs. PCA and NMF can be considered special cases of ELM in which linear hidden nodes are used. From 2015 onwards, an increased focus has been placed on hierarchical implementations of ELM. Additionally, since 2011, significant biological studies have been made that support certain ELM theories. In a recent announcement from Google Scholar, 'Classic Papers: Articles That Have Stood The Test of Time', two ELM papers were listed in the 'Top 10 in Artificial Intelligence for 2006', taking positions 2 and 7.

Given a single hidden layer of ELM, suppose that the output function of the $i$-th hidden node is $h_i(\mathbf{x}) = G(\mathbf{a}_i, b_i, \mathbf{x})$, where $\mathbf{a}_i$ and $b_i$ are the parameters of the $i$-th hidden node.
The output function of the ELM for SLFNs with $L$ hidden nodes is

$$f_L(\mathbf{x}) = \sum_{i=1}^{L} \boldsymbol{\beta}_i h_i(\mathbf{x}),$$

where $\boldsymbol{\beta}_i$ is the output weight of the $i$-th hidden node. The vector $\mathbf{h}(\mathbf{x}) = [h_1(\mathbf{x}), \ldots, h_L(\mathbf{x})]$ is the hidden-layer output mapping of ELM. Given $N$ training samples, the hidden-layer output matrix $\mathbf{H}$ of ELM is given by

$$\mathbf{H} = \begin{bmatrix} \mathbf{h}(\mathbf{x}_1) \\ \vdots \\ \mathbf{h}(\mathbf{x}_N) \end{bmatrix} = \begin{bmatrix} G(\mathbf{a}_1, b_1, \mathbf{x}_1) & \cdots & G(\mathbf{a}_L, b_L, \mathbf{x}_1) \\ \vdots & \ddots & \vdots \\ G(\mathbf{a}_1, b_1, \mathbf{x}_N) & \cdots & G(\mathbf{a}_L, b_L, \mathbf{x}_N) \end{bmatrix}.$$
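The matrix $\mathbf{H}$ can be built exactly as written: row $i$ evaluates all $L$ hidden-node functions at sample $\mathbf{x}_i$. The sketch below constructs $\mathbf{H}$ entry by entry and then vectorized, using a sigmoid additive node for $G$ (one common choice; the shapes and random parameters are assumptions for illustration). The output weights are then obtained from the Moore-Penrose pseudoinverse, $\boldsymbol{\beta} = \mathbf{H}^{+}\mathbf{T}$, the standard single-step solve in the ELM literature.

```python
import numpy as np

rng = np.random.default_rng(1)

N, d, L = 6, 3, 4            # samples, input dimension, hidden nodes (toy sizes)
X = rng.normal(size=(N, d))  # rows are x_1, ..., x_N
T = rng.normal(size=(N, 2))  # target matrix with 2 output dimensions

a = rng.normal(size=(L, d))  # a_i: hidden-node input weights
b = rng.normal(size=L)       # b_i: hidden-node biases

def G(a_i, b_i, x):
    """Sigmoid additive node: G(a_i, b_i, x) = 1 / (1 + exp(-(a_i . x + b_i)))."""
    return 1.0 / (1.0 + np.exp(-(a_i @ x + b_i)))

# H[i, j] = G(a_j, b_j, x_i), matching the matrix definition entry by entry.
H = np.array([[G(a[j], b[j], X[i]) for j in range(L)] for i in range(N)])

# The same matrix computed in one vectorized expression.
H_vec = 1.0 / (1.0 + np.exp(-(X @ a.T + b)))

# Output weights in a single step: beta = pinv(H) @ T.
beta = np.linalg.pinv(H) @ T
print(H.shape, beta.shape)
```

The pseudoinverse gives the minimum-norm least-squares solution for $\boldsymbol{\beta}$, so no iterative tuning of the hidden-node parameters $\mathbf{a}_i, b_i$ is required.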
