Let’s perform a model comparison with some of the other sklearn
estimators. Since the PCE regression class is of the same type, we can
feed it directly into sklearn's model-selection utilities such as
``cross_val_score``.

Model comparison
================

Setup
~~~~~

.. code:: ipython3

    # First, import the libraries we will use
    import numpy as np  # needed below to rescale the output
    import tesuract
    import matplotlib.pyplot as plt

.. code:: ipython3

    # import a data set for our regression problem
    from sklearn.datasets import make_friedman1
    X,y = make_friedman1(n_samples=100,n_features=5)

.. code:: ipython3

    # rescale the input from [0, 1] to [-1, 1]
    X = 2*X - 1
    # center and scale the output as well for good measure (not required)
    y = (y - y.mean())/np.sqrt(np.var(y))

Cross-validation score
----------------------

.. code:: ipython3

    # compute the cross validation score (5-fold by default)
    # of the pce we constructed earlier, i.e. 8th order linear fit
    from sklearn.model_selection import cross_val_score
    pce = tesuract.PCEReg(order=8)
    pce_score = cross_val_score(pce,X,y,scoring='r2').mean()
    print("PCE score is {0:.3f}".format(pce_score))

.. parsed-literal::

    PCE score is 0.836

Not bad for a first pass. How does it compare to something like random
forest regression or MLPs? Now we can compare apples to apples within
the same environment, since these models are all part of the sklearn
base-estimator class.

.. code:: ipython3

    # Let's try a simple random forest estimator
    from sklearn.ensemble import RandomForestRegressor
    rfregr = RandomForestRegressor(max_depth=5,n_estimators=100)
    rf_score = cross_val_score(rfregr,X,y,scoring='r2').mean()
    print("RF score is {0:.3f}".format(rf_score))

.. parsed-literal::

    RF score is 0.685

.. code:: ipython3

    # Let's try a simple 4-layer neural network (fully connected)
    from sklearn.neural_network import MLPRegressor
    mlpregr = MLPRegressor(hidden_layer_sizes=(100,100,100,100))
    mlp_score = cross_val_score(mlpregr,X,y,scoring='r2').mean()
    print("MLP score is {0:.3f}".format(mlp_score))

.. parsed-literal::

    MLP score is 0.939

Wow!
So the MLP handily out-performed the 8th-order polynomial with a linear
fit! But wait. What if we tried a different polynomial order, or a
different fitting procedure such as a sparse l1 solver? Can we leverage
the hyper-parameter tuning that sklearn provides? Yes! Moreover, we
created an easy wrapper for the grid search CV functionality, and a new
PCE regression wrapper that has cross-validation and hyper-parameter
tuning built in!
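
To make the tuning idea concrete before turning to those wrappers, here
is a minimal sketch that uses only sklearn. It is not the tesuract
wrapper: as a stand-in for the PCE it chains ``PolynomialFeatures``
with an l1-penalized ``Lasso`` fit, then lets ``GridSearchCV`` pick the
polynomial order and the penalty strength. The step names ``poly`` and
``lasso``, the grid values, and the fixed ``random_state`` are
assumptions made for this example only.

```python
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Same kind of data as above; the seed is fixed here (an assumption)
# so that repeated runs use the same samples
X, y = make_friedman1(n_samples=100, n_features=5, random_state=0)
X = 2 * X - 1
y = (y - y.mean()) / np.sqrt(np.var(y))

# Stand-in for a PCE (NOT tesuract's API): plain monomial features
# followed by a sparse, l1-penalized linear fit
poly_lasso = Pipeline([
    ("poly", PolynomialFeatures(include_bias=False)),
    ("lasso", Lasso(max_iter=50_000)),
])

# Hypothetical grid: candidate polynomial orders and l1 penalty strengths
param_grid = {
    "poly__degree": [1, 2, 3, 4],
    "lasso__alpha": [1e-3, 1e-2, 1e-1],
}

# 5-fold cross-validated search, scored with r2 like the runs above
search = GridSearchCV(poly_lasso, param_grid, scoring="r2", cv=5)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best CV r2 is {0:.3f}".format(search.best_score_))
```

Because the pipeline is an ordinary sklearn estimator, the fitted
``search`` object can itself be passed to ``cross_val_score`` for a
nested estimate, in the same way as the models compared above.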