In this R tutorial, we will use an artificial neural network to analyze a concrete data set with eight features describing the components used in the mixture: cement, slag, ash, water, superplastic, coarseagg, fineagg, and age. The ninth variable, strength, is the outcome we want to predict.

You will need to download the concrete data set first. It is already packaged and available for an easy download from the dataset page, or directly as concrete.csv.

## Import Libraries and Load Concrete Dataset

We need to install and load several libraries in this R tutorial before we load the data.

**Input:**

```r
install.packages("C50")
install.packages("gmodels")
install.packages("party")
install.packages("car")
install.packages("ggplot2")
install.packages("neuralnet")
library(C50)
library(gmodels)
library(party)
library(car)
library(ggplot2)
library(neuralnet)
```

## Download and Load the Concrete Dataset

With concrete.csv downloaded to your working directory, load it and inspect its structure.

**Input:**

```r
concrete <- read.csv("concrete.csv")
str(concrete)
```

**Output:**

```
'data.frame':	1030 obs. of  9 variables:
 $ cement      : num  141 169 250 266 155 ...
 $ slag        : num  212 42.2 0 114 183.4 ...
 $ ash         : num  0 124.3 95.7 0 0 ...
 $ water       : num  204 158 187 228 193 ...
 $ superplastic: num  0 10.8 5.5 0 9.1 0 0 6.4 0 9 ...
 $ coarseagg   : num  972 1081 957 932 1047 ...
 $ fineagg     : num  748 796 861 670 697 ...
 $ age         : int  28 14 28 28 28 90 7 56 28 28 ...
 $ strength    : num  29.9 23.5 29.2 45.9 18.3 ...
```

## What is an Artificial Neural Network?

An artificial neural network (ANN) models the relationship between a set of input signals and an output signal using a model derived from our understanding of how a biological brain responds to stimuli from sensory inputs. An ANN uses a network of artificial neurons, or nodes, to solve learning problems such as the one posed by the concrete data.

ANNs are also applied to real-world problems such as:

- Speech and handwriting recognition programs, like those used for voicemail transcription
- Automation of smart devices, like an office building’s environmental controls or self-driving vehicles
- Models of weather and climate patterns, tensile strength, and other scientific, social, and economic phenomena

### Artificial Neural Network Layers

Nodes are arranged in groups called layers. In the simplest case there are only input and output nodes: because the input nodes pass the incoming data along exactly as it is received, the network has only one set of connection weights. This is known as a single-layer network, and it can be used for basic pattern classification, particularly for patterns that are linearly separable. A multilayer network adds one or more hidden layers that process the signals from the input nodes before they reach the output node. Networks can become very complex as more layers are added.
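To make the difference between the two topologies concrete, here is a minimal sketch of a forward pass in base R. The weights, the inputs, and the names `sigmoid`, `w_out`, `W_hidden`, and `w_final` are all hypothetical values chosen for illustration; none of them come from the concrete model.

```r
# Hypothetical toy values; nothing here is taken from the concrete data.
sigmoid <- function(z) 1 / (1 + exp(-z))

x <- c(0.2, 0.7, 0.1)                 # three input signals

# Single-layer network: one set of weights from the inputs to the output node.
w_out <- c(0.5, -0.3, 0.8)
single_layer <- sigmoid(sum(w_out * x))

# Multilayer network: inputs -> two hidden nodes -> one output node.
W_hidden <- matrix(c( 0.4, -0.6,
                      0.1,  0.9,
                     -0.5,  0.3), nrow = 3, byrow = TRUE)  # 3 inputs x 2 hidden
w_final <- c(1.2, -0.7)

hidden <- sigmoid(as.vector(x %*% W_hidden))  # signals leaving the hidden layer
multi_layer <- sum(w_final * hidden)          # linear output node

single_layer
multi_layer
```

The single-layer version has one weight per input, while the multilayer version has a second set of weights between the hidden nodes and the output, which is what lets it represent patterns that are not linearly separable.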

### Artificial Neural Network Backpropagation

Backpropagation trains the network by propagating errors backward through it to adjust the connection weights. Models trained this way have several strengths: they can be adapted to classification or numeric prediction problems, they are capable of modeling more complex patterns than nearly any other algorithm, and they make few assumptions about the data’s underlying relationships.

We must first normalize the concrete data. The normalize function takes a vector x of numeric values and, for each value in x, subtracts the minimum value of x and then divides by the range of x.
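The idea of propagating errors backward can be sketched on a toy network. The following is a minimal illustration, assuming plain gradient descent on mean squared error with one hidden layer of three sigmoid nodes; the data, the learning rate `lr`, and every variable name are hypothetical. Note that neuralnet() by default uses a resilient backpropagation variant rather than this plain update, so this shows the principle, not the package’s exact procedure.

```r
set.seed(1)                                   # arbitrary seed for reproducibility
x <- matrix(runif(40), ncol = 2)              # 20 toy observations, 2 inputs
y <- 0.5 * x[, 1] + 0.3 * x[, 2]              # toy numeric target
sigmoid <- function(z) 1 / (1 + exp(-z))

W1 <- matrix(rnorm(6, sd = 0.5), nrow = 2)    # input -> hidden weights (2 x 3)
b1 <- rep(0, 3)                               # hidden biases
W2 <- rnorm(3, sd = 0.5)                      # hidden -> output weights
b2 <- 0                                       # output bias
lr <- 0.1                                     # assumed learning rate
n  <- length(y)

mse <- numeric(300)
for (i in seq_along(mse)) {
  # Forward pass: inputs -> hidden layer -> linear output
  h     <- sigmoid(sweep(x %*% W1, 2, b1, "+"))
  y_hat <- as.vector(h %*% W2 + b2)
  err   <- y_hat - y
  mse[i] <- mean(err^2)

  # Backward pass: push the output error back through the hidden layer
  grad_W2 <- as.vector(t(h) %*% err) * 2 / n
  grad_b2 <- 2 * mean(err)
  delta_h <- (err %o% W2) * h * (1 - h)       # error signal at each hidden node
  grad_W1 <- t(x) %*% delta_h * 2 / n
  grad_b1 <- 2 * colMeans(delta_h)

  # Gradient descent update of every weight and bias
  W2 <- W2 - lr * grad_W2
  b2 <- b2 - lr * grad_b2
  W1 <- W1 - lr * grad_W1
  b1 <- b1 - lr * grad_b1
}

c(first = mse[1], last = mse[length(mse)])    # error before and after training
```

Each pass computes the prediction error at the output, converts it into an error signal for the hidden nodes, and nudges every weight in the direction that reduces the error.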

**Input:**

```r
# Rescale a numeric vector to the range 0 to 1
normalize <- function(x) {
  return((x - min(x)) / (max(x) - min(x)))
}
```

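If the above sounds confusing, let’s try the normalize function on two simple vectors. Both calls return the same values, evenly spaced from 0 to 1, because only each value’s relative position within its range matters.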

```r
normalize(c(1, 2, 3, 4, 5, 6))
normalize(c(10, 20, 30, 40, 50, 60))
```

Now that we have a clear understanding of normalization, we can use the **lapply()** function, which takes a list and applies a specified function to each element. Because a data frame is a list of columns, **lapply()** applies **normalize()** to every column of concrete, and **as.data.frame()** converts the result back into a data frame.

**Input:**

```r
concrete_norm <- as.data.frame(lapply(concrete, normalize))
summary(concrete_norm$strength)
```

**Output:**

```
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.0000  0.2664  0.4001  0.4172  0.5457  1.0000 
```

**Input:**

```r
summary(concrete$strength)
```

**Output:**

```
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   2.33   23.71   34.45   35.82   46.13   82.60 
```
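The column-by-column behavior of lapply() with normalize() can be checked on a small example. This is a minimal sketch using a hypothetical three-row data frame named `toy` as a stand-in for concrete; the values are made up.

```r
normalize <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Hypothetical stand-in for the concrete data frame
toy <- data.frame(cement = c(100, 200, 300), age = c(7, 28, 90))

# lapply() walks over the columns; as.data.frame() rebuilds the data frame
toy_norm <- as.data.frame(lapply(toy, normalize))

sapply(toy_norm, range)   # each column should now run from 0 to 1
```

One caveat with this min-max approach: a column in which every value is identical has a range of zero, so the division produces NaN. The concrete data shown above does not appear to have such a column, but it is worth checking on other data sets.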

We will now divide the data into two separate datasets: the first 75 percent of the rows for **training** and the remaining 25 percent for **testing**. Splitting by row position like this assumes the rows are already in random order.

**Input:**

```r
concrete_train <- concrete_norm[1:773, ]
concrete_test <- concrete_norm[774:1030, ]
```
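If you cannot be sure the rows of a data set are in random order, a random split is a safer alternative to slicing by position. Below is a minimal sketch on a hypothetical 100-row data frame; the seed value and the names `toy`, `train_idx`, `toy_train`, and `toy_test` are assumptions for illustration, not part of the original analysis.

```r
set.seed(123)                                       # arbitrary seed for a reproducible split
toy <- data.frame(id = 1:100, value = runif(100))   # hypothetical stand-in data

# Draw 75 percent of the row numbers at random, without replacement
train_idx <- sample(nrow(toy), size = floor(0.75 * nrow(toy)))

toy_train <- toy[train_idx, ]    # rows that were drawn
toy_test  <- toy[-train_idx, ]   # all remaining rows

c(train = nrow(toy_train), test = nrow(toy_test))
```

The same pattern would apply to concrete_norm by replacing `toy` with it, although the resulting correlations would then differ from the ones reported in this tutorial.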

## Artificial Neural Network Method: neuralnet()

We will model the concrete data with **neuralnet()**, relating the ingredients used in the concrete to the strength of the finished product. The result is a multilayer feedforward neural network. The **neuralnet()** function is a good choice for learning about neural networks, in part because the fitted network can be plotted directly.

**Input:**

```r
concrete_model <- neuralnet(strength ~ cement + slag + ash + water + superplastic +
                              coarseagg + fineagg + age, data = concrete_train)
plot(concrete_model)
```

**Output:**

## Artificial Neural Network neuralnet() Model Performance

The network topology diagram gives us a look into the black box of the ANN. However, it does not tell us much about how well the model will fit future data.

### Artificial Neural Network compute() Function

The compute() function returns a list with two components: neurons, which stores the neurons for each layer in the network, and net.result, which stores the predicted values. We pass concrete_test[1:8] so that the model receives only the eight feature columns and not strength. Using the head() function shows a sample of the output instead of printing all values.

**Input:**

```r
model_results <- compute(concrete_model, concrete_test[1:8])
head(model_results$net.result)
```

**Output:**

```
            [,1]
774 0.3271884369
775 0.4652341650
776 0.2384002467
777 0.6730106686
778 0.4587699296
779 0.4721687651
```

**Input:**

```r
predicted_strength <- model_results$net.result
head(predicted_strength)
```

**Output:**

```
            [,1]
774 0.3271884369
775 0.4652341650
776 0.2384002467
777 0.6730106686
778 0.4587699296
779 0.4721687651
```

**Input:**

```r
cor(predicted_strength, concrete_test$strength)
```

**Output:**

```
             [,1]
[1,] 0.8059068685
```
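The correlation measures how closely predictions track the actual values, but it does not say how large the errors are, and both sets of values are still on the normalized 0 to 1 scale. A minimal sketch of reversing the normalization and computing a mean absolute error follows. The helper `unnormalize` and all the numbers are hypothetical; in this tutorial the inputs would be concrete$strength, predicted_strength, and concrete_test$strength.

```r
# Hypothetical values standing in for the real vectors
orig_strength <- c(2.33, 20, 35, 50, 82.60)   # original-scale reference values
pred_norm     <- c(0.30, 0.45, 0.25)          # normalized predictions
actual_norm   <- c(0.35, 0.40, 0.20)          # normalized actual values

# Invert min-max normalization using the original column's min and max
unnormalize <- function(x, reference) {
  x * (max(reference) - min(reference)) + min(reference)
}

pred_orig   <- unnormalize(pred_norm, orig_strength)
actual_orig <- unnormalize(actual_norm, orig_strength)

# Mean absolute error in the original units of strength
mae <- mean(abs(pred_orig - actual_orig))
mae
```

Reporting the error in the original units makes it easier to judge whether the model is accurate enough for practical use.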

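## Improve the ANN neuralnet() Model Performance

The first model used the neuralnet() default of a single hidden node. Setting hidden = 5 gives the network five hidden nodes, which lets it model more complex relationships.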

**Input:**

```r
concrete_model2 <- neuralnet(strength ~ cement + slag + ash + water + superplastic +
                               coarseagg + fineagg + age, data = concrete_train, hidden = 5)
plot(concrete_model2)
```

**Output:**

**Input:**

```r
model_results2 <- compute(concrete_model2, concrete_test[1:8])
predicted_strength2 <- model_results2$net.result
cor(predicted_strength2, concrete_test$strength)
```

**Output:**

```
             [,1]
[1,] 0.9298046076
```

## Additional Plots for Concrete Strength

### Density Plot for Concrete Strength

**Input:**

```r
qplot(strength, data = concrete, geom = "density", fill = "color")
```

**Output:**

### Scatterplot for Concrete Strength and Cement, Colored by Age

**Input:**

```r
ggplot(concrete, aes(x = strength, y = cement, color = age)) +
  geom_point() +
  geom_smooth()
```

**Output:**