Working directories
cd /media/ht/ht_5T/Work/Projects/HIV/Figures/凉山 # Liangshan map downloads
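Principles for note-taking:
- Handwritten notes are acceptable, but only on the iPad Pro.
- Everything else should be recorded as text wherever possible.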
Error calculation in R/CRAN
i='/media/ht/ht_5T/Work/Funding/2017_重大专项/Management/Administration/05_00_HFMD/Publication/Raw_Data'; cd "${i}"
- MAE, MSE, RMSE, Coefficient of Determination, Adjusted R Squared — Which Metric is Better? | by Akshita Chugh | Analytics Vidhya | Medium
- Measures of Accuracy function - RDocumentation
- DataTechNotes: Regression Model Accuracy (MAE, MSE, RMSE, R-squared) Check in R
- Calculate RMSE and MAE in R and SAS | R-bloggers
MAE, MSE, RMSE, Coefficient of Determination, Adjusted R Squared — Which Metric is Better?
Dec 8, 2020
The objective of Linear Regression is to find a line that minimizes the prediction error of all the data points.
An essential step in building any machine learning model is evaluating its accuracy. In regression analysis, the Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R-squared (the coefficient of determination) are the metrics commonly used to evaluate model performance.
- The Mean Absolute Error (MAE) is the average of the absolute differences between the actual and predicted values in the dataset. It measures the average magnitude of the residuals.
- The Mean Squared Error (MSE) is the average of the squared differences between the original and predicted values in the dataset. When the residuals average to zero, it equals their variance.
- The Root Mean Squared Error (RMSE) is the square root of the MSE. When the residuals average to zero, it equals their standard deviation.
- The coefficient of determination, or R-squared, is the proportion of the variance in the dependent variable that the linear regression model explains. It is a scale-free score: whether the values are small or large, R-squared is at most one, and it lies between zero and one for a least-squares fit with an intercept evaluated on its training data.
- Adjusted R-squared is a modified version of R-squared that accounts for the number of independent variables in the model, and it is always less than or equal to R². It is computed as 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the number of observations in the data and k is the number of independent variables.
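For reference, the definitions in the list above can be written out as follows. The notation is introduced here as an assumption and is not taken from the article: y_i is an actual value, \hat{y}_i the matching prediction, \bar{y} the mean of the actual values, n the number of observations, and k the number of independent variables.

```latex
\begin{align*}
\mathrm{MAE}  &= \frac{1}{n}\sum_{i=1}^{n}\lvert y_i - \hat{y}_i\rvert \\
\mathrm{MSE}  &= \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2 \\
\mathrm{RMSE} &= \sqrt{\mathrm{MSE}} \\
R^2 &= 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2} \\
R^2_{\mathrm{adj}} &= 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}
\end{align*}
```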
Differences among these evaluation metrics
- Mean Squared Error (MSE) and Root Mean Squared Error (RMSE) penalize large prediction errors more heavily than Mean Absolute Error (MAE) does. RMSE is more widely used than MSE for comparing a regression model against other models because it has the same units as the dependent variable (Y-axis).
- MSE is differentiable everywhere, which makes it easier to work with mathematically than MAE, which is not differentiable at zero. Therefore, many models use RMSE as the default loss metric even though it is harder to interpret than MAE.
- MAE is more robust to data with outliers.
- Lower values of MAE, MSE, and RMSE imply higher accuracy of a regression model, whereas a higher value of R-squared is considered desirable.
- R-squared and adjusted R-squared describe how well the independent variables in a linear regression model explain the variability in the dependent variable. R-squared never decreases when an independent variable is added, which can lead to redundant variables being added to the model. Adjusted R-squared addresses this problem.
- Adjusted R-squared takes the number of predictor variables into account, so it can help decide how many independent variables to keep in the model. Its value decreases when the increase in R-squared from an additional variable is not large enough to offset the penalty for the extra variable.
- For comparing the accuracy among different linear regression models, RMSE is a better choice than R Squared.
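The claims about outliers in the list above can be checked with a small sketch in R. The data are hypothetical and invented for illustration, and `mae()` and `rmse()` are helper names defined here, not functions from a package.

```r
# Hypothetical data: every prediction is off by exactly 1 unit.
actual    <- c(10, 12, 14, 16, 18, 20, 22, 24, 26, 28)
predicted <- actual + c(1, -1, 1, -1, 1, -1, 1, -1, 1, -1)

# Helper functions (names chosen here for illustration).
mae  <- function(a, p) mean(abs(a - p))
rmse <- function(a, p) sqrt(mean((a - p)^2))

mae_clean  <- mae(actual, predicted)   # 1
rmse_clean <- rmse(actual, predicted)  # 1

# Replace the last actual value with an outlier; predictions stay the same.
# The error on that point becomes 58 - 27 = 31.
actual_out <- actual
actual_out[10] <- 58

mae_out  <- mae(actual_out, predicted)   # (9 * 1 + 31) / 10 = 4
rmse_out <- rmse(actual_out, predicted)  # sqrt((9 * 1 + 31^2) / 10) = sqrt(97)

print(c(mae_clean = mae_clean, mae_out = mae_out,
        rmse_clean = rmse_clean, rmse_out = rmse_out))
```

A single outlier moves MAE from 1 to 4 but RMSE from 1 to roughly 9.85, because squaring lets the one large error dominate the sum.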
Conclusion
Therefore, when comparing prediction accuracy among different linear regression (LR) models, RMSE is the better option because it is simple to calculate and differentiable. However, if your dataset has outliers, choose MAE over RMSE.
In addition, adjusted R-squared guides the choice of the number of predictor variables in a linear regression model, but choose RMSE over adjusted R-squared if you care about comparing prediction accuracy among different LR models.
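As a rough illustration of how R-squared and adjusted R-squared behave when a variable is added, the sketch below fits two models with base R's `lm()` on simulated data. The data, the seed, and the variable names (`x1`, `noise`, `fit_small`, `fit_big`) are assumptions made for this example only.

```r
# Simulated data: y depends on x1 only; `noise` is unrelated to y.
set.seed(1)
n     <- 50
x1    <- rnorm(n)
y     <- 2 + 3 * x1 + rnorm(n)
noise <- rnorm(n)

fit_small <- lm(y ~ x1)          # one predictor
fit_big   <- lm(y ~ x1 + noise)  # same model plus an unrelated predictor

s_small <- summary(fit_small)
s_big   <- summary(fit_big)

r2  <- c(small = s_small$r.squared,     big = s_big$r.squared)
adj <- c(small = s_small$adj.r.squared, big = s_big$adj.r.squared)

# Recompute adjusted R-squared for the larger model by hand (k = 2 predictors).
k <- 2
adj_manual <- 1 - (1 - r2[["big"]]) * (n - 1) / (n - k - 1)

print(rbind(r2 = r2, adj = adj))
print(adj_manual)
```

R-squared for the larger model cannot be lower than for the smaller one, while adjusted R-squared rises only if the extra variable improves the fit by more than its penalty.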
original <- c(-2, 1, -3, 2, 3, 5, 4, 6, 5, 6, 7)  # observed (actual) values
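A minimal sketch of how the metrics could be computed for this vector in base R follows. The `predicted` values are hypothetical and invented for illustration; they are not from the cited articles.

```r
original  <- c(-2, 1, -3, 2, 3, 5, 4, 6, 5, 6, 7)
# Hypothetical predictions, made up for this example.
predicted <- c(-1, 1, -2, 2, 3, 4, 4, 5, 5, 8, 7)

d <- original - predicted  # residuals

mae  <- mean(abs(d))       # mean absolute error
mse  <- mean(d^2)          # mean squared error
rmse <- sqrt(mse)          # root mean squared error

# R-squared: 1 - (residual sum of squares / total sum of squares)
r2 <- 1 - sum(d^2) / sum((original - mean(original))^2)

cat("MAE:", mae, "\nMSE:", mse, "\nRMSE:", rmse, "\nR-squared:", r2, "\n")
```

Adjusted R-squared is left out because it needs the number of independent variables k, which is not defined for a bare vector of predictions.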