How to Add New R packages to your SQL Server R Server
One of the newest additions to the Microsoft’s SQL Server stack, version 2016 or later, is the SQL Server R Server. R Server installs Microsoft’s RevoScaleR software, providing an interface from the SQL Server database engine to the R server. This allows commands in T-SQL to process R scripts and, most importantly, to pass data back and forth between servers. It’s a great advancement, combining the statistical and machine learning capabilities of R with the advanced data management tools of SQL Server, adding in a little parallelism to boot.
Unfortunately, the R ecosystem is so vast, it really isn’t practical to install the whole thing; which means you will probably run into an occasion where the default install falls short and you need to add additional packages to your R Server. Whereas adding standard R packages to a traditional R installation requires simply running the install.packages command in something like RStudio; for SQL Server, there are a few extra steps. In this example, we will see how to add packages to a SQL Server using a few tools and administrator privileges.
Configuring your Server
If this is your first time running R scripts in SQL Server, you will need to make sure your server is configured to execute external scripts. If you can already run scripts, you can skip to the next section
Launch SQL Server Management Studio (as administrator) and run the following code:
EXEC sp_configure 'show advanced options', 1 GO
This will allow you access to the advanced configuration properties so you can enable running external scripts such as R and (new to SQL Server 2017) Python. Next, set the configuration bit by running the following code:
sp_configure 'external_scripts_enabled', 1 GO RECONFIGURE;
If everything is OK, you should see the following message:
“Configuration option 'external scripts enabled' changed from 0 to 1. Run the RECONFIGURE statement to install.”
At this point you should restart the SQL Server for your changes to take effect. If you forget to restart you will get this nasty message when you try to run anything:
Msg 39023, Level 16, State 1, Procedure sp_execute_external_script, Line 1 [Batch Start Line 0]
You can see what packages you have installed by running the following command:
EXECUTE sp_execute_external_script @language=N'R' , @script = N'str(OutputDataSet); packagematrix <- installed.packages(); NameOnly <- packagematrix[,1]; OutputDataSet <- as.data.frame(NameOnly);' ,@input_data_1 = N'SELECT 1 as col' WITH RESULT SETS ((PackageName nvarchar(250) )); GO
This will generate a list of packages installed in your library, something like this:
C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\R_SERVICES\library
Adding a Package
Suppose we want to do some string manipulation in R using the stringr package. By default, you won’t have this package available in SQL Server. You can test this by running a simple script such as this one:
EXEC sp_execute_external_script @language = N'R', @script = N'library(stringr)'; GO
SQL Server will dutifully run this script and complain:
To fix this will require the following steps:
The Comprehensive R Network (CRAN) provides a location where you can download most any R package. You can find and select your package here:
Choose the package for your operating system. In this example, we will select the Windows .zip release stringr_1.2.0.zip.
Copy the zip file to your SQL Server R library.
Install the Package
Launch the RGui as administrator. You will need administrator privileges in order to write to the SQL Server library. To install the package, you need to run the install.packages command and specify the library as the destination. Declare the SQL Library as a variable by running the following R command in the RGui:
lib.SQL <- "C:\\Program Files\\Microsoft SQL Server\\MSSQL13.MSSQLSERVER\\R_SERVICES\\library"
Then install the stringr package:
install.packages("stringr", lib = lib.SQL)
Or more generally:
install.packages(<package name>, lib =<library path>)
Your session might look like this:
That’s it. You can test whether the install took by running the following command in SSMS again:
EXECUTE sp_execute_external_script @language=N'R', @script = N'library(stringr);' GO
If you are missing any dependent packages, SQL Server will throw an error to that effect. To fix this, simply repeat the steps above for any dependent libraries until you have all the dependencies you need. You can try the install using the dependencies flag to save some time if that is a possibility.
The combination of R and SQL Server opens up a whole new range of possibilities for data analysis and is likely to prove an invaluable tool in any data analyst’s toolbox. Fortunately, it is also a straightforward process to extend the existing installation to perform a wide range of analytical functions not available right out of the box.
Happy trails and keep coding!