Populating Excel workbooks and ranges with arrays
Microsoft Excel, with its almost endless supply of rows and columns, is data DisneyLand. It’s fun to work with data and make it easy to consume. Part of the fun, at least for developers, is populating Excel with data. Today, I want to take another look at how to use arrays to populate Excel with data.
- The scenario
- The basic strategy to work with Excel arrays
- How to prepare your arrays for Excel ranges
- Using Value versus Value2 versus Formula
- Fill array with a query and insert into a worksheet
- Fill the array with a range and insert into another worksheet
- The fastest way – generate Excel files directly
The scenario
You have an array filled with data and you want to insert that data into an Excel spreadsheet. You would be considered quite reasonable if you attempted to create an array and insert it. Something like this:
Dim myArray as Object () = GetMyData Excel.Range(myRange).Value = MyArray
This strategy will insert data but it will not make you happy. What you will see is each cell within the range receives the same data… the first field in the array.
We want to populate Excel with data as efficiently as possible. This means we are not going to navigate a worksheet row-by-row, column-by-column to insert data. That would be what you call “slower than a West Texas tumbleweed“. What we need is a strategy that allows us to insert the data in a single bound… or one fell swoop… or lickety split.
The basic strategy to work with Excel arrays
I touched on the strategy two articles ago in Working with Excel tables and ranges. If you charge ahead and try to do this on your own, you might be tempted to create a 1D array and insert it into a cell of your choosing. Don’t do this. It will hurt you because it won’t work. Instead, heed this wisdom and create at least a 2D array. For the convenience’s sake, I’ve include my code sample from two articles ago (I measure time not in minutes, but in articles).
Private Sub ArrayByRandomLoop(targetRange As Excel.Range) 'We'll use a dynamically silly array (or is it silly-ly dynamic?) 'But it is easy to imagine creating your own array dynamically Dim dataToInsert As Object(,) = New Object(5, 1) {} For i As Integer = 0 To 5 dataToInsert(i, 0) = "Row " + i.ToString() Next targetRange.Value2 = dataToInsert End Sub
This sample is ideal for benign scenarios (i.e. somewhat lame code samples) that 1) are not dynamic and 2) require only a single column of data. Let’s turn up the volume just a little.
How to prepare your arrays for Excel ranges
Arrays are not created equally. They vary in shape, size, and purpose. Some are completely filled with data; others are less filled and/or empty. When inserting into Excel, we want to work an arrays that have every element filled. Thus, before we insert an array into a spreadsheet, we should prepare them.
Empty arrays
You can check if an array is completely empty by checking each element for Nothing. If each one is Nothing, we know we have an empty array.
Private Function CheckIsEmptyArray(theArray As System.Array) As Boolean For i As Integer = 0 To theArray.GetUpperBound(0) For j As Integer = 0 To theArray.GetUpperBound(1) If theArray.GetValue(i, j) IsNot Nothing Then Return False End If Next Next Return True End Function
If so much as one element within the array is not Nothing, we know the array has at least a little value.
Empty values
To find empty values, we do the same as with looking for empty arrays. We look at each element to determine if its value is empty (Nothing). But here there is a twist, can you spot it?
Private Sub InsertValuesWhereEmpty(theArray As System.Array) For iRow As Integer = 0 To theArray.GetUpperBound(0) 'The number of rows For iColumn As Integer = 0 To theArray.GetUpperBound(1) 'The number of columns Dim val As Object = theArray(iRow, iColumn) If val Is Nothing Then theArray(iRow, iColumn) = CVErr(ADXxlCVError.xlErrNull) End If Next Next End Sub Public Function CVErr(ByVal xlError As ADXxlCVError) As Object Dim hr As Integer = &H800A0000 + xlError Return New ErrorWrapper(hr) End Function
The purpose of this method is to prepare the array by ensuring every element has a value. The twist is that when it finds an empty element, it inserts an Excel error using one of the ADXxlCVError constants. The effect is that, when eventually inserted into a spreadsheet, the user will quickly notice the cells with missing data.
There is a call to another CVErr function to convert the chosen constant to an error recognized by Excel.
Using Value versus Value2 versus Formula
When inserting arrays in to Excel spreadsheets, you have three property choices: Value, Value2, and Formula. Which one is the correct one? Before I give my opinion, let’s consider each of them first.
- Value: This is a parameterized property that lets you get and set the value for a specified range.
- Value2: Same as value BUT it doesn’t support Currency and Date data types. Also, it does not require a parameter.
Whether you use one over the other probably depends on your preferred language. Developers using C# tend to use Value2 as Value doesn’t even show up in Intellisense. Those of you who prefer VB.NET probably use Value2. For more background, you can read Eric Carter’s comments of Value versus Value2.
But there is another…
- Formula: This is the formula string of the range. This is exactly what you see displayed in the Excel formula bar when you select a cell in the Excel window. It’s common to insert a valid formula into this property… and thus set its value. I mention it here because if you select a 1D or 2D array, you can use an Excel array of the same dimensions to set the range’s Formula.
That’s that theory. I tend to utilize Value2 no matter what language I use.
Fill array with a query and insert into a worksheet
Lots of people like to fill an Excel workbook with data from a database. They then like to go completely nuts analyzing that data. For these types of users we can do some automation.
We can connect to a database.
We can query the database.
We can build an array and insert that array into the spreadsheet.
Take a look…
Private Sub ArrayByQuery(targetRange As Excel.Range) Dim cnn As OleDbConnection = New OleDbConnection() cnn.ConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0;" + _ "Data Source=[INSERT YOUR PATH HERE]\Northwind2007.accdb;Persist Security Info=False;" Dim sql As String = "SELECT Top 11 Customers.Company, Customers.[First Name], " + _ "Customers.[Job Title], Customers.[State/Province] From Customers;" Dim cmd As OleDbCommand = New OleDbCommand(sql, cnn) cnn.Open() Dim reader As OleDbDataReader = cmd.ExecuteReader() If reader.HasRows Then 'BUILD the array Dim dataToInsert As Object(,) = New Object(10, 3) {} Dim i As Integer = 0 While reader.Read dataToInsert(i, 0) = reader(0).ToString() dataToInsert(i, 1) = reader(1).ToString() dataToInsert(i, 2) = reader(2).ToString() dataToInsert(i, 3) = reader(3).ToString() i = i + 1 End While targetRange.Value2 = dataToInsert Else 'Maybe give the user a message to tell them "yo, you don't have any data." End If End Sub
The ArrayByQuery method does all these things. I admit I cheated and hard-coded the array dimensions. There are lots of ways to build arrays. The point here is that after populating the array with data, you simply insert it into a range. That’s it and that’s all.
Fill the array with a range and insert into another worksheet
Databases are fine and dandy and are a common contributor to worksheet data. But, all too often users need to copy data from other sections of an Excel workbook. The ArrayByRange sample accepts a string that represents a named range. It then copies it to the current selection.
Private Sub ArrayByRange(rangeName As String) Dim dataToInsert(,) As Object Dim destinationRange As Excel.Range = ExcelApp.Selection Dim sourceRange As Excel.Range = Nothing Try sourceRange = ExcelApp.Range(rangeName) Catch End Try If sourceRange IsNot Nothing Then ReDim dataToInsert(0 To sourceRange.Rows.Count, 0 To sourceRange.Columns.Count) dataToInsert = sourceRange.Value destinationRange.Value2 = dataToInsert Marshal.ReleaseComObject(sourceRange) End If Marshal.ReleaseComObject(destinationRange) End Sub
It works perfectly. Give it a try and you’ll see what I mean.
The fastest way – generate Excel files directly
These samples work with the Excel object model and its overhead. This is a suitable strategy if you are not looking to do bulk, array-insertion actions. But if you do need some to scale the solution, I recommend you re-architect to build Excel files directly.
To do this, you need to avoid the Excel object and utilize other APIs. Here are a few that let you do this.
- Open XML SDK: This is the Office team’s API for creating Microsoft Office documents without the need for Office. It’s free and fairly well documented.
- EPPlus: This .NET library is specific to Excel. It simplifies usage of the OpenXML SDK. It has a 5-star rating on Codeplex and I recommend it over trying directly working with the Open XML SDK.
- ASPOSE Cells for .NET: This is a commercial (i.e. not free) .NET library that also allows you to build Excel files directly. It has a wider feature set than EPPlus (it should, given it is not free) including specific use cases for inserting arrays.
***
Given these are samples and are meant to wet your whistle but not quench your thirst. I think you can easily extend these code examples to do more than just copy data. For example, in the second sample, you could copy the range, perform a calculation, then insert the results. As I heard in a conference call recently, “The possibilities are unlimitless!” No kidding.
Available downloads:
This sample Excel add-in was developed using Add-in Express for Office and .net:
Excel Populate With Arrays Add-in (VB.NET)
5 Comments
hi. I would like to ask is it possible to insert data into Excel using vb coding from Infopath form in Sharepoint? If possible, can you please show me the step as well as coding to do it? Thank you in advance for your feedback. :)
Hi Benny,
I believe you can using Excel Automation:
https://support.microsoft.com/kb/219151
You can automate Excel from InfoPath to create a spreadsheet and write your data to it.
Another option, is to avoid Excel Automation and use the Open Office XML API to build the spreadsheet without any need to automate Excel.
Hi, thank you for the explanation about the multi dimensional array being used to fill the excel cells accordingly when parsing the array to a range. Big help, thanks.
My visual studio code is not working and need some help please.
I have created a template on Excel and am planning to populate the table via a code in Visual studio. Code created is;
“Actuals”: [
{
“startDate”:”2017-03-01″,
“production_tasp_mwh”: 222762.4241,
“availability_contractual”: 0.9830,
“wind_energy_content”: 114.1,
“capacity_factor”: 0.475,
“availability_park”: 0.9516,
“availability_production_based”: “0.9244”
},
],
“Budget”: [
{
“startDate”:”2017-03-01″,
“production_tasp_mwh”: 208636.0,
“availability_contractual”: 0.975,
“wind_energy_content”: 108.0,
“capacity_factor”: 0.548,
“availability_park”: null,
“availability_production_based”: “null”
],
“Productions”: [
{
“month_start_date”: “2017-03-01”,
“budget”: 208636.0,
“actual”: 222762.0,
“budget_YTD”: 695452.0,
“actual_YTD”: “621272”
}
The table template created is
Production
Month Budget Actual
[[Production.month_start_date]] [[Budget.production]] [[Actual.production
Budget YTD Actual YTD
[[Budget_YTD.production]] [[Actual_YTD.production]]
The code is working but the values are not populated in the template. Not sure why and any help much appreciated.
Hello Samsudeen,
> Code created is
What you show is *not* an array. Please re-check the article.