Documenting your directory structure provides insight into how your applications and server drives are built and where key components reside. Use the ActiveX Data Objects (ADO)-based VBScript application BuildDirTreeReport.vbs to create a directory tree report called DirTree.HTA that writes to a Microsoft Excel spreadsheet.

For a while now I've been toying with the idea of writing an easy-to-use, no-cost application for producing directory tree structure reports for applications and servers. Directory structure documentation can give you insight about how an application or server drive is built and where key components reside, as well as provide a convenient way for you to designate folders' importance. In a sense, an application or server's directory tree is like a report outline that you can use as a valuable resource if you need to analyze a network problem, develop maintenance scripts, or support an application.

My organization has an additional need for application and server directory tree documentation; our auditors require security information about our servers and applications several times a year. Every audit cycle, the auditors must review our directory structures and folder security by choosing somewhat random folders from high-profile directories in areas such as HR, Finance, Security, and specialized applications.

Until recently, we'd been supplying our auditors with a text directory dump derived from the Tree command. Unfortunately, the output and format of a Tree dump text file is a bit user-unfriendly and not very easy to work with. For example, imagine you're an auditor examining a Tree dump text file: You're 8,876 lines down in a complex and deeply nested folder, and you see only a folder name. You have to copy and paste together all of the subfolder path information, all the way back to the root folder, to provide the complete path of the folders you want security information about. This method leads to many mistakes, which in turn cause delays and frustration on both sides.

My Solution

To streamline the process of producing directory tree reports in my company, I wrote an application called BuildDirTreeReport.vbs, which you can download by clicking the Download the Code button. I wrote this application to create a directory tree report called DirTree.HTA that writes to a Microsoft Excel spreadsheet. This method lets my company provide its auditors with Excel-based listings of directory and application folders that are easier to read and easier to work with. The DirTree.HTA report differs from the Tree report in that all of the folders in the listing include a full path. Auditors can simply copy the folder(s) they want to audit and send us a request listing those folders. They can also add a comment to the cells of folders they have questions about or would like to review, then send the entire spreadsheet back to us. This method works well, and our auditors seem to prefer it.

In addition, I modified the application, creating another useful application that builds a directory structure on a test computer based on an existing production directory structure. I use this application mainly for testing scripts that do a lot of file copying to many different paths. I can safely test my scripts without touching the production directory structure. I can also easily delete the complete test structure and rebuild it. Once I'm satisfied with a script, I can simply edit it to point to the real directory structure.

My script uses ActiveX Data Objects (ADO) to build a disconnected recordset that houses the directory structure data. After you have all the directory information in a database, you can easily cycle through the database, use the Find method to find folders, sort data, and write data to a spreadsheet. The script runs quickly, even on my older machine at work. On my Windows Vista laptop, the script screams through more than 16,000 directories on my C drive in about 90 seconds. You probably won't get such good results traversing a network, but even tests on my network were relatively fast.

How It Works

The first thing you need to do to run the application is enter values for a few variables that define the parameters necessary to gather the information you need:

  • SourceComputer—The computer you want to get the original tree structure from. Enter the name of the server in the input box. If you want to evaluate the local computer, leave the input box blank.
  • SourceDrv—The drive letter on the source computer that contains the folder you want to use to create the directory tree structure from. Enter the drive letter followed by a colon (e.g., F:).
  • SourcePath—The folder you want to use as a starting point for the directory tree. Be as specific as possible when specifying the SourcePath (e.g., use F:\BaseFolder\apps\FolderIwant\ rather than \apps), because if you use a path that occurs in areas other than where you want to focus, the script might return folders you don't want. Note that the drive letter is optional here if you've already entered it in the SourceDrv input box. If you want to produce a directory tree starting from the root of the source drive, leave the SourcePath input box blank.

Figure 1 shows example application input in which you're running a query on the local computer and you want to build a directory tree of the drivers folder on the C drive.

Figure 1: Querying the local computer’s C:\drivers folder

Figure 2 shows the results of the query.

Figure 2: Results of querying the local computer’s C:\drivers folder

Figure 3 shows example input in which you're requesting a more precise folder tree listing, starting at C:\drivers\video.

Figure 3: Querying the local computer’s C:\drivers\video folder

Figure 4 shows the output from this query.

Figure 4: Results of querying the local computer’s C:\drivers\video folder

Details

Because backslashes are escape characters in Windows Management Instrumentation (WMI) queries, I run the Replace function on the source path before I use it in a query, replacing every single backslash with a double backslash as follows:

SourcePath = Replace(SourcePath,"\","\\")

At the heart of the application is the WMI Win32_Directory query, which uses the SourceComputer, SourceDrv, and SourcePath variables. Note that a conditional statement checks to see whether SourcePath contains a value. The following code snippet gathers the primary directory structure data:

' Connect to the WMI service on the source computer
Set objWMIService = GetObject("winmgmts:" _
  & "{impersonationLevel=impersonate}!\\" & SourceComputer & "\root\cimv2")

' 48 = wbemFlagReturnImmediately (16) + wbemFlagForwardOnly (32)
If Trim(SourcePath) <> "" Then
  ' Limit the results to folders whose names contain SourcePath
  Set colFolders = objWMIService.ExecQuery _
    ("SELECT Name,Path,CSName FROM Win32_Directory " & _
    "WHERE Drive = '" & SourceDrv & "' AND Name Like '%" & _
    Replace(SourcePath,"'","\'") & "%'",,48)
Else
  ' No source path: return every folder on the drive
  Set colFolders = objWMIService.ExecQuery _
    ("SELECT Name,Path,CSName FROM Win32_Directory " & _
    "WHERE Drive = '" & SourceDrv & "'",,48)
End If

The main difference between the conditional branches is that if SourcePath contains a value, that value is used in the query; otherwise, just the source drive is used. The percent sign in the query represents a wildcard. The Replace function takes care of any folder names that contain an apostrophe character (which doesn't occur very often). Note that because apostrophes are part of the actual structure of the query statement, they must be escaped (i.e., preceded with a backslash) if they exist in the context of your search.
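As a rough sketch of the query-building logic described above, the following hypothetical helper (BuildDirQuery is my name for illustration; it isn't part of BuildDirTreeReport.vbs) assembles the query text for both branches. It assumes SourcePath still contains single backslashes, so it does the backslash doubling itself before escaping apostrophes.

```vbscript
' Hypothetical helper, not from the article's script: returns the WQL text
' for a drive and an optional source path.
Function BuildDirQuery(SourceDrv, SourcePath)
    Dim wql, escaped
    wql = "SELECT Name,Path,CSName FROM Win32_Directory " & _
          "WHERE Drive = '" & SourceDrv & "'"
    If Trim(SourcePath) <> "" Then
        ' Double the backslashes first, then escape apostrophes.
        ' Reversing the order would also double the backslash that
        ' escapes each apostrophe.
        escaped = Replace(SourcePath, "\", "\\")
        escaped = Replace(escaped, "'", "\'")
        wql = wql & " AND Name Like '%" & escaped & "%'"
    End If
    BuildDirQuery = wql
End Function

' Show the query text that would be passed to ExecQuery
WScript.Echo BuildDirQuery("C:", "\drivers\video")
```

Building the text in one place makes it easy to echo the query and inspect it before you run it against a large drive.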

A quick error check runs after this section of code. If an error is thrown, which indicates a failed connection or a bad query, the script terminates.
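The article's error-checking code isn't shown here, so the following is only a sketch of how such a check might look under Windows Script Host. ConnectWMI is a hypothetical wrapper, and the "." computer name (WMI's notation for the local machine) and the C: drive are assumptions for the example.

```vbscript
' Hypothetical wrapper: returns Nothing if the WMI connection fails.
Function ConnectWMI(computer)
    Dim svc
    Set svc = Nothing
    On Error Resume Next
    Set svc = GetObject("winmgmts:{impersonationLevel=impersonate}!\\" _
        & computer & "\root\cimv2")
    If Err.Number <> 0 Then
        Err.Clear
        Set svc = Nothing
    End If
    On Error GoTo 0
    Set ConnectWMI = svc
End Function

Dim objWMIService, colFolders
Set objWMIService = ConnectWMI(".")
If objWMIService Is Nothing Then
    WScript.Echo "Unable to connect to WMI; quitting."
    WScript.Quit 1
End If

' Trap errors raised while submitting the query
On Error Resume Next
Set colFolders = objWMIService.ExecQuery _
    ("SELECT Name,Path,CSName FROM Win32_Directory WHERE Drive = 'C:'",,48)
If Err.Number <> 0 Then
    WScript.Echo "Query failed: " & Err.Description
    WScript.Quit 1
End If
On Error GoTo 0
```

Because flag 48 asks WMI to return immediately, some query problems can surface only when the collection is first enumerated, so it may be worth keeping error handling active around the first pass through colFolders as well.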

If all is well, the next step of the process is to create the ADO database that will be used to store all the directory data. This database has just four fields: Folder, Path, Drive, and Computer—all of which are string type fields of variable length. (For more information about creating an ADO database, see the Learning Path.) The Folder field will store Win32_Directory "name" property data returned from the WMI query; Path will contain data from the "path" properties; Drive will contain the drive letter stored in the SourceDrv variable; and Computer will store the name of the computer. The last two fields aren't necessary to produce the report—I included them in case I needed to access the database outside of this process, so that I'd have all the information necessary to produce other reports.
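Listing 1 isn't reproduced here, so as a minimal sketch, this is one common way to build a four-field disconnected recordset like the one described above. The 255-character field size and the sample values are assumptions; the numeric constants are the standard ADO values for adVarChar and adUseClient.

```vbscript
Const adVarChar = 200      ' variable-length string field type
Const adUseClient = 3      ' client-side cursor (needed for Sort)
Const MaxCharacters = 255  ' assumed maximum field length

Dim DRS
Set DRS = CreateObject("ADODB.Recordset")
DRS.CursorLocation = adUseClient

' Define the four string fields before opening the recordset
DRS.Fields.Append "Folder", adVarChar, MaxCharacters
DRS.Fields.Append "Path", adVarChar, MaxCharacters
DRS.Fields.Append "Drive", adVarChar, MaxCharacters
DRS.Fields.Append "Computer", adVarChar, MaxCharacters

' Opening without a connection creates an empty in-memory recordset
DRS.Open

' Add one sample record (values are made up for illustration)
DRS.AddNew
DRS("Folder") = "drivers\video"
DRS("Path") = "\drivers\"
DRS("Drive") = "C:"
DRS("Computer") = "MYPC"
DRS.Update

WScript.Echo "Fields: " & DRS.Fields.Count & ", records: " & DRS.RecordCount
```

Because the recordset is never tied to a connection, everything lives in memory, which is part of why sorting and searching it is quick.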

Listing 1 shows a portion of the code to create the application.

After creating the empty database that Callout A shows, the process populates the database by cycling through the collection of directory object items returned from the Win32_Directory query, as Callout B shows. This section of code assigns directory item values to database fields. The following code snippet from Callout B strips off the drive letter and colon from the folder name:

FolderName = Right(objFolder.Name,Len(objFolder.Name) _
 - nInStr(objFolder.Name,"\",1))

I already have the drive information in a variable, and the way I set up the script's logic requires that the folder field contain only a folder name without the drive letter. To further explain, the name property in a Win32_Directory collection contains the full path of the folder, including the drive letter and a colon, and it doesn't end with a backslash. In contrast, the path property contains the path to the folder's parent, doesn't include the drive letter, and always starts and ends with a backslash (e.g., \myfolder\level1\sublevel1\data\); for a root-level folder, it's a single backslash. The script also contains a conditional test that issues a popup message and terminates the program if the query returns no records.

At Callout C, the first of two database sorting operations takes place. The initial sort (i.e., DRS.Sort = "Path ASC,Folder ASC") sorts by path, then folder, both in ascending order. This method lets you bypass any records that contain an empty path, as well as create a source path array. If you don't specify a source path in the SourcePath variable, the script assumes that you want to build a directory structure starting from the root of a drive. Sorting by path lets you store all the root-level folders in an array called RootArray simply by storing the folder name as an array element where the path field contains only a single backslash (which indicates that it's a root level folder). With just root-level folders in the array, you can easily find other folders in the database that begin with the root-level folder name and write them to a spreadsheet. In contrast, if a source path is provided, that value is stored to the RootArray array and is the only element in the array.

The code to manipulate the SourcePath variable into a useful string is a bit complicated, so let's examine the following code snippet to see what's happening. Basically we want to remove the leading slashes, replace any double slashes with single slashes, and store the results in the array.

slashstart = nInStr(SourcePath,"\",2)
TotalSlash = HowManyInString(SourcePath,"\")
slashend = nInStr(SourcePath,"\",(TotalSlash-1))
midsec = slashend - slashstart
SngSlashSrc = Replace(Mid(SourcePath,slashstart+1,midsec-1),"\\","\")
RootArray(j) = SngSlashSrc

I use a couple of homegrown functions here: nInStr and HowManyInString. The function nInStr in the first line of code helps find the position of the second occurrence of the backslash in the SourcePath variable; this function is useful to help avoid the leading backslashes. In the next line, the function HowManyInString helps determine how many backslashes are in the SourcePath variable. In the third line, a combination of both functions helps determine the position of the second-to-last backslash. We now have a means of extracting what I call the midsection of the SourcePath variable, which essentially eliminates the leading and ending backslashes. The last piece of manipulation involves replacing the midsection double backslashes with single backslashes and assigning that value to the array element. At the end of this conditional test we end up with either an array of root folder names or an array with just a single folder name in it.
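The two homegrown functions aren't listed in this article, so here is a guess at how they might be written, inferred only from how they're called above, followed by a worked run of the midsection extraction. The sample path \drivers\video\ (with a leading and a trailing backslash) is an assumption.

```vbscript
' Assumed behavior: position of the nth occurrence of strFind in strText,
' or 0 if there are fewer than nth occurrences.
Function nInStr(strText, strFind, nth)
    Dim pos, i
    pos = 0
    For i = 1 To nth
        pos = InStr(pos + 1, strText, strFind)
        If pos = 0 Then Exit For
    Next
    nInStr = pos
End Function

' Assumed behavior: number of times strFind occurs in strText.
Function HowManyInString(strText, strFind)
    If Len(strFind) = 0 Then
        HowManyInString = 0
    Else
        HowManyInString = (Len(strText) - _
            Len(Replace(strText, strFind, ""))) \ Len(strFind)
    End If
End Function

Dim SourcePath, slashstart, TotalSlash, slashend, midsec, SngSlashSrc
SourcePath = Replace("\drivers\video\", "\", "\\")  ' \\drivers\\video\\
slashstart = nInStr(SourcePath, "\", 2)             ' 2
TotalSlash = HowManyInString(SourcePath, "\")       ' 6
slashend = nInStr(SourcePath, "\", TotalSlash - 1)  ' 17
midsec = slashend - slashstart                      ' 15
SngSlashSrc = Replace(Mid(SourcePath, slashstart + 1, midsec - 1), "\\", "\")
WScript.Echo SngSlashSrc                            ' drivers\video
```

Tracing the numbers this way shows why the second and the second-to-last backslashes are the anchors: they mark the inner edges of the doubled leading and trailing backslashes.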

Now is when you can really see the benefits of having the data in an ADO-type database. You can change the database's sort order on the fly and sort the database by folder and path. I use the following line of code to change the sort order:

DRS.Sort = "Folder ASC,Path ASC"

Now you can simply cycle through your array, use the Find method of the ADO database to find the first record that matches the array element value, and cycle through your database while the database path is equal to the value contained in the array element. The Find statement looks like this:

DRS.Find("Path = '\" & Replace(rootEle,"'","''") & "\'")

The script adds a backslash to the beginning and end of the array value (which is called rootEle), because the path will always start and end with backslashes. The script also replaces each apostrophe with two apostrophes. In ADO search criteria, apostrophes delimit string values, so an apostrophe inside a folder name must be doubled to be treated as a literal character.

If a record is found that matches the array element value, the final phase of the reporting process begins. The Do While statement within Callout D writes records to the spreadsheet as long as the left portion of the database path values contains the same value as the array element. This action lets the program return all folders that start with the array element value regardless of how deeply nested the subfolders are.
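To make the sort, Find, and Do While sequence concrete, here is a small self-contained sketch that uses a four-row in-memory recordset instead of real WMI data. The folder names, the AddRow helper, and the found variable are made up for illustration; the loop condition follows the description above rather than the actual code at Callout D.

```vbscript
Const adVarChar = 200, adUseClient = 3

Dim DRS, rootEle, prefix, found
Set DRS = CreateObject("ADODB.Recordset")
DRS.CursorLocation = adUseClient
DRS.Fields.Append "Folder", adVarChar, 255
DRS.Fields.Append "Path", adVarChar, 255
DRS.Open

' Sample data: Folder is the name without the drive, Path is the parent
AddRow "temp", "\"
AddRow "drivers\video\ati", "\drivers\video\"
AddRow "drivers", "\"
AddRow "drivers\video", "\drivers\"

DRS.Sort = "Folder ASC,Path ASC"

rootEle = "drivers"
prefix = "\" & rootEle & "\"
found = ""

' Jump to the first record whose path matches the root element
DRS.MoveFirst
DRS.Find "Path = '" & Replace(prefix, "'", "''") & "'"

' Keep reading while the path still begins with the root element
Do While Not DRS.EOF
    If Left(DRS("Path").Value, Len(prefix)) <> prefix Then Exit Do
    If found <> "" Then found = found & ";"
    found = found & DRS("Folder").Value
    DRS.MoveNext
Loop
WScript.Echo found   ' drivers\video;drivers\video\ati

' Hypothetical helper that appends one record
Sub AddRow(folder, path)
    DRS.AddNew
    DRS("Folder") = folder
    DRS("Path") = path
    DRS.Update
End Sub
```

In the real script, the body of the loop would write each folder to the spreadsheet instead of collecting names in a string.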

The next piece of code involves a conditional statement that lets you simplify the way you indent the folders when you write them to the spreadsheet. Although it might look complicated, the script is simply counting the number of backslashes contained in the array element and using that information to tell the program which spreadsheet column to write the folder name in. If the array element value contains no backslashes (which indicates that SourcePath=""), then the script uses the database folder field contents to determine the indent. This approach results in data that has the look of an indented directory tree, with all of its folders and subfolders. The process continues to cycle through the array until all the elements have been evaluated and all the associated database records have been written to the Excel spreadsheet.
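The indent logic can be sketched as a simple mapping from backslash count to column number. This is a simplification of what's described above: the hypothetical IndentColumn function counts backslashes in the folder name itself and adds 1, which is my assumption about the offset, and the sample folders are made up. The sketch needs Excel installed, as the article's script does.

```vbscript
' Hypothetical helper: one column per level of nesting.
' A root-level folder (no backslashes) lands in column 1.
Function IndentColumn(folderName)
    IndentColumn = (Len(folderName) - _
        Len(Replace(folderName, "\", ""))) + 1
End Function

Dim objExcel, objSheet, folders, f, row
Set objExcel = CreateObject("Excel.Application")
objExcel.Visible = True
Set objSheet = objExcel.Workbooks.Add().Worksheets(1)

folders = Array("drivers", "drivers\video", "drivers\video\ati")
row = 1
For Each f In folders
    ' Write the full path, shifted right by the folder's depth
    objSheet.Cells(row, IndentColumn(f)).Value = "C:\" & f
    row = row + 1
Next
```

Each folder appears one column to the right of its parent, which gives the spreadsheet the look of an indented tree while every cell still holds a complete path.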

Next, the database is written to "C:\Temp\Tree-" & cname & ".xml" and closed, and the spreadsheet is displayed. If you don't have a C:\Temp folder, or you want the database saved elsewhere, you need to modify the path in the script.
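Saving a recordset as XML is done with the ADO Save method. The sketch below is a minimal illustration rather than the script's actual code: it assumes cname holds the local computer name, and it creates C:\Temp and deletes an earlier copy of the file, which you might not want in production.

```vbscript
Const adVarChar = 200, adUseClient = 3
Const adPersistXML = 1   ' standard ADO value for XML persistence

Dim DRS, fso, cname, xmlPath
cname = CreateObject("WScript.Network").ComputerName  ' assumed meaning of cname
xmlPath = "C:\Temp\Tree-" & cname & ".xml"

Set fso = CreateObject("Scripting.FileSystemObject")
If Not fso.FolderExists("C:\Temp") Then fso.CreateFolder "C:\Temp"
' Save raises an error if the target file already exists
If fso.FileExists(xmlPath) Then fso.DeleteFile xmlPath

' Tiny sample recordset standing in for the directory database
Set DRS = CreateObject("ADODB.Recordset")
DRS.CursorLocation = adUseClient
DRS.Fields.Append "Folder", adVarChar, 255
DRS.Fields.Append "Path", adVarChar, 255
DRS.Open
DRS.AddNew
DRS("Folder") = "drivers"
DRS("Path") = "\"
DRS.Update

DRS.Save xmlPath, adPersistXML
DRS.Close
WScript.Echo "Saved " & xmlPath
```

A saved file can later be reloaded into a new recordset by passing the file name to the Open method, which is handy if you want to produce other reports from the same data.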

If you happen to enter a source path that can't be found but that exists within another folder structure, the application will produce a list of folders that contain the name you entered in the SourcePath input box. If you find the folder you intended to enter, you can simply copy it and paste it into the SourcePath input box, then run the app again. Figure 5 shows a request for a directory tree for the scripts folder in the D drive.

Figure 5: Querying the local computer’s D:\scripts folder

Figure 6 shows output indicating that the scripts folder wasn't found, but also shows all the folders that contain the word scripts at the beginning of the folder name.

Figure 6: Output indicating that the D:\scripts folder wasn’t found, and showing folders that start with scripts

Figure 7 shows the results of rerunning the query after creating a scripts folder in the D drive and adding several subfolders under that folder. Note that the script can't expand a query's scope. For example, if you search for “pictures” but you have only a folder named “my pictures,” the query won't return “my pictures.”

Figure 7: Results of rerunning the query after creating a scripts folder in the D drive and adding several subfolders

Maintain Your Structure

My solution is useful for documenting your servers after you complete your final build or after new applications are installed. The directory tree report that you build will provide a good foundation for maintaining your directory structure. In addition, you can highlight key folders and make notes that will come in handy down the line when you've forgotten some of what you did.