The tab character has a long history in computing. Tabs were introduced in typewriters, where typists could specify one or more tab stops on the page. Pressing the Tab key would advance the carriage to the next tab stop. In ASCII code on computers, character 9 is designated as the tab. When displaying a tab character in a teletype-like display (e.g., UNIX terminal, Windows console program), the computer will advance the cursor to the next column that's a multiple of eight, where the count starts at column 0. For example, if the cursor is in any column from column 0 through column 7, a tab will advance the cursor to column 8 (which is really the ninth column because the computer is counting from column 0).
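The column arithmetic above can be sketched in a few lines of PowerShell. Get-NextTabStop is a hypothetical helper written only for this illustration; it isn't part of Windows or of the function presented later, and it assumes zero-based columns with evenly spaced tab stops.

```powershell
# Hypothetical helper: returns the zero-based column the cursor lands on
# when a tab is printed at zero-based column $Column.
function Get-NextTabStop {
  param([int] $Column, [int] $TabWidth = 8)
  # Step forward by the distance remaining to the next multiple of $TabWidth.
  $Column + ($TabWidth - ($Column % $TabWidth))
}

Get-NextTabStop 0    # 8
Get-NextTabStop 7    # 8
Get-NextTabStop 8    # 16
```

Note that a tab printed exactly on a tab stop (column 8 here) still moves the cursor a full eight columns.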

Tab characters are also used in other ways in computers. For example, various database and spreadsheet tools let you output data in tab-separated values (TSV) format, where tab characters separate the data items in each row. In addition, scripters and programmers have long debated amongst themselves about whether they should indent code using tabs or spaces. Both techniques have their advantages, but one thing is for sure: You can't tell whether a file contains spaces or tab characters using the Cmd.exe Type command, the Windows PowerShell Get-Content cmdlet, or Notepad because the tabs will appear as spaces.
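If you only need to find out whether text contains tabs, one possible approach is to test each line with a regular expression or to swap the tabs for a visible marker. The following is a minimal sketch that uses two in-memory sample strings (made up for this illustration); with a real file, you would pipe the output of Get-Content instead.

```powershell
# Two sample lines that look alike on screen: the first starts with a tab,
# the second with eight spaces. "`t" is PowerShell's escape for a tab.
$lines = "`tindented with a tab", "        indented with spaces"

# -match tests a string against a regular expression; \t matches a tab.
$hasTab = $lines | ForEach-Object { $_ -match "\t" }
$hasTab      # True, then False

# Make the tabs visible by replacing each one with a printable marker.
$visible = $lines | ForEach-Object { $_ -replace "\t", "<TAB>" }
$visible
```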

To prevent confusion, it's often helpful to "de-tab" the contents of a file—that is, expand the tabs to the correct number of spaces. I like to do this for text files in which the tab characters are used for indenting, such as scripts, XML files, and HTML files. Although the More.com program in Windows can expand tabs to spaces, I created a native PowerShell function named Expand-Tab to perform this task so that I could take better advantage of PowerShell's pipeline.

Introducing the Expand-Tab Function

Listing 1 shows the short but handy Expand-Tab function.

Listing 1: The Expand-Tab Function
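function Expand-Tab {
  param([UInt32] $TabWidth = 8)
  process {
    $line = $_
    # Expand one tab per pass until none remain
    while ( $TRUE ) {
      # Position of the first remaining tab (ASCII 9), or -1 if none
      $i = $line.IndexOf([Char] 9)
      if ( $i -eq -1 ) { break }
      if ( $TabWidth -gt 0 ) {
        # Spaces needed to reach the next tab stop
        $pad = " " * ($TabWidth - ($i % $TabWidth))
      } else {
        # A width of 0 removes the tab without adding spaces
        $pad = ""
      }
      # Replace the tab at position $i with the padding
      $line = $line -replace "^([^\t]{$i})\t(.*)$",
        "`$1$pad`$2"
    }
    $line
  }
}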

For each line of input it receives, the function uses a regular expression to output the line with the tab characters replaced by the appropriate number of spaces. You can specify the tab width, that is, the distance between tab stops (8 by default), or 0 if you want to remove the tab characters altogether. Let's take a look at how this works.

The Expand-Tab function uses a process script block to handle each line of input it receives. First, the function assigns each input line (i.e., $_) to the $line variable. Then, it uses a while loop that repeats until $line no longer contains any tab characters. On each pass, the $i variable receives the zero-based position of the first tab character in the string. If $i is -1 (i.e., no tab character), the function uses the break statement to exit from the while loop.

Next, the function checks whether $TabWidth is greater than 0. If it is, the function creates a string, $pad, that contains the needed number of spaces using PowerShell's * operator. In PowerShell, string * n means "output string concatenated n times," so $pad will contain $TabWidth - ($i % $TabWidth) spaces. If $TabWidth is 0, $pad is set to "" (i.e., an empty string).
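To see the padding arithmetic with concrete numbers, here's a small sketch that evaluates the same expression for a few tab positions. The sample positions are arbitrary, and $TabWidth is set to the function's default of 8.

```powershell
$TabWidth = 8

# For a tab found at each zero-based position $i, build the padding string
# the same way the function does and record how long it is.
$padLengths = foreach ($i in 0, 3, 7, 8, 12) {
  $pad = " " * ($TabWidth - ($i % $TabWidth))
  $pad.Length
}

$padLengths -join ", "   # 8, 5, 1, 8, 4
```

A tab at position 3 needs five spaces to reach column 8, whereas a tab at position 12 needs only four to reach column 16.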

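Finally, the function uses the -replace operator, which uses a regular expression to produce a copy of $line with the first remaining tab character replaced by $pad (i.e., the calculated number of spaces). Because any tabs to the left have already been expanded on earlier passes, $i is also the column at which the tab would be displayed. Table 1 explains the components of the regular expression.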

Table 1: Regular Expression Components

| Pattern | Meaning |
| --- | --- |
| ^ | Match the beginning of the string |
| ([^\t]{$i}) | Any character other than a tab, exactly $i times; ( ) = first group (i.e., $1 in the replacement string) |
| \t | A tab character |
| (.*) | Any character, 0 or more times; ( ) = second group (i.e., $2 in the replacement string) |
| $ | Match the end of the string |
| `$1 | Replace with first group |
| $pad | Replace with calculated number of spaces |
| `$2 | Replace with second group |

The backtick (`) character is needed in the replacement expression to prevent PowerShell from interpreting $1 or $2 as a variable name.
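As an illustration, the following sketch runs a single replacement pass by hand on a made-up sample string, using the default tab width of 8. It shows what the pattern and replacement look like after PowerShell expands $i and $pad, and why the loop needs another pass when more than one tab is present.

```powershell
# Sample string with two tabs ("`t" is PowerShell's tab escape).
$line = "ab`tcd`tef"
$i    = $line.IndexOf([Char] 9)    # 2: position of the first tab
$pad  = " " * (8 - ($i % 8))       # 6 spaces, enough to reach column 8

# After variable expansion, the pattern is ^([^\t]{2})\t(.*)$ and the
# replacement is $1, six spaces, then $2.
$once = $line -replace "^([^\t]{$i})\t(.*)$", "`$1$pad`$2"

# Only the first tab has been expanded; the second one now sits at
# position 10 and would be handled by the next pass of the loop.
$once.IndexOf([Char] 9)            # 10
```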

Using the Expand-Tab Function

I put the Expand-Tab function in my PowerShell profile so that it's always available. Here's an example of how to use it:

Get-Content t1.ps1 | Expand-Tab | Out-File t2.ps1

This command will get the contents of the t1.ps1 file, expand each tab to the next eight-column tab stop (that is, to between one and eight spaces), and save the "de-tabbed" contents in a file named t2.ps1. If the default tab width of eight is too wide, you can specify a different tab width. For example, if you prefer a tab width of two, you'd use the command:

Get-Content t1.txt | Expand-Tab 2 | Out-File t2.txt

Note that you don't need to specify the -TabWidth parameter name. Because -TabWidth is the function's first (and only) declared parameter, PowerShell binds the value to it by position.

Take Control of Your Tabs

By adding the Expand-Tab function to your PowerShell profile, you'll no longer have to worry about whether your text files contain spaces or tabs.