
Most software is distributed electronically, and the larger the download, the greater the risk that data is corrupted in transfer. It's therefore very useful to be able to verify the integrity of downloaded files. Cryptographic hashing algorithms provide one way to do this. A hashing algorithm takes a series of bytes (such as the bytes of a file), performs a calculation using those bytes, and produces an output value of a fixed size, regardless of the size of the input. The goal of these algorithms is that it should be impractical to find two different inputs that produce the same output. Two common hashing algorithms are the Message Digest 5 Algorithm (MD5), which produces a 128-bit value, and Secure Hash Algorithm-1 (SHA1), which produces a 160-bit value. Both algorithms have been shown to contain flaws (i.e., it's possible for two different inputs to produce the same output), but they're robust enough to verify file integrity in the vast majority of cases.
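
As a rough illustration of the fixed-size output, the following sketch hashes a short byte sequence with both algorithms by calling the .NET Framework directly. The Get-BytesHash function name is hypothetical and isn't part of Get-FileHash.ps1.

```powershell
# Hypothetical helper (not part of Get-FileHash.ps1): hashes a byte array
# and returns the result as a lowercase hexadecimal string.
function Get-BytesHash {
  param([Byte[]] $Bytes, [String] $HashType = "MD5")
  switch ($HashType) {
    "MD5"   { $algorithm = [System.Security.Cryptography.MD5]::Create() }
    "SHA1"  { $algorithm = [System.Security.Cryptography.SHA1]::Create() }
    default { throw "HashType must be MD5 or SHA1." }
  }
  $hash = $algorithm.ComputeHash($Bytes)
  # BitConverter produces "90-01-50-..."; remove the dashes.
  ([System.BitConverter]::ToString($hash) -replace "-", "").ToLower()
}

$bytes = [System.Text.Encoding]::ASCII.GetBytes("abc")
$md5  = Get-BytesHash $bytes "MD5"    # 900150983cd24fb0d6963f7d28e17f72
$sha1 = Get-BytesHash $bytes "SHA1"   # a9993e364706816aba3e25717850c26c9cd0d89d
"MD5  ($($md5.Length * 4) bits): $md5"
"SHA1 ($($sha1.Length * 4) bits): $sha1"
```

Each hexadecimal character represents 4 bits, so the MD5 string is always 32 characters long and the SHA1 string is always 40 characters long, whatever the size of the input.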

Figure 1: An SHA1 hash value for an .iso file

Figure 1 and Figure 2 show practical examples of hash values. Figure 1 shows an SHA1 hash value for an .iso file on Microsoft TechNet. Figure 2 shows two MD5 hash values for OpenOffice.org installers. If you download these files, you can calculate the SHA1 or MD5 hash values to verify whether the files downloaded without any data corruption.
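
Verification comes down to computing the hash of the downloaded file and comparing it with the published value. The sketch below shows that comparison under a few assumptions: the Test-PublishedHash function name is hypothetical, and a temporary file containing the text abc stands in for a real download.

```powershell
# Hypothetical helper: returns $TRUE when a file's SHA1 hash matches
# the published hexadecimal value.
function Test-PublishedHash {
  param([String] $LiteralPath, [String] $PublishedHash)
  $sha1 = [System.Security.Cryptography.SHA1]::Create()
  $stream = (Get-Item -LiteralPath $LiteralPath).OpenRead()
  try {
    $hash = $sha1.ComputeHash($stream)
  }
  finally {
    $stream.Close()
  }
  $hex = [System.BitConverter]::ToString($hash) -replace "-", ""
  # -eq ignores case for strings, so upper- and lowercase hex both match.
  $hex -eq $PublishedHash.Trim()
}

# A temporary file stands in for the downloaded file.
$demoFile = [System.IO.Path]::GetTempFileName()
[System.IO.File]::WriteAllBytes($demoFile,
  [System.Text.Encoding]::ASCII.GetBytes("abc"))

$good = Test-PublishedHash $demoFile "a9993e364706816aba3e25717850c26c9cd0d89d"
$bad  = Test-PublishedHash $demoFile ("0" * 40)
Remove-Item -LiteralPath $demoFile
"Matching hash: $good; wrong hash: $bad"
```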

Figure 2: MD5 hash values for OpenOffice.org installers

Introducing Get-FileHash.ps1

PowerShell 2.0 doesn't provide a cmdlet to calculate hash values for files, so I wrote a Windows PowerShell script, Get-FileHash.ps1, that calculates MD5 or SHA1 hash values for files using the Microsoft .NET Framework. The script requires PowerShell 2.0 or later. You can download it by going to www.windowsitpro.com, entering 139518 in the InstantDoc ID text box, and clicking the 139518.zip hotlink. I recommend placing the Get-FileHash.ps1 file in a directory in your path.

To run the script, use the syntax

Get-FileHash [-Path] <String[]> [-HashType <String>]

or

Get-FileHash -LiteralPath <String[]> [-HashType <String>]

The -Path parameter specifies one or more files for which you want to output a hash value; the parameter name itself is optional. Wildcards are permitted. The script also accepts pipeline input in place of the -Path parameter.

If you want to specify a filename that contains characters that PowerShell normally interprets as wildcard characters (e.g., the square bracket characters [ and ]), use the -LiteralPath parameter with one or more filenames. If you use -LiteralPath, you can't use wildcards and the script will ignore pipeline input. Note that the -Path and -LiteralPath parameters are mutually exclusive.
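
The difference between the two parameters is easiest to see with the built-in Get-Item cmdlet, which has the same pair of parameters. The following sketch creates a throwaway file whose name contains square brackets (the folder and filename are made up for the demonstration) and retrieves it both ways.

```powershell
# Create a temporary folder containing a file with brackets in its name.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
$full = Join-Path $dir "setup[1].txt"
[System.IO.File]::WriteAllText($full, "demo")

# -Path treats [1] as a wildcard set that matches the single character 1,
# so this lookup isn't expected to find the file.
$byPath = Get-Item -Path $full -ErrorAction SilentlyContinue

# -LiteralPath takes the name exactly as typed.
$byLiteral = Get-Item -LiteralPath $full

"Found with -Path:        $($byPath -ne $NULL)"
"Found with -LiteralPath: $($byLiteral -ne $NULL)"

Remove-Item -LiteralPath $dir -Recurse
```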

The -HashType parameter's value must be the string MD5 or SHA1. If you omit -HashType, MD5 is the default.

Get-FileHash.ps1 outputs objects containing each file's path and its MD5 or SHA1 hash value. Figure 3 shows a sample command and its output. In this command, the filenames are being provided through pipeline input.

Figure 3: Sample command and its output

Understanding the Script

Get-FileHash.ps1 uses two features introduced in PowerShell 2.0: comment-based help and advanced function parameters. Comment-based help enables the Get-Help cmdlet to display help information for the script. Advanced function parameters allow the script to behave like a cmdlet.

Comment-based help is a series of comment lines (lines beginning with #) or a comment block (text enclosed between <# and #>) that contains special keywords that PowerShell uses to generate help information. If you use the command

Get-Help Get-FileHash

PowerShell uses the special keywords (e.g., .SYNOPSIS, .DESCRIPTION, .PARAMETER) to generate the help text. Comment-based help is a great addition to PowerShell 2.0 that makes it very easy to self-comment functions and scripts. Run the command

Get-Help about_Comment_Based_Help

at a PowerShell prompt for more information about how to use comment-based help.
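
To give a feel for the format, here is a minimal sketch of comment-based help applied to a function rather than a script. The Get-Greeting function and its help text are invented for the example and have nothing to do with Get-FileHash.ps1.

```powershell
function Get-Greeting {
  <#
  .SYNOPSIS
  Returns a greeting for the specified name.
  .DESCRIPTION
  A made-up function that only demonstrates comment-based help keywords.
  .PARAMETER Name
  The name to include in the greeting.
  #>
  param([String] $Name = "world")
  "Hello, $Name"
}

# Get-Help reads the keywords from the comment block.
$help = Get-Help Get-Greeting
$help.Synopsis
```

The same keywords work in a comment block at the top of a script file, which is where a script such as Get-FileHash.ps1 would keep them.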

Advanced parameters cause PowerShell to use cmdlet-like rules for parsing the script's command-line parameters. Get-FileHash.ps1 uses parameter sets, which enable the script to accept mutually exclusive parameters.

Listing 1 shows the script's CmdletBinding attribute and param statement. CmdletBinding enables cmdlet-like behavior for the script's parameters and specifies the default parameter set. The param statement contains three parameters, which are declared with Parameter statements. Each Parameter statement includes attributes that establish the parameter’s behavior. The attributes are as follows:

  • ParameterSetName="Name": Specifies the parameter set to which the parameter belongs (either Path or LiteralPath). If a parameter doesn’t specify a parameter set, it’s valid for any parameter set. The ParameterSetName property of the $PSCmdlet object contains the current parameter set name.
  • Position=n: Specifies the parameter's position when its name is omitted on the command line. Position=0 means the first unnamed argument is bound to the parameter, Position=1 means the second unnamed argument is bound to it, and so forth.
  • Mandatory=$TRUE: Specifies that the parameter is required. If the parameter isn’t specified, PowerShell will prompt for input for the parameter.
  • ValueFromPipeline=$TRUE: Specifies that the parameter's input can come from the pipeline.
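
A small sketch can show how these attributes interact. The Show-ParameterSet function below is hypothetical; it borrows the -Path and -LiteralPath declarations from Listing 1 and simply reports which parameter set PowerShell selected.

```powershell
function Show-ParameterSet {
  [CmdletBinding(DefaultParameterSetName="Path")]
  param(
    [Parameter(ParameterSetName="Path",Position=0,Mandatory=$TRUE,
      ValueFromPipeline=$TRUE)]
      [String[]] $Path,
    [Parameter(ParameterSetName="LiteralPath",Position=0,Mandatory=$TRUE)]
      [String[]] $LiteralPath
  )
  process {
    # $PSCmdlet.ParameterSetName holds the name of the active parameter set.
    $PSCmdlet.ParameterSetName
  }
}

$positional = Show-ParameterSet "a.txt"                  # Path
$literal    = Show-ParameterSet -LiteralPath "a[1].txt"  # LiteralPath
$piped      = "a.txt" | Show-ParameterSet                # Path
"$positional, $literal, $piped"
```

The unnamed argument in the first call could belong to either parameter set, so PowerShell falls back on the default parameter set named in the CmdletBinding attribute.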


For more information about these attributes, run these commands at a PowerShell prompt:

Get-Help about_Functions_Advanced

Get-Help about_Functions_Advanced_Parameters

Get-Help about_Functions_CmdletBindingAttribute

After the param statement, Get-FileHash.ps1 uses the begin and process scriptblocks to carry out the script's cmdlet-like behavior. The begin scriptblock executes once before the pipeline processing, and the process scriptblock executes once for each pipeline item. If there is no pipeline input, the begin and process scriptblocks each execute once.
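
The execution order is easy to confirm with a counter. The following sketch uses a hypothetical Measure-Blocks function that counts how many times its begin and process scriptblocks run and reports the totals from an end scriptblock.

```powershell
function Measure-Blocks {
  [CmdletBinding()]
  param(
    [Parameter(ValueFromPipeline=$TRUE)]
      $InputObject
  )
  begin {
    $beginCount = 0
    $processCount = 0
    $beginCount++
  }
  process {
    $processCount++
  }
  end {
    "begin=$beginCount process=$processCount"
  }
}

$piped  = "a","b","c" | Measure-Blocks      # begin=1 process=3
$direct = Measure-Blocks -InputObject "a"   # begin=1 process=1
$piped
$direct
```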

Inside the begin scriptblock, the script validates that the -HashType parameter is either MD5 or SHA1 and creates the $Provider variable, which contains the .NET cryptography object that computes file hashes. Next, the script determines whether the -Path parameter appears on the command line and whether it’s bound. If the -Path parameter is present but not bound, the script assumes the input will be coming from the pipeline and sets the $PIPELINEINPUT variable to true.
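
One common way to make this kind of check, shown here as an assumption rather than as the script's exact code, relies on the fact that PowerShell hasn't yet bound pipeline input when the begin scriptblock runs. The Test-PipelineInput function and its $pipelineInput variable are hypothetical stand-ins.

```powershell
function Test-PipelineInput {
  [CmdletBinding()]
  param(
    [Parameter(Position=0,Mandatory=$TRUE,ValueFromPipeline=$TRUE)]
      [String[]] $Path
  )
  begin {
    # In the begin scriptblock, -Path is bound only if it was supplied
    # on the command line; pipeline values arrive later, in process.
    $pipelineInput = -not $PSBoundParameters.ContainsKey("Path")
  }
  process {
    foreach ($item in $Path) {
      New-Object PSObject -Property @{
        Item = $item
        FromPipeline = $pipelineInput
      }
    }
  }
}

$piped  = "a.txt","b.txt" | Test-PipelineInput
$direct = Test-PipelineInput -Path "a.txt","b.txt"
$piped
$direct
```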

The begin scriptblock also contains the get-filehash2 function, which is really the workhorse function of the script. I’ll describe the get-filehash2 function in a moment.

Inside the process scriptblock, the script checks to see whether the Path parameter set is active (i.e., the -Path parameter was used). If the Path parameter set is active, the script checks the $PIPELINEINPUT variable’s value to determine whether it should take input from the pipeline or from the content of the -Path parameter. If there is pipeline input, the script executes the get-filehash2 function for each input object. If there is no pipeline input, the script uses the Get-Item and ForEach-Object cmdlets to send input to the get-filehash2 function.

If the Path parameter set isn’t active (meaning LiteralPath is the active parameter set), the script uses the Get-Item cmdlet with its -LiteralPath parameter to retrieve the file. If the Get-Item cmdlet succeeds (that is, the $file variable isn’t empty), the script passes the $file variable as a parameter to the get-filehash2 function.

 

The get-filehash2 Function

As I mentioned previously, the get-filehash2 function, shown in Listing 2, is the workhorse function of the script. It performs three tasks:

  1. It validates whether the $file parameter’s value is really a file. This is necessary because PowerShell paths can refer to items other than files, such as registry subkeys and directories.
  2. It calculates the file's MD5 or SHA1 hash value. The function calls the cryptographic provider's ComputeHash method, which calculates a hash value based on a stream of bytes (in this case, the contents of a file). The method returns the result as an array of bytes, so the function uses a .NET StringBuilder object to build a hexadecimal string from those bytes.
  3. It outputs a custom object containing the file's full name and its hash value. The function uses the Select-Object cmdlet to output this custom object.
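
To see why the first task matters, consider what Get-Item returns for different kinds of paths. This sketch, which uses a temporary file and the temporary folder purely as convenient examples, applies the same kind of type test to a file and to a directory.

```powershell
# A temporary file and the folder that contains it serve as test items.
$tempFile = [System.IO.Path]::GetTempFileName()
$fileItem = Get-Item -LiteralPath $tempFile
$dirItem  = Get-Item -LiteralPath ([System.IO.Path]::GetTempPath())

# Files come back as FileInfo objects; directories come back as
# DirectoryInfo objects, which have no OpenRead method to hash from.
$fileIsFile = $fileItem -is [System.IO.FileInfo]
$dirIsFile  = $dirItem -is [System.IO.FileInfo]
"File is FileInfo:      $fileIsFile"
"Directory is FileInfo: $dirIsFile"

Remove-Item -LiteralPath $tempFile
```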

File Hashing Made Easy

Get-FileHash.ps1 places the power of the .NET Framework's MD5 and SHA1 file hashing algorithms at your fingertips. With Get-FileHash.ps1, you're no longer bereft of an easy-to-use tool for calculating MD5 and SHA1 file hashes from the PowerShell command line.


Listing 1: The CmdletBinding Attribute and param Statement

[CmdletBinding(DefaultParameterSetName="Path")]
param(
  [Parameter(ParameterSetName="Path",Position=0,Mandatory=$TRUE,
    ValueFromPipeline=$TRUE)]
    [String[]] $Path,
  [Parameter(ParameterSetName="LiteralPath",Position=0,Mandatory=$TRUE)]
    [String[]] $LiteralPath,
  [Parameter(Position=1)]
    [String] $HashType="MD5"
)

Listing 2: The get-filehash2 Function

# Returns an object containing the file's path and its hash as a hexadecimal string.
# The Provider object must have a ComputeHash method that returns an array of bytes.
function get-filehash2($file) {
  if ($file -isnot [System.IO.FileInfo]) {
    write-error "'$($file)' is not a file."
    return
  }
  $hashstring = new-object System.Text.StringBuilder
  $stream = $file.OpenRead()
  if ($stream) {
    # Close the stream even if ComputeHash throws an error.
    try {
      foreach ($byte in $Provider.ComputeHash($stream)) {
        [Void] $hashstring.Append($byte.ToString("X2"))
      }
    }
    finally {
      $stream.Close()
    }
  }
  "" | select-object @{Name="Path"; Expression={$file.FullName}},
    @{Name="$($Provider.GetType().BaseType.Name) Hash";
      Expression={$hashstring.ToString()}}
}