Contributing author Kent Empie combines a VB
CGI program with HTTP File Upload to securely transfer files
[Editor's Note: VB Solutions is about using Visual Basic (VB) to build a
variety of solutions to specific business problems. This column doesn't teach
you how to write VB, but how to use VB as a tool to provide quick,
easy-to-implement solutions that you can use right away.]
Many organizations need the capability to upload files from a browser to a
Web server. Although adding an FTP server can solve this problem, an FTP server
introduces extra security risks and administrative tasks. Opening up an FTP port
to the world increases your risk of unauthorized access from hackers because FTP
doesn't encrypt the user ID, password, or content of the file. In addition, the
FTP server and the Web server use two separate databases, which complicates
administration. This article, contributed by Kent Empie, presents an alternative
to FTP that solves the problem of secure file uploads using your existing NT Web
server and a Visual Basic (VB) implementation of the Common Gateway Interface
(CGI). Using a VB CGI program in combination with HTTP File Upload, you can
securely transfer files from a Web browser to your Web server.
An Overview of HTTP File Upload
Netscape first implemented HTTP File Upload in Navigator 2.0 in early 1996.
Since then, Microsoft has implemented it in Internet Explorer (IE) 3.02a and IE
4.0. HTTP File Upload lets the browser accept a filename in a text input field.
Screen 1 shows a typical HTTP File Upload form that an application
might present to a user.
To the right of the File Name input field, a Browse option lets the user
find a file via a standard File, Open dialog box. For security reasons (e.g.,
Web sites uploading files from machines without the user knowing it), the File
Name field cannot be hidden, nor can it contain a default filename. Once the
user clicks Upload File to submit the form, the contents of the file transfer to
the Web server.
Typically, an application that uses HTTP File Upload next displays a screen
that notifies the user whether the file transfer was successful. Screen 2 shows an example user notification screen for a successful upload. In this
example, the application notifies the user, displays the file name and size, and
prompts the user with a screen that captures information so that a search engine
can index the file. This example is just one type of application that you can
build with the HTTP File Upload capability.
Now that you've seen how HTTP File Upload looks to the end user, let's take
a look at the underlying components that make up the upload process. Screen 1 presents an overview of the HTTP File Upload process.
To begin the upload, the user first browses to a Web page on the Internet or
a corporate intranet. (If you use HTTP File Upload over the Internet, you need
to perform user authentication at this point.) As you saw in the example in
Screen 1, the Web page includes a form to select a file on the user's local
machine. The user enters a filename or browses to select a file from a local
directory. Next, the user clicks the form's submit button (Upload File in
Screen 1), which sends the contents of the form to the Web server. After the user
clicks the submit button, the browser begins reading the selected file. The
browser encodes the upload file as a multipart file type; that is, the browser
encodes the file with special boundaries in much the same way as mail programs
encode MIME files sent as attachments in mail messages. Once the Web server
receives the posted data, the Web server calls a custom CGI program (e.g., a VB
CGI program) that decodes the file and saves it to disk. The Web server invokes
the appropriate CGI program based on the name that's part of the form's POST
syntax. (For more information about the HTTP File Upload specifications, see the
sidebar, "Background on HTTP File Upload.")
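To make the multipart encoding concrete, here is a minimal Python sketch of what a browser puts on the wire for one file field. It is an illustration of the format only, not the article's VB code. The field name, boundary string, and function name are hypothetical, and the part's content type is assumed to be application/octet-stream.

```python
def build_multipart_body(field_name, filename, content, boundary):
    """Encode a single file field as a multipart/form-data request body."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"  # a blank line separates the part headers from the file bytes
    ).encode("latin-1")
    # The closing delimiter is the boundary with "--" on both sides.
    tail = f"\r\n--{boundary}--\r\n".encode("latin-1")
    return head + content + tail
```

The CGI program on the server has to undo exactly this wrapping: find the boundary, skip the part headers, and keep the raw bytes in between.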
Visual Basic Using True CGI
If you're new to the Web arena, you might not be very familiar with CGI. CGI
is a standard that programs use to communicate with a Web server on the server
side. A program that incorporates the CGI standard communicates with a Web
server in the following ways: It reads parameters at the command line, reads
from Standard In, writes to Standard Out, and reads information passed through
environment variables. CGI is not language specific. You can implement CGI in
any language that can communicate in the ways mentioned above.
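Because CGI is language neutral, the four channels are easy to show outside VB. The following Python sketch is only an illustration of the standard, assuming the usual CGI variable names REQUEST_METHOD and CONTENT_LENGTH; the function name run_cgi is hypothetical. The streams are passed in as parameters so that you can exercise the logic without a Web server.

```python
def run_cgi(argv, environ, stdin, stdout):
    """One CGI exchange: read arguments, environment variables, and
    Standard In, then write a response to Standard Out."""
    method = environ.get("REQUEST_METHOD", "GET")
    length = int(environ.get("CONTENT_LENGTH") or 0)
    # A POST body arrives on Standard In; its size comes from the environment.
    body = stdin.read(length) if method == "POST" else b""
    lines = [
        "Content-type: text/plain",
        "",  # blank line ends the response headers
        f"args: {argv[1:]}",
        f"method: {method}",
        f"bytes received: {len(body)}",
    ]
    stdout.write("\r\n".join(lines).encode("ascii") + b"\r\n")
```

Under a real Web server, you would call it as run_cgi(sys.argv, os.environ, sys.stdin.buffer, sys.stdout.buffer).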
To clarify one issue, the code in this article uses true CGI. Almost every
CGI book I've examined incorrectly states that VB is not capable of executing
true CGI programs. Before Microsoft released 32-bit VB 4.0, 16-bit VB 3.0
programmers had to use Win-CGI programming techniques to circumvent VB 3.0's
inability to read from Standard In and write to Standard Out. With the Win-CGI
workaround, programmers passed variables between the Win-CGI program and the Web
server using INI files. Although this method was a less efficient way to
communicate with the Web server than using true CGI, for 16-bit VB programmers
it was a life saver. However, all that changed with 32-bit VB 4.0, which can
read from Standard In and write to Standard Out by calling two Win32 API
functions: ReadFile and WriteFile.
Inside the Upload_CGI Program
Now that you've seen an overview of the HTTP File Upload process, let's look
at how you can create the VB CGI program that receives the uploaded file. To
read environment variables as well as to read from Standard In and write to
Standard Out, the upload_cgi application uses several functions that the Win32
API supplies. Because the Win32 API functions are in an external DLL, you must
declare them before you can use them in VB. Listing 1 shows the declarations for
the Win32 API functions that upload_cgi uses.
The upload_cgi application uses the GetEnvironmentVariable function
to read the environment variables that the Web server sets. GetEnvironmentVariable
takes three parameters: a string that contains the name of the environment
variable, a buffer that receives the value of the environment variable, and
the size of the buffer.
The upload_cgi application calls the GetStdHandle function to get a handle
to Standard In or Standard Out. GetStdHandle takes one parameter
that specifies the type of handle to be returned. A parameter value of
STD_INPUT_HANDLE causes the function to return a handle for Standard In, and a
parameter value of STD_OUTPUT_HANDLE causes the function to return a handle
for Standard Out.
The Win32 API's ReadFile and WriteFile functions are similar, and each
takes five parameters. The first parameter is a handle to the file. To use
Standard In or Standard Out, this handle must be the one that GetStdHandle
returns. The second parameter is a buffer that contains the data for the read or
write operation. The third parameter is the number of bytes to read or write. In
the fourth parameter, the function returns the number of bytes actually read or
written. Finally, the fifth parameter designates whether overlapped I/O is to be
used. The upload_cgi application doesn't use overlapped I/O, so the
program sets this parameter to null.
The upload_cgi program starts after the Web server receives the posted
data. Unlike most VB programs that begin by displaying a form (or window), the
non-graphical upload_cgi program begins by executing the Main subroutine, which
Listing 2 presents.
At callout A in Listing 2, you can see that the first subroutine Main calls
is the InitCGIVariables subroutine. InitCGIVariables simply calls the
GetCGIenvVar subroutine to retrieve each environment variable that the
Web server sends. Inside GetCGIenvVar is the Win32 API GetEnvironmentVariable
function. This function returns information about the browser, the server, and
the client's IP address, as well as other session information.
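As a rough Python analogue of what InitCGIVariables does (a sketch, not the code in Listing 2), you can gather the variables once at startup. The names below are the standard CGI environment variable names; the exact set that upload_cgi reads is an assumption.

```python
# Standard CGI variable names; which ones upload_cgi reads is assumed here.
CGI_VARIABLES = (
    "REQUEST_METHOD", "CONTENT_TYPE", "CONTENT_LENGTH", "QUERY_STRING",
    "PATH_INFO", "REMOTE_ADDR", "HTTP_USER_AGENT", "SERVER_SOFTWARE",
)

def init_cgi_variables(environ):
    """Collect the CGI variables in one place; missing ones become ''."""
    return {name: environ.get(name, "") for name in CGI_VARIABLES}
```

In a running CGI program, you would pass os.environ as the argument.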
At B in Listing 2, the upload_cgi application calls the SendHeader
function. The SendHeader function begins building the HTML results form to be
sent back to the user when the file upload has completed.
At C in Listing 2, you can see where the upload_cgi program calls the
GetStandardInData subroutine, which reads from Standard In, using the Win32 API
ReadFile function. GetStandardInData, shown in Listing 3, reads the
data file that the user's browser sends.
GetStandardInData first calls the Win32 API GetStdHandle function to get a
handle for Standard In. Next, the subroutine uses a Do loop to read the data.
Within the loop, the VB String function makes sure that the gsBuff variable's
buffer is large enough to hold the data read from Standard In. Then,
GetStandardInData calls the ReadFile function using the handle that GetStdHandle
returned. The subroutine then reads the data stream from Standard In into a
buffer. GetStandardInData compares the string's size to the CGI_Content_Length
environment variable to determine when it has received all the information from
Standard In.
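The read loop is worth seeing in isolation, because a single read is not guaranteed to return the whole request. Here is a Python sketch of the same idea (not Listing 3 itself): keep reading until the number of bytes received matches the content length. The function name and the 4096-byte chunk size are assumptions for illustration.

```python
def read_standard_in(stream, content_length, chunk_size=4096):
    """Read exactly content_length bytes from stream, tolerating short reads."""
    received = bytearray()
    while len(received) < content_length:
        remaining = content_length - len(received)
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:
            break  # the stream closed before all the data arrived
        received.extend(chunk)
    return bytes(received)
```

Collecting the chunks in a growable byte buffer, rather than repeatedly concatenating strings, keeps the loop from copying the whole upload on every pass.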
When GetStandardInData finishes, the Main subroutine resumes; at D in
Listing 2, Main parses the data that the browser posted. The browser sends files
in multipart format, and Main looks at the CGI_Content_Type environment variable
to determine the multipart boundary.
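Extracting the boundary is a small string-parsing job. As a hedged Python sketch (the function name is hypothetical, and the header layout follows the multipart/form-data convention of a boundary parameter after the media type):

```python
def get_boundary(content_type):
    """Return the boundary parameter from a multipart/form-data content type."""
    media_type, _, params = content_type.partition(";")
    if media_type.strip().lower() != "multipart/form-data":
        raise ValueError("not a multipart/form-data request")
    for param in params.split(";"):
        name, _, value = param.partition("=")
        if name.strip().lower() == "boundary":
            return value.strip().strip('"')  # the value may be quoted
    raise ValueError("no boundary parameter found")
```

In the posted body, each part starts with this boundary prefixed by two hyphens.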
The example in Screen 2 shows what the CGI_Content_Type and
CGI_Content_Length environment variables might look like. When surrounded with
the MIME boundaries, the upload file looks like the example in Figure 1.
Because the file is encoded with the traditional MIME headers and
footers, handling the string the Web server receives can be a little messy. The
form the browser submits also sends the filename, including the full path of the
file on the client machine. All the VB program needs to do is strip the path
from the filename and write out the contents of the file in binary mode using
the correct filename.
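The stripping step can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the parsing code in Listing 2: it handles only a quoted filename in the part headers, and it assumes the boundary string never occurs inside the file data. The function name is hypothetical.

```python
def extract_upload(body, boundary):
    """Return (file name without path, file bytes) for the first file part,
    or None if the body contains no file part."""
    delimiter = b"--" + boundary.encode("latin-1")
    for part in body.split(delimiter):
        headers, separator, content = part.partition(b"\r\n\r\n")
        if not separator or b'filename="' not in headers:
            continue  # preamble, closing marker, or an ordinary form field
        raw_name = headers.split(b'filename="', 1)[1].split(b'"', 1)[0]
        # Browsers may send the full client path; keep only the last component.
        base_name = raw_name.decode("latin-1").replace("\\", "/").rsplit("/", 1)[-1]
        if content.endswith(b"\r\n"):
            content = content[:-2]  # CRLF that precedes the next boundary
        return base_name, content
    return None
```

Only the first blank line ends the headers, so file data that happens to contain blank lines survives intact.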
At E in Listing 2, you can see where the upload_cgi program checks for the
target output directory and then writes the file to the Web server in binary
mode. The file is now available to any applications that need it. (On a security
note, don't place the uploaded files in a CGI directory or a public HTML
directory without first assessing the security risk.) After the upload has
completed, the Main subroutine sends a successful completion message to the
browser, with all the environment variables the subroutine used.
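The write step might look like the following Python sketch. The function name and directory layout are assumptions; the point is to create the target directory if it is missing, reduce the client-supplied name to a bare filename, and write in binary mode.

```python
import os

def save_upload(target_dir, file_name, content):
    """Write the uploaded bytes into target_dir and return the saved path."""
    # Never trust the client path: keep only the final name component.
    safe_name = os.path.basename(file_name.replace("\\", "/"))
    if safe_name in ("", ".", ".."):
        raise ValueError("unusable file name: %r" % file_name)
    os.makedirs(target_dir, exist_ok=True)
    path = os.path.join(target_dir, safe_name)
    with open(path, "wb") as out:  # binary mode, so no newline translation
        out.write(content)
    return path
```

This sketch silently overwrites an existing file of the same name; whether that is acceptable depends on your application.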
What Goes Up Must Come Down
It's easy to underestimate the task of downloading a file on the Web. After
all, it seems that all you have to do is make the file available in a public
directory on a Web server, and anyone with a browser can download the file. But
the task isn't always that easy. The Web server earmarks many file types for
special MIME handling, even if you simply want to download a file and save it on
disk. Also, you may not want to store your files in a public directory on a Web
server. Even if you use
authentication, you may not want to use the Web server's access control list
(ACL) to decide whether a user has access to download a particular file.
Instead, you might want to use a smart Web application--an application that
determines access rights based on a set of events, such as whether the user has
filled out a questionnaire or entered valid credit card information. You can
easily handle HTTP downloads by using a CGI routine to send a file to a Web
browser. The upload_cgi program includes an example CGI download routine: the
DownloadFile subroutine shown in Listing 4.
When you implement HTTP file downloading, your CGI program first needs to
check whether the person is allowed to access the file in question. This
process, of course, depends on your environment and how you determine who can
access your files. If you deny a user access to the file, you need to send a
regular HTML header and a message to notify the user that the access criteria
were not met. If the user has access rights, the program needs to immediately
send a header that describes the file as a binary file. The format of the
download header, which DownloadFile sends at callout A in Listing 4,
is as follows:
Content-type: application/octet-stream
After sending the header, DownloadFile uses the Do loop at B in Listing 4
to read from the disk file (which, of course, does not have to be in a
public HTML directory) in binary mode and call the Send subroutine. The Send
subroutine, shown in Listing 5, sends the data to the browser.
In Listing 5, Send uses the Win32 API GetStdHandle function to get the
handle for Standard Out. The first parameter of the WriteFile function is this
handle. The second parameter of WriteFile is the data to be transferred, with
a carriage return/line feed pair appended. The third parameter contains the
length of the data to be downloaded, and the fourth parameter will contain the
number of bytes sent after the WriteFile function finishes executing.
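The whole download path, from access decision to last byte, fits in a short Python sketch. It is not Listings 4 and 5: the function name, the allowed flag (standing in for whatever access rule you apply), and the 8192-byte chunk size are assumptions. Note also that this sketch appends the carriage return/line feed pair only to the header lines and sends the file bytes unchanged.

```python
def download_file(path, stdout, allowed, chunk_size=8192):
    """Send path to stdout as a binary download; return the bytes sent."""
    if not allowed:
        # Denied: answer with a regular HTML header and a short message.
        stdout.write(b"Content-type: text/html\r\n\r\n")
        stdout.write(b"<HTML><BODY>Access to this file was denied.</BODY></HTML>\r\n")
        return 0
    # Granted: describe the payload as a binary file, then stream it.
    stdout.write(b"Content-type: application/octet-stream\r\n\r\n")
    sent = 0
    with open(path, "rb") as source:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break  # end of file
            stdout.write(chunk)
            sent += len(chunk)
    return sent
```

Reading and writing in fixed-size chunks keeps memory use flat no matter how large the file is.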
Unlike a regular file download via an HREF tag, the Web server doesn't know
the contents of a file and sends the file as a binary stream. Therefore, the
server will not try to send the file as a particular MIME type. Let's look at
one possibility of how to call the CGI routine from the HTML form:
<FORM METHOD="POST" ACTION="/cgi-bin/file_download.exe?download:filename.doc">
<INPUT TYPE="SUBMIT" VALUE=" filename.doc ">
</FORM>
This example shows the download CGI program (file_download.exe) being
called and passed the download file's name (filename.doc) as a CGI Query string.
This arrangement works fine, but when the File, Save As dialog box shows up, the
default file name will be the name of the CGI program, not the name of the file
to be saved. To get around this problem, you can trick the browser into
providing the correct file name as the default, as shown in this modified ACTION
parameter:
ACTION="/cgi-bin/file_download.exe/filename.doc?documents/filename.doc"
The correct CGI routine will still execute on the server side, but now the
File, Save As dialog box will default to the correct filename.
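On the server side, a Web server that follows the CGI convention hands the extra path segment to the program in PATH_INFO and the text after the question mark in QUERY_STRING. Here is a hedged Python sketch of reading both; the function name and the download root are hypothetical, and the traversal check is an addition the article doesn't describe.

```python
import posixpath
from urllib.parse import unquote

def requested_file(environ, download_root):
    """Return (name the browser will suggest, path of the file to send)."""
    # PATH_INFO holds the part after the program name, e.g. "/filename.doc".
    suggested_name = environ.get("PATH_INFO", "").rsplit("/", 1)[-1]
    query = unquote(environ.get("QUERY_STRING", ""))
    relative = posixpath.normpath(query.replace("\\", "/"))
    # Refuse empty requests, absolute paths, drive letters, and ".." escapes.
    if (relative in (".", "..") or relative.startswith(("/", "../"))
            or ":" in relative):
        raise ValueError("unsafe download path: %r" % query)
    return suggested_name, posixpath.join(download_root, relative)
```

For the ACTION value shown above, PATH_INFO would be /filename.doc and QUERY_STRING would be documents/filename.doc.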
Just the Tip of the Application Iceberg
In this article, I've shown how to use a VB CGI program to do HTTP File
Uploads and downloads. The example upload_cgi program uploads a file to a
directory and then echoes the contents of that directory to the user. The user
can then download a file to verify that the upload worked properly.
You can easily modify this shell to meet lots of specific business
situations. For instance, you can create an Internet or intranet file warehouse
that allows uploading, indexing, and searching of the warehoused files. But this
idea is just the tip of the iceberg. Once you have adapted the program to your
company's needs, simply add user authentication and Secure Sockets Layer (SSL)
to your server, and you get a very secure method for transferring files
to your Web server.