News | February 3, 1999

Asynchronous I/O Utilizing Windows NT Threads

For a long time, test engineers have realized that a large percentage of their test time is spent waiting for instruments to complete a task. This is especially true in automotive electronic functional test, where waiting may account for up to 50% of the total test time. This is due mainly to instrument setup, data acquisition, device settling times and relatively slow serial communications. In this article, I will show how to significantly reduce test times by using Microsoft Windows NT threads.

Asynchronous test routines (tasks performed in parallel) are not a new idea, but they have been very difficult to implement. The main problem has been proprietary operating systems with weak and complicated support for asynchronous operations. Windows NT solves this problem by providing a standard, well-supported set of APIs (Application Programming Interfaces) for spawning and monitoring asynchronous tasks called threads. NT is not a panacea, but it does solve the infrastructure problem, leaving the user responsible for managing and sharing resources and instrumentation.

An obvious application for threads is serial communications. Almost all automotive devices use serial communications to perform tasks in a car, as well as during test. Serial communications were introduced in cars mainly to reduce cost and weight by replacing a large number of wires with a single serial link. Serial downloads are also used at end-of-line manufacturing to replace firmware and program specific device features. At one of our customer's sites, I witnessed a final process station where approximately 10 older computers were configured for the sole purpose of reprogramming devices coming off the manufacturing line. These 10 computers could be replaced with one, providing a cheaper solution with comparable performance and reduced labor costs.

For this example, I used a 4-port Rocketport plug-in card to expand my serial ports. Cards that support up to 32 serial ports are available for larger applications. Hewlett-Packard TestExec SL is used to provide the test sequencing, instrument management, and performance profiling. The test I chose used a loop-back connection on all of the serial ports to provide validation for a write/read of a block of data.

In the typical synchronous case, a read/write of 500 bytes to one serial port is completed in about 540 milliseconds. The time required to perform the same operation using a thread was about 560 milliseconds, representing about 20 milliseconds of overhead for the thread operations.

I then expanded the test to perform the same operation on four serial ports asynchronously and recorded a test time of about 560 milliseconds. There was no measurable additional time required to expand the test from one to four serial ports!
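
As a rough check on these numbers, the short calculation below compares the measured four-port threaded time with an estimate for running the four transfers one after another. The sequential figure is an assumption (four times the single-port synchronous time); it was not measured in this experiment.

```python
# Timings reported above, in milliseconds.
SYNC_ONE_PORT_MS = 540        # synchronous write/read on one port
THREADED_ONE_PORT_MS = 560    # same operation run in a thread
THREADED_FOUR_PORTS_MS = 560  # four ports, each in its own thread

PORTS = 4

# Assumption: back-to-back synchronous transfers scale linearly with port count.
estimated_sequential_ms = PORTS * SYNC_ONE_PORT_MS

thread_overhead_ms = THREADED_ONE_PORT_MS - SYNC_ONE_PORT_MS
time_saved_ms = estimated_sequential_ms - THREADED_FOUR_PORTS_MS
speedup = estimated_sequential_ms / THREADED_FOUR_PORTS_MS

print(f"estimated sequential time: {estimated_sequential_ms} ms")
print(f"thread overhead:           {thread_overhead_ms} ms")
print(f"time saved with threads:   {time_saved_ms} ms")
print(f"speedup:                   {speedup:.2f}x")
```

Under that assumption, the four-port test drops from roughly 2.2 seconds to 0.56 seconds, with the 20 milliseconds of thread overhead paid only once.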

This experiment shows that most of the time spent on serial communications is taken up by hardware operating autonomously. This would not be possible if port-specific data buffering and communication were not provided. In fact, many instruments operate in this manner. A digitizer, for example, needs to be configured to receive data and will then operate completely autonomously. Unfortunately, most applications will sit in a loop monitoring a busy bit, waiting for the operation to be completed. The key to increasing performance is to identify which operations can be executed autonomously, spawn those tasks with a thread and return at a later point to get the results. Of course, you need to manage the dependencies and the sharing of resources or instrumentation.

As on most test platforms, standard routines are provided for configuring, writing to and reading from a serial port. For this test, I created two new routines, scommSpawnSendBytes(…) and scommWaitForSendBytes(…). scommSpawnSendBytes() initializes a data structure with the data to transmit and spawns a thread, via a call to CreateThread(), that calls the standard function scommSendBytes(). CreateThread() returns a handle to the thread that is used to check the status of the spawned operation. I assigned this handle to my data structure and attached the structure to the serial instrument handle so that I could access the information later. In this case, the thread was set to execute immediately, although a thread can also be created in a suspended state and started later with a call to ResumeThread(). scommWaitForSendBytes() is a simple routine that retrieves the handle to the thread and calls WaitForSingleObject(), passing the thread handle and a maximum time to wait.
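
The spawn/wait pair can be sketched in a few lines. The sketch below is illustrative only and is not the code used in this experiment: it uses Python's threading module as a stand-in for the Win32 calls (Thread.start() for CreateThread(), Thread.join(timeout) for WaitForSingleObject()), and the SerialPort and SendCommand classes, the function names and the simulated loop-back port are all hypothetical.

```python
import threading
import time


class SerialPort:
    """Simulated loop-back port (hypothetical stand-in for an instrument handle)."""

    def __init__(self, name, seconds_per_byte=0.0001):
        self.name = name
        self.seconds_per_byte = seconds_per_byte
        self.rx_buffer = b""
        self.pending = None  # transfer record attached by spawn_send_bytes()


class SendCommand:
    """Data handed to the worker thread, plus the thread 'handle'."""

    def __init__(self, data):
        self.data = data
        self.thread = None


def send_bytes(port, data):
    """Blocking send, playing the role of the standard scommSendBytes()."""
    time.sleep(len(data) * port.seconds_per_byte)  # simulated wire time
    port.rx_buffer += data                         # loop-back: bytes come straight back


def spawn_send_bytes(port, data):
    """Start the send in a worker thread and return immediately."""
    cmd = SendCommand(bytes(data))
    cmd.thread = threading.Thread(target=send_bytes, args=(port, cmd.data))
    port.pending = cmd   # attach to the instrument so the wait routine can find it
    cmd.thread.start()   # runs at once, like CreateThread() without CREATE_SUSPENDED


def wait_for_send_bytes(port, timeout_s):
    """Wait up to timeout_s for the spawned send; return True if it finished."""
    cmd = port.pending
    cmd.thread.join(timeout_s)  # analogous to WaitForSingleObject(handle, timeout)
    return not cmd.thread.is_alive()


def receive_bytes(port, count):
    """Return up to count bytes from the receive buffer."""
    data, port.rx_buffer = port.rx_buffer[:count], port.rx_buffer[count:]
    return data


port = SerialPort("COM1")
payload = b"\x55" * 500
spawn_send_bytes(port, payload)
finished = wait_for_send_bytes(port, timeout_s=2.0)
echoed = receive_bytes(port, len(payload))
print(finished, echoed == payload)
```

Attaching the command record to the port object mirrors the idea of hanging the thread handle off the serial instrument handle, so the wait routine needs nothing but the port to find the operation it is waiting for.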

My test plan calls scommSpawnSendBytes() for the four serial ports in my system, followed by four calls to scommWaitForSendBytes(). The final step calls the standard function scommReceiveBytes() for the four ports to receive the data. It was interesting to notice that the first call to scommWaitForSendBytes() was the slowest, as it was in fact waiting for the operation to complete. Once this was complete, the other three ports were also finished and the functions returned immediately.
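
The shape of this test plan, and the reason the first wait absorbs nearly all of the delay, can be seen in a small simulation. This is a sketch under stated assumptions: Python threads stand in for the Win32 threads, a 50 millisecond sleep stands in for the hardware transfer, and the function names are hypothetical.

```python
import threading
import time

TRANSFER_S = 0.05  # assumed duration of one simulated transfer


def simulated_transfer(results, port_index):
    """Pretend to send a block on one port; the 'hardware' works while we sleep."""
    time.sleep(TRANSFER_S)
    results[port_index] = "sent"  # each thread writes only its own slot


def run_four_port_plan(port_count=4):
    results = [None] * port_count
    threads = [
        threading.Thread(target=simulated_transfer, args=(results, i))
        for i in range(port_count)
    ]

    start = time.perf_counter()
    for t in threads:  # step 1: spawn a send on every port
        t.start()

    wait_times = []
    for t in threads:  # step 2: wait on each port in turn
        wait_start = time.perf_counter()
        t.join(timeout=2.0)
        wait_times.append(time.perf_counter() - wait_start)
    total = time.perf_counter() - start

    return results, wait_times, total


results, wait_times, total = run_four_port_plan()
for i, w in enumerate(wait_times):
    print(f"port {i}: waited {w * 1000:.1f} ms")
print(f"total: {total * 1000:.1f} ms")
```

Because all four transfers start before the first wait begins, the remaining waits typically return almost at once, and the total stays close to the time of a single transfer.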

The Windows implementation of threads basically requires that you provide a function to call that will receive a pointer to your specific data. In this example I created a function called SerialSendBytes (CSerialThread* cmd) and passed it a pointer to my data structure. It is important to ensure that no other operation will access this data while the thread is running. There are other APIs that can be used to create thread-safe data and synchronize access to the data. These include Thread Local Storage (TLS), mutexes, semaphores and critical sections. In this example, I don't use critical sections because I know that each instrument handle has a unique copy of the transmission data.
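
The distinction between private and shared data can be sketched as follows. This is an illustration, not the code from this experiment: SerialThreadCommand is loosely modeled on CSerialThread, and the shared byte counter is a hypothetical addition included only to show where a lock (the counterpart of a critical section or mutex) would be required.

```python
import threading


class SerialThreadCommand:
    """Per-port data passed to the thread function."""

    def __init__(self, port_name, data):
        self.port_name = port_name
        self.data = data
        self.bytes_sent = 0  # written only by the thread that owns this object


# Shared by every thread, so access is guarded by a lock.
total_lock = threading.Lock()
total_bytes_sent = 0


def serial_send_bytes(cmd):
    """Thread entry point: receives a reference to its own command object."""
    global total_bytes_sent
    cmd.bytes_sent = len(cmd.data)  # private data: no lock needed
    with total_lock:                # shared data: lock required
        total_bytes_sent += cmd.bytes_sent


commands = [SerialThreadCommand(f"COM{i + 1}", b"\xAA" * 500) for i in range(4)]
threads = [threading.Thread(target=serial_send_bytes, args=(c,)) for c in commands]
for t in threads:
    t.start()
for t in threads:
    t.join()

print([c.bytes_sent for c in commands], total_bytes_sent)
```

As long as every thread touches only its own command object, no synchronization is needed; the lock becomes necessary only once a value is shared between threads.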

See Figure 2.

If the thread has not completed normally after waiting the maximum amount of time the operation should have taken, I can assume something detrimental to the program occurred. In a synchronous program this would hang the system. With threads, we can terminate the thread, fail the test and continue on. This is an especially important feature in manufacturing functional test, where the whole purpose is to identify devices that are not operating properly and to handle them without hanging the test system. Terminating one thread from another is considered bad practice, however, because the terminated thread's stack is not de-allocated and the thread gets no chance to release any resources it holds, such as a critical section. If you have a small stack and the thread owns no shared resources, this is not a problem. Windows NT does provide a better mechanism for terminating rogue operations, called fibers. A thread may contain multiple fibers, which the application schedules itself, and a fiber may be terminated without memory loss. This is a more advanced topic and will not be discussed here, but if terminating rogue operations is a common practice for you, then you should consider using fibers.
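
The wait-with-timeout pattern can be sketched as below. This is an illustration under stated assumptions rather than the code from this experiment: Python threads stand in for Win32 threads, the helper and device functions are hypothetical, and because Python cannot forcibly terminate a thread, the sketch abandons the hung worker where a Win32 program would terminate it (for example with TerminateThread()).

```python
import threading

MAX_WAIT_S = 0.2  # assumed upper bound for a healthy operation


def run_with_timeout(operation, max_wait_s=MAX_WAIT_S):
    """Run operation in a worker thread; report 'FAIL' if it overruns."""
    outcome = {}

    def worker():
        outcome["value"] = operation()

    # daemon=True lets the program exit even if the worker never returns.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(max_wait_s)

    if thread.is_alive():
        # Timed out: fail this test and move on, leaving the worker behind.
        return "FAIL"
    return "PASS" if outcome.get("value") else "FAIL"


never_set = threading.Event()


def hung_device():
    never_set.wait()  # simulates a device that never responds
    return True


def healthy_device():
    return True


verdicts = [run_with_timeout(hung_device), run_with_timeout(healthy_device)]
print(verdicts)
never_set.set()  # release the abandoned worker so it can exit cleanly
```

The hung device costs only the maximum wait time, after which the sequence continues with the next test instead of stalling the whole system.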

Windows NT also uses the concept of priority classes, as well as priorities within a class. The associated routines allow a user to increase or decrease the scheduling priority of a particular thread. In this example I did not modify the priority of any thread, and I would recommend extreme caution when using these routines, as they can lock up the system by preventing system-critical threads from executing.

Another consideration when using threads is the total amount of time a measurement will take. I timed the creation of a thread at about 1 millisecond. If your routine does not take significantly longer than this, then the overhead of creating a thread is probably not worth it.
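
One way to apply this rule is to measure the overhead on the target machine and compare it with the expected duration of the task. The sketch below is a minimal example, assuming Python threads in place of Win32 threads; the function names and the factor of ten used as the threshold are assumptions, not figures from this experiment.

```python
import threading
import time


def measure_thread_overhead(samples=200):
    """Average time to create, start and join a thread that does nothing."""
    start = time.perf_counter()
    for _ in range(samples):
        t = threading.Thread(target=lambda: None)
        t.start()
        t.join()
    return (time.perf_counter() - start) / samples


def worth_threading(task_s, overhead_s, factor=10):
    """Assumed rule of thumb: thread only if the task dwarfs the overhead."""
    return task_s >= factor * overhead_s


overhead_s = measure_thread_overhead()
print(f"measured thread overhead: {overhead_s * 1000:.3f} ms")
print("540 ms transfer, 1 ms overhead:", worth_threading(0.540, 0.001))
print("2 ms task, 1 ms overhead:", worth_threading(0.002, 0.001))
```

With the 1 millisecond figure quoted above, the 540 millisecond serial transfer clearly qualifies, while a 2 millisecond operation does not.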

Please send comments or questions to: bill_sontag@hp.com

Author's Biography

William J. Sontag received his Bachelor of Science in Computer Science in 1985 from Fitchburg State College, Fitchburg, Massachusetts. He has been with Hewlett-Packard for 5 years, working on the initial development stages of the HP TestExec SL and the HP TS5400 Automotive Electronics Functional Test System. Prior to joining HP, Bill gained practical experience working multiple jobs in the Computer Aided Design industry.