Design of the DEC LANcontroller 400 Adapter

By Richard E. Stockdale and Judy B. Weiss

Digital Technical Journal Vol. 3 No. 3 Summer 1991

Abstract

The DEC LANcontroller 400, Digital's XMI-to-Ethernet adapter (DEMNA), connects systems based on the Digital XMI bus to an Ethernet/IEEE 802.3 local area network (LAN). These systems use the XMI bus either as the system bus (VAX 6000 systems) or as an I/O bus (VAX 9000 systems). The new systems, which can utilize the full bandwidth of the Ethernet, are characterized by increased host processor speeds. The DEMNA adapter was designed to support these I/O requirements. In addition, console and monitor facilities were built into the adapter firmware for debugging, verification, and user visibility. The adapter's performance for small packets exceeds system capabilities, and Ethernet bandwidth is the limiting factor for large packets.

The high-performance DEC LANcontroller 400, Digital's XMI-to-Ethernet adapter (DEMNA), connects a system based on the Digital XMI bus to an Ethernet/IEEE 802.3 local area network (LAN). This adapter is intended for Digital systems that use the XMI bus either as the system bus (VAX 6000 systems) or as an I/O bus (VAX 9000 systems). It is an intelligent adapter that implements the physical layer and part of the data link layer of network protocol. The term intelligent refers to the packet processing performed by the adapter as part of the data link layer.

The DEMNA adapter was needed to support the I/O requirements of the VAX 6000 and VAX 9000 systems, which can utilize the full bandwidth of the Ethernet. The adapter also provides the ability to configure these systems without a BI bus. For these systems, the DEMNA adapter is the only Ethernet connection available.

The DEMNA adapter is controlled by a port driver that resides in host memory. The interface between the port driver and the DEMNA firmware (the port) is a ring-based design which is optimized for low system overhead and high performance.

The DEMNA adapter has the following major features:

o Supports Ethernet/IEEE 802.3 protocols

o Supports up to 64 users (each one a separate protocol such as local area transport [LAT] software, DECnet network software, or clusters)

o Supports two modes of addressing: VAX virtual addressing and 40-bit physical addressing

o Allows buffer chaining on transmit

o Performs packet filtering and validation on receive

o Supports Digital's maintenance operations protocol (MOP) functions

o Provides support for diagnostic routines and field service functions implemented through the system console or diagnostic software

o Has console and monitor facilities that allow a console user to monitor DEMNA operation and network utilization

This paper begins with a logic overview of the DEMNA device. The sections that follow discuss the factors that influenced design and implementation, describe the major performance metrics and user visibility operations, and review the design results and future needs.

Logic Overview

The DEMNA adapter is a single-board XMI adapter based on complementary metal-oxide semiconductor/transistor-transistor logic (CMOS/TTL) technology. As shown in Figure 1, the hardware consists of four separate subsystems:

o Microprocessor

o Direct memory access (DMA) and shared memory

o XMI interface

o Ethernet

The microprocessor subsystem contains the CMOS VAX (CVAX) processor, system support chip (SSC), boot read-only memory (ROM), Ethernet address programmable read-only memory (PROM), electrically erasable programmable read-only memory (EEPROM), and random-access memory (RAM). The microprocessor subsystem provides an internal, high-speed CDAL bus so that the CVAX processor can fetch its instructions and execute them without being delayed by the other controllers on the module. The firmware is stored in EEPROM, but is copied to RAM for execution. The boot ROM contains the initialization code and diagnostics. This subsystem also provides a console interface through the SSC for diagnostics, module debugging, and network monitoring.

The DMA and shared memory subsystem provides the means of communication between the CVAX processor and the other subsystems. The devices arbitrating for this shared memory are the CVAX processor, the gate array, and the Local Area Network Controller for Ethernet (LANCE) chip.

The XMI interface subsystem contains the XMI network adapter (XNA) gate array and the XMI corner. The XNA gate array is the data-move engine for the DEMNA adapter and contains all the XMI-required registers.

The Ethernet subsystem contains the LANCE chip, the serial interface adapter (SIA) chip, and various bus interface logic modules. The Ethernet subsystem receives packets from the Ethernet and stores them in the shared memory. When transmitting a packet on the Ethernet, the LANCE chip gets the packets from shared memory and transmits them on the Ethernet.

Design

The design of the DEMNA adapter was influenced by many factors, including previous adapter design experiences, available hardware such as Ethernet chips, and system requirements. The DEMNA team was assigned the following tasks:

o Produce a working Ethernet adapter that could be used by operating systems such as VMS, ULTRIX, ELN, and custom operating systems on hardware configurations that use the XMI bus as a system bus or an I/O bus

o Deliver high performance, measured by the amount of Ethernet bandwidth supported at various packet sizes, with minimized host overhead

o Supply debugging features for design verification and field maintenance of the adapter

First, we reviewed previous adapters to determine what improvements could be made. We learned that a complex host interface complicated host software and adapter firmware and greatly affected performance. One of these adapters, the Digital BI Ethernet Network Adapter (DEBNA), implemented a generic port interface that used interlocked queues containing a queue entry with a buffer name that indexed into a buffer descriptor table (i.e., an additional level of indirection). In addition to the firmware complexity, the hardware was not well suited to a complex port interface.

Another area in which improvements could be made over previous Ethernet adapters was the amount of processing performed by the host processor during receive packet filtering, address translation, and buffer copies. Overall system performance improves if this processing can be reduced by performing part or all of these functions in the adapter. This difference transforms the adapter from a dumb adapter (much of the data link processing performed by the host) to an intelligent adapter (much of the processing performed by the adapter).

The results of our analysis of older Ethernet adapters led us to choose a design that employs a simple host interface, off-loads the host whenever possible, uses rings instead of queues, and supplies the address of the buffer directly with the ring entry rather than indirectly through another data structure.

The design of the adapter was now consistent with the needs of the new VAX 6000 and VAX 9000 systems. These systems, characterized by increased host processor speeds, needed increased I/O performance. The task of the DEMNA team was to fill that need for Ethernet I/O.

Type of Adapter

The DEMNA product is a store-and-forward adapter, i.e., it copies data to and from host memory by way of temporary storage on the adapter. This data transmission differs from that of a cut-through adapter in which data flows directly between host memory and the transmission medium. However, the DEMNA adapter is actually able to gain some of the benefits of cut-through on the receive side.

Host Interface

We designed a simple host interface, using rings instead of queues. Interrupts to the host were kept to a minimum, from one interrupt per packet at light loads to a fraction of that number under heavy loads. As seen in Figure 2, the port and the port driver (host) share the following data structures, which reside in host memory:

o Port data block. This structure gives the port the location of the rings and page tables in host memory and is a repository for error information.

o Command and receive rings. These rings contain information describing outstanding command and transmit requests and buffer information for receive buffers.

o Transmit, receive, and command buffers. These buffers contain packet data and command data.

These data structures constitute the primary means of communication and data transfer between the port and the port driver. Control status registers (CSRs) are provided for port poll demand registers, XMI context, and port initialization.

Two rings are used in the host interface: the command ring and the receive ring. Each ring consists of 1024 bytes of physically contiguous memory, and each ring contains entries that describe a buffer or a set of buffer segments (when chaining transmit buffers). The number of entries in the receive ring is fixed, since each entry points to a single contiguous buffer. The size of each transmit ring entry is variable and is fixed at initialization time.
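
The relationship between ring size, entry size, and buffer chaining described above can be illustrated with a short Python sketch. This is not the DEMNA data layout: the article does not give field names or field widths, so `RingEntry`, `ENTRY_HEADER_BYTES`, and `SEGMENT_DESCRIPTOR_BYTES` are hypothetical and chosen only for illustration. The only figure taken from the text is the 1024-byte ring.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

RING_BYTES = 1024  # stated in the text: each ring is 1024 contiguous bytes

# Hypothetical sizes, chosen only for illustration (not from the article).
ENTRY_HEADER_BYTES = 8
SEGMENT_DESCRIPTOR_BYTES = 8


@dataclass
class RingEntry:
    """One ring entry: the buffer address travels in the entry itself,
    with no intermediate buffer descriptor table."""
    segments: List[Tuple[int, int]] = field(default_factory=list)  # (address, length)


def entry_bytes(max_segments: int) -> int:
    """Size of an entry able to describe up to max_segments buffer segments."""
    return ENTRY_HEADER_BYTES + max_segments * SEGMENT_DESCRIPTOR_BYTES


def entries_per_ring(max_segments: int) -> int:
    """How many fixed-size entries fit in one 1024-byte ring."""
    return RING_BYTES // entry_bytes(max_segments)


# A receive entry points to a single contiguous buffer.
receive_entry = RingEntry(segments=[(0x0002_0000, 1518)])
# A chained transmit entry lists several segments of one frame.
transmit_entry = RingEntry(segments=[(0x0003_0000, 14), (0x0003_4000, 50)])

print(entries_per_ring(1))  # 64 with the assumed sizes
print(entries_per_ring(4))  # 25 with the assumed sizes
```

Under these assumed sizes, allowing more segments per transmit entry enlarges each entry and reduces the number of entries the ring can hold, which is one plausible reason for fixing the transmit entry size at initialization time.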
The port and port driver process the entries in each ring in sequential order, starting with the first entry. A ring entry can be processed only by its owner. When the last entry in the ring is reached, processing starts again with the first entry.

Host interrupts are minimized by using a ring release function, which counts the number of ring entries processed for completion by the port and the port driver. The port driver counts the number of completed entries and writes this count to a completion CSR when it has finished processing all the completed transmit and receive ring entries. The port maintains the same count and issues another interrupt whenever it sees that its count and the count last written by the port driver are different. This function ensures that the port driver is interrupted only when it stops processing the rings because there is nothing else to process. The port driver can process multiple completed transmits and receives after each interrupt as well. Thus, no spurious interrupts are issued and the number of interrupts is reduced by processing multiple completions at once.

Adapter Design

The firmware is written in VAX MACRO code. An alternative was to use MACRO for the transmit and receive paths and a higher-level language for initialization, shutdown, and error handling. However, this approach was not chosen because it complicates the interface and would have resulted in firmware size difficulties.

CVAX RAM (used by the CVAX processor exclusively) consists of 256 kilobytes and contains the firmware and data structures (the firmware is copied to RAM during self test). Smaller RAMs would have been slightly less expensive but would have complicated the firmware update procedure and limited the ability of the firmware to use the large data structures needed for receive packet filtering.

Shared RAM (shared by the CVAX processor and the LANCE chip) consists of another 256 kilobytes. This RAM contains the transmit and receive buffers as well as the LANCE transmit and receive rings. There is a vast amount of buffering space here, so the DEMNA device can tolerate a considerable amount of inattention from the host before being forced to discard incoming receive packets.

Erasable programmable read-only memory (EPROM) consists of 128K bytes for diagnostics and firmware boot code, including a backup copy of sufficient operational firmware to allow an update of EEPROM for initial load or subsequent update. EEPROM consists of 64K bytes for operational firmware, diagnostic patches, and error history data.

The gate array (data mover) handles the data move and quadword read/write operations. The data-move operations transfer buffers between the host and shared RAM. The quadword read/write operations are used for control functions, such as reading ring entries, reading address translation information, and writing ring status on completion. Once the firmware initiates a data-move operation, other work is performed by the firmware while the data move progresses.

Interrupts are very costly; therefore, we chose to limit the number of interrupts fielded by the CVAX processor. A LANCE interrupt costs CVAX interrupt overhead, plus a LANCE CSR access, plus some normal interrupt overhead to save and restore registers. A data-move interrupt is less costly, but the firmware can be coded so that the data-move operation is usually complete, thus eliminating the need for the interrupt. Polling is performed for all LANCE- and data-move-related functions, but interrupts are used for local console I/O and error events.

Driver Design

The DEMNA team needed to design a driver that would be compatible with existing drivers but that would use all the features provided by the adapter. For VMS systems, this meant using the set of common routines that provide much of the data link functionality of the driver, but avoiding packet filtering. Another goal was to limit the copying of data by passing requests directly to the adapter.

For ULTRIX systems, the driver runs at a lower level with respect to packet filtering so it cannot take advantage of this feature. However, buffer chaining is used on the transmit side. As a transmit request traverses the various software layers, it accumulates buffer segments which the driver has to concatenate into a transmit frame. To avoid buffer copies in all but the extreme and infrequent cases, the driver then passes up to 11 buffer segments to the adapter.

To allow customer-written drivers for special applications, we documented the interface to make it readily available to customers.

Debug Tools

The adapter has a very simple mission in life: to transmit and receive packets. To verify operation, some debug tools are needed. The goal for the DEMNA team was to provide extensive debug tools both in the operational firmware and in standalone user tools. This design would allow debugging and verification in the development lab and in other, less-controlled environments. These debug tools are discussed further in the Visibility section.

Implementation

This section describes the implementation of the DEMNA adapter through its major functional blocks:

o Scheduler

o Port processing

o Command processing

o Transmit task

o Receive task

o Console task

o Monitor task

Scheduler

The scheduler is a round-robin routine that simply checks for work, does it, checks for work, does it, etc. There are no context switches, but some context is maintained in registers shared by all routines. The scheduler, when idle, consists of about 18 VAX MACRO instructions. Transmit and receive tasks are given higher priority by duplicating their scheduler entry. When not idle, one pass of the scheduler processes four packets.

Port Processing

Port processing controls adapter initialization and shutdown, LANCE initialization and restart, fatal adapter error handling, gate array error handling, and miscellaneous host interface functions. This task also handles firmware updates of EEPROM.

Command Processing

The command ring usually contains transmit buffers, but it can also contain commands for special functions. These commands are included in the command ring to allow the port driver to synchronize control requests with transmit requests, e.g., user startup and stopping. Command processing routines are called by the transmit task after the command buffer has been read from host memory. The commands consist of user startup (consisting of user context such as protocol type, packet format, physical address to use, and multicast addresses to enable), user stopping, read counters, and a set of maintenance commands.

Transmit Task

The transmit task copies a packet from the host memory to adapter buffer memory and tells the LANCE to transmit it onto the Ethernet (store and forward). After the LANCE has completed the request, the firmware writes transmit status to the command ring entry, signifying completion of the transmit.

To minimize service time, the code in the transmit path was carefully scrutinized. The number of checks and branches was minimized for the optimized path. The optimized path through the transmit code is the 30-bit virtual addressing path, which is the most used. However, the 40-bit
physical addressing path still results in better throughput because this path does not require any address translations, which are time-consuming. The instruction sizes were shortened when possible, using word instructions instead of longword instructions, to reduce the amount of instruction prefetch by the CVAX processor. Routines were placed on quadword boundaries to maximize cache efficiency. When waiting for data moves to complete (getting the transmit buffer from host memory) or obtaining address translation information from the host, the firmware was designed to perform other functions to increase the probability that the operation would be complete when the firmware needed it.

Receive Task

The receive task has the simple job of handing received packets to the port driver. This task is complicated by the need to off-load the host of part of receive processing (including packet filtering, packet validation, maintenance of counters, and processing MOP messages) and to make duplicates of packets when more than one user has requested a copy. It is further complicated by the need to provide buffering, which the port driver relies on to avoid supplying large numbers of buffers. For enhanced performance, the firmware deals with receive packets in small groups (192 bytes) to allow the benefit of cut-through on larger packets.

Packet filtering is done for the destination address and for user type: protocol type for Ethernet, destination service access point (DSAP) field for 802, or protocol identifier (PID) value for 802 subnetwork access protocol (SNAP) packets. Additional filtering is done for users who request all traffic or all multicast traffic. Filtering is done by maintaining a 64-bit user mask, which accumulates the list of users who want a copy of the packet according to the characteristics of the packet and what each user has requested.

Packet validation consists of length checks for Ethernet frames (if the user is using a length field after the protocol type) and for 802 frames. This saves the driver a little work. Additionally, users can request only packets smaller than a selected size; the adapter discards packets that exceed this size.

The cut-through feature adds complexity and reduces throughput on small packets, but provides many benefits for larger packets. When a packet larger than 192 bytes is received, the packet filtering and validation of all but the length is done for the first segment. This segment is then copied into the host buffer, and subsequent segments are copied appropriately. The last segment completes the packet validation and cyclic redundancy check (CRC). The difficulty occurs when the packet validation fails or an error is detected, because the packet is discarded and the context for the now-free receive buffer has to be restored. The firmware elects to save as little context as possible for each packet and to regenerate buffer context after the error, i.e., fetching the ring descriptor anew and redoing the address translation.

Console Task

The console task accepts and parses console commands and displays the requested data. There are two means of accessing the console: local and remote. The local console is accessed by a terminal connected directly to the DEMNA adapter. The remote console is accessed through MOP console carrier commands directed at the adapter from another system. A remote console may also be used to access a DEMNA device on the local system (coming in through transmit instead of receive). The firmware does not distinguish between transmit and receive operations from remote consoles. The console block accepts the commands and decodes them, and the monitor block determines the status. The monitor block passes this status back to the console block where it is formatted and displayed on the screen.

Due to code size limitations in the EEPROM, compressed versions of the console screens are stored in the EEPROM. At initialization time the screens are uncompressed and stored in the RAM. (The screen compression saved 5 kilobytes in the EEPROM.) To make the screens easy to set up and maintain, especially since they often changed during the project, the screens were set up in separate text files. The fields in the screen were coded with different data types, such as date or longword. The screen was then put through a PASCAL program to convert it to a VAX MACRO data structure and compress it.

The local console and the remote console can be run simultaneously. They have separate input and output buffers, the same decode and formatting code, and different input and output methods.

The remote console uses the MOP console carrier, coming in on transmit or receive. The command/poll and response/acknowledge commands are sent by the MOP program, i.e., either the network control program (NCP) or a user program that implements the MOP console carrier. The console code extracts the input characters from the command/poll packet and returns a response/acknowledge packet with any available data from the remote console output buffer. When a command has been entirely received, it is decoded and executed and the response placed in the remote console output buffer, which is sent back to the user in response/acknowledge packets.

The local console is a terminal directly connected to the DEMNA device and interfaced through the SSC universal asynchronous receiver transmitter (UART). This terminal connection receives and transmits one character at a time. Characters are collected into the local console input buffer and complete commands are parsed and executed. Response data is placed in the local console output buffer. The local console uses interrupts to signal when a character has been typed or when the UART is ready to transmit another character. These are the only interrupts used on the module, except for error interrupts. Since console interrupts are relatively infrequent, they are less costly than polling.

Monitor Task

The monitor facility operates mainly during receive or transmit. It also runs as a low priority entry in the scheduler to deal with debugging and verification activities (when debugging firmware is enabled).

Performance

As stated previously, the primary goal of the DEMNA adapter was high-speed performance, i.e., this adapter would not create a bottleneck when placed in a system. The major performance metrics we identified were throughput, service time, latency, and reliability.

o Throughput is the number of packets or bytes of packet data that can be transmitted or received per unit of time.

o Service time is the time a packet spends in each stage along its path from source through host software and driver, through adapter, over wire, through adapter, and through driver and host software to the destination.

o Latency is another measure of service time. It is a measure of delays encountered by queue depths of more than one at various points.

o Reliability is measured as the probability of packet loss under a receive load. It is also measured as adapter buffering and host buffer allocation effectiveness. For some protocols, recovery from packet loss takes a significant amount of time, and the loss of a packet may be quite noticeable to a user. Hence, recovery is related to a user's perception of reliable operation.

The performance goal of the DEMNA team was to minimize the service time through the adapter to maximize throughput. This is most critical for small packet sizes. If the service time is greater than the time it takes to transmit or receive a packet, then queue depths increase, increasing latency for subsequent packets. Small packets are critical because, obviously, they take less time to transmit or receive.

The speed of the Ethernet wire and the XMI bus must also be considered. The Ethernet operates at 10 megabits per second. The available bandwidth into memory and the capacity of the XMI are much greater; thus, the Ethernet is the limiting factor. To maintain maximum throughput, the DEMNA device must write and read packets to and from host memory at a speed equal to or greater than the Ethernet wire. If this speed is obtained, then the service time of the DEMNA adapter must be less than the time it takes to transmit or receive one 64-byte (small) packet to or from the Ethernet wire to maintain maximum throughput at all packet sizes.

Hardware

The primary hardware factors influencing adapter performance are CVAX performance, DMA engine throughput, and bus contention.

The gate array DMA engine can sustain between 11.5 and 13.5 megabytes per second on a VAX 6000 system. When transferring packet data (and attendant host ring processing), the firmware can sustain about 5.8 megabytes per second. This is the approximate rate at which the firmware would deliver a burst of large packets that had been stalled due to a lack of receive buffers.

The CVAX chip used is the 60-nanosecond variant (the same one used in the VAX 6000 Model 310 processor). As seen in Figure 1, the processor runs on its own internal CDAL bus which has RAM containing firmware and private data structures. Thus the processor does not contend for the same bus as the gate array and the LANCE chip. However, the CVAX processor does touch shared memory and gate array registers; therefore the possibility of contention is significant. Logic analyzer measurements indicate that about 14 percent of CVAX cycles are consumed while waiting for access to the shared memory bus for minimum size packets. For large packets the consumption is 33 percent, but the cycles needed are considerably less than the remainder. The effect on the gate array accounts for part of the difference between the speeds of 11.5 to 13.5 megabytes per second and of the 5.8 megabytes per second mentioned above.

Firmware

Throughput is limited by the Ethernet bandwidth for packet sizes greater than 88 bytes. The average packet size on Ethernet is approximately 150 to 450 bytes per packet for a mix of DECnet, LAT, and cluster traffic. Table 1 represents the throughput that the host software can see, given sufficient host compute resources. These numbers show what might be expected. Virtual addressing costs some performance, and receive filtering accounts for most of the difference between transmit and receive.
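
The Ethernet limit that bounds adapter throughput can be worked out directly. The following Python sketch computes, for a given packet size, the time one packet occupies a 10-megabit-per-second wire and the resulting maximum packet rate. The 8-byte preamble and 12-byte interframe gap are standard IEEE 802.3 values rather than figures from this article, and the function names are illustrative. The results reproduce the Ethernet Maximum column of Table 1.

```python
BIT_RATE = 10_000_000       # Ethernet: 10 megabits per second (from the text)
PREAMBLE_BYTES = 8          # IEEE 802.3 preamble plus start frame delimiter
INTERFRAME_GAP_BYTES = 12   # 9.6 microseconds of idle time at 10 Mb/s


def wire_bits(packet_bytes: int) -> int:
    """Bits of wire time one packet consumes, including preamble and gap."""
    return (packet_bytes + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES) * 8


def wire_time_us(packet_bytes: int) -> float:
    """Microseconds that one packet occupies the wire."""
    return wire_bits(packet_bytes) * 1_000_000 / BIT_RATE


def max_packets_per_second(packet_bytes: int) -> int:
    """Maximum whole packets per second the wire can carry."""
    return BIT_RATE // wire_bits(packet_bytes)


for size in (64, 88, 1518):
    print(size, max_packets_per_second(size), round(wire_time_us(size), 1))
# 64 14880 67.2
# 88 11574 86.4
# 1518 812 1230.4
```

Read this way, the service-time goal stated earlier means that the adapter has roughly 67 microseconds to handle each minimum-size packet if it is to keep pace with the wire.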
Table 1  DEMNA Throughput (packets per second)
______________________________________________________________________________
Packet
Length     Ethernet    LANCE       Transmit    Transmit    Receive     Receive
(bytes)    Maximum     Maximum     Virtual     Physical    Virtual     Physical
______________________________________________________________________________
64         14880       14662       13181       14633       12468       12918
72         13586       13404       12592       13361       12254       12830
80         12500       12345       12247       12340       11813       12227
88         11574       11441       11432       11438       11441       11441
96         10775       10660       10656       10658       10660       10660
112        9469        9380        9380        9380        9380        9380
128        8445        8374        8374        8374        8374        8374
256        4528        4508        4508        4508        4508        4508
512        2349        2344        2342        2344        2344        2344
1024       1197        1195        1195        1195        1195        1195
1518       812         812         812         812         812         812
______________________________________________________________________________

It is interesting to look at the number of instructions executed by the CVAX processor for each receive and transmit packet as the measure of how much work must be done for each packet. These instruction counts are for minimum size packets in virtual address mode and increase slightly with increasing packet sizes.

For a transmit, the number of instructions required was about 134, consisting of 5 instructions for work done in the scheduler to determine initial transmit context, 77 instructions for the data transfer from host memory, 18 instructions to get the LANCE chip to begin transmitting, and 34 instructions to process packet completion and to update status in the transmit ring entry in host memory.

For a receive, the number of instructions required was about 160, consisting of 5 instructions for work done in the scheduler to determine initial receive context, 40 instructions to deal with the LANCE operations, 20 instructions for packet filtering, 65 instructions for the data transfer to host memory (including some time spent finding a user and validating the packet length), and 30 instructions for the prefetch of the next receive ring entry.

Some throughput was traded off in the interest of reducing adapter-added latency. By processing receive packets in groups of 192 bytes, the latency contribution for any packet size is much smaller than it would be if all the packet processing occurred after the packet had been fully received. Thus the time between the end of a packet on the wire and the host interrupt is fairly constant from 64- to 1518-byte packets, 50 to 70 microseconds.

Reliability

Reliability, or probability of loss, is measured by how large a burst of traffic the adapter can withstand at the maximum receive rate and deliver these packets to the host without losing any. Adapter reliability was measured at various packet sizes. A burst of 5 seconds without packet loss was considered to be of "infinite" duration.

Table 2 shows that the DEMNA adapter can survive a significant burst of activity without packet loss. Such activity is unlikely, but possible, depending on the application being run and on the network configuration.

Table 2  DEMNA Receive Burst Tolerance
______________________________________________________________________________
Packet
Length     Burst Virtual   Burst Virtual     Burst Physical   Burst Physical
(bytes)    (packets)       (microseconds)    (packets)        (microseconds)
______________________________________________________________________________
64         3250            221661            3843             262106
72         5116            381677            11591            864741
80         9917            803321            Infinite         Infinite
88         Infinite        Infinite          Infinite         Infinite
______________________________________________________________________________

This testing does not measure how host software performs buffer allocation for a user application or for the adapter as a whole. For the latter, the DEMNA adapter accounts for any lack of buffering by the host by not discarding a packet if a buffer is not immediately available. Instead, it waits up to three seconds for the host to supply a buffer.

Visibility

A system user looking at the operation of the network sees three areas of complexity: the system software, the network controller, and the network. When everything is working well, there is little need to look at any of these areas except perhaps to predict future operation (by extrapolating network utilization or system usage) or to confirm that the system is indeed running well. When the system is not running well, visibility into these areas is crucial to understanding what is wrong and how to correct it. The console and monitor facilities were built into adapter firmware from the outset; we knew that the visibility was crucial to adapter debugging and verification and would later be helpful to users.

System Operation

The console displays XMI utilization as apportioned among the XMI devices. This data comes from sampling done by the firmware of the "last XMI node active on the bus." From this, the user can estimate total XMI utilization.

The console also displays buffer occupancy on the adapter for transmit and receive, user configuration as to protocol type and characteristics, buffer availability counters, and host interrupt counters. This data indicates how the system is running, i.e., whether sufficient buffers are allocated to the device and to each user of the device. These counters also indicate how much attention the driver is paying to the adapter. For example, if the system is not tuned properly, the adapter may be generating fewer interrupts than normal (because queuing delays are affecting the system operation). These queuing delays can be seen in the firmware counters, which monitor the depth of adapter queues and the ability of the adapter to give receives to the host, i.e., buffering on the adapter has been used to compensate for queuing delays in the host.

Adapter Operation

When the adapter is not malfunctioning, visibility into adapter utilization is important. The console displays program counter (PC) sampling results for the firmware, showing how busy the adapter is and where time is being spent. When looking at the I/O subsystem as a whole, it is important to know how much the adapter is contributing to queuing delays, buffer occupancy, and added latency. This adapter operation can be seen by looking at how busy the adapter is and how many buffers it has outstanding.

For adapter failure or problems on the XMI, the console displays error information which has been saved in EEPROM. This error data consists of fatal error context, data transfer or XMI error context, and results of self-test.

Network Operation

The DEMNA device normally sees all packets on the wire (excluding packets less than 64 bytes in length [runt packets] and collision fragments). When looking at the adapter operation through the console facility, the user sees current network utilization and network error information. For transmit errors, the console displays the number of errors and date and time of the last occurrence. For receive errors, the console displays the number of errors, date and time, source address, and protocol type. Additionally, receive errors that are not counted (because they do not pass receive filtering) are displayed. For example, error information is displayed for a node generating packets with CRC errors regardless of the destination of these packets.

The console also provides the command SHOW NETWORK to display network utilization in node addresses and protocol types. For this command, the receive firmware calls a monitor facility routine for each packet seen on the wire. This routine maintains statistics for each source and destination node address, consisting of the number of packets and the number of bytes. At three-second intervals, the console calls a monitor routine which adds statistics over the prior interval to cumulative data for each node, collects top nodes and protocol data, and clears the interval data to prepare for the next three seconds of monitoring. Figure 3 represents a sample network monitoring display.

Debug Tools

The monitor task provided other debugging functions during adapter debugging and internal field test. These functions are not visible features in the finished product. However, they are extensions to the functionality and illustrate the benefits of visibility into the adapter.

o Packet tracing. This function allowed a node to scan the receive stream for packets with selected source and destination addresses and protocol types. Either the packet header or the entire packet was saved for matching packets. This function was used extensively during initial debugging for validating transmit functionality. Later it was used for validation of MOP and related functionality by creating trace files on a known good node. We then ran functional
This The console also provides adapter operation can be the command SHOW NETWORK to seen by looking at how busy display network utilization the adapter is and how many in node addresses and buffers it has outstanding. protocol types. For this For adapter failure or command, the receive problems on the XMI, the firmware calls a monitor console displays error facility routine for each information which has packet seen on the wire. been saved in EEPROM. This routine maintains This error data consists statistics for each source of fatal error context, and destination node data transfer or XMI error address, consisting of context, and results of the number of packets and self-test. the number of bytes. At Network Operation three-second intervals, the console calls a monitor routine which adds statistics over the prior interval to cumulative data 16 Digital Technical Journal Vol. 3 No. 3 Summer 1991 Design of the DEC LANcontroller 400 Adapter for each node, collects o Packet tracing. This top nodes and protocol function allowed a node data, and clears the to scan the receive interval data to prepare stream for packets with for the next three seconds selected source and of monitoring. Figure 3 destination addresses represents a sample network and protocol types. monitoring display. Either the packet header Debug Tools or the entire packet was saved for matching The monitor task provided packets. This function other debugging functions was used extensively during adapter debugging during initial debugging and internal field test. for validating transmit These functions are not functionality. Later it visible features in the was used for validation finished product. However, of MOP and related they are extensions to functionality by the functionality and creating trace files illustrate the benefits on a known good node. of visibility into the We then ran functional adapter. 
A user program, scripts through a test XNAMON, was written to generator, which used access the following the traffic generator functions. on one node to send a o Traffic generation. test packet to the node It is difficult to under test. The command generate heavy loads on and the response were an adapter, particularly traced by the trace node because of logistics. and the test program Other systems are needed collected the trace with enough processing data and compared it power to generate the against known good data. load. Using the XNAMON Packet tracing was also program, only one system used to verify packet was needed. XNAMON was filtering by devising a run on it to direct test program that could other adapters to start up particular generate traffic to user configurations and another node with a loop back any packets particular packet size received. at a specified rate. o Adapter test. The Since traffic generation ability to exercise a could be done regardless module under stress of system state (except was critical to adapter for power on), there was hardware verification. always a good supply of The functionality traffic generators. in question was the Ethernet subsystem and the XMI interface through the gate array. The monitor facility Digital Technical Journal Vol. 3 No. 3 Summer 1991 17 Design of the DEC LANcontroller 400 Adapter provided this test o The firmware can be functionality by doing changed easily (bug MOP loopback operations fixes or changes in to another node while functionality), thus doing various data reducing long-term transfer operations maintenance and support to host memory. Data costs. Also, changes compares were done on can be made in the field completed transactions by a firmware upgrade to validate data rather than requiring integrity. The XNAMON module rework at a program provided the manufacturing site. interface for this o By designing in the function and the remote firmware, designers can display of its results. 
avoid software driver o Remote debugger. complexity and the The access to DEMNA necessity of hardware internals allowed remote redesign. adapter memory dumps o The firmware can provide and remote inspection powerful debugging of data structures while mechanisms and tools. the adapter was running. o The firmware is very flexible. Changes Conclusion to support hardware The DEMNA adapter meets problems or additional the requirements of the VAX off-load of host 6000 and VAX 9000 systems. computes can be In fact, the performance considered late in for small packets exceeds the design cycle. This the capability of these may also allow new systems. For larger port architecture and packets, Ethernet bandwidth addressing changes for is the limiting factor. Our creating new products. experience illustrates o Firmware designs allow some advantages and extensive functionality disadvantages of choosing a for lower product and firmware-based design over development cost than a an interface implemented total hardware design. entirely in hardware. o Firmware designs allow Advantages of a Firmware- the hardware to be based Design released earlier in the The advantages of designing development cycle. an adapter in firmware are Disadvantages of a as follows: Firmware-based Design o The firmware can usually off-load host computes by doing more pre- processing. 18 Digital Technical Journal Vol. 3 No. 3 Summer 1991 Design of the DEC LANcontroller 400 Adapter The disadvantages of Increased host processor designing an adapter in speed moves the I/O firmware are: bottleneck from the host o The adapter is generally to the I/O subsystem. To more expensive, supply the I/O needs, the considering the cost I/O subsystem must provide of a microprocessor faster media, e.g., fiber subsystem with enough distributed data interface computes for the job. (FDDI) in the near term, or multiple connections o The adapter is slower to slower media (such as in terms of latency. Ethernet). 
The I/O adapters Some applications may will be expected to provide be more sensitive than significantly greater others, given the same throughput with a smaller throughput, but may have adapter contribution to slightly larger service latency. The effective times per packet. The performance of the system effect can be viewed will be more sensitive to in terms of buffer latency. For example, an occupancy: an adapter application using a single with lower latency may threaded command/response utilize, on average, few protocol is extremely buffers. dependent on the amount o The approach is of service time through the not feasible for I/O subsystem at each end. transmission media As the processing speed much faster than increases, application Ethernet, because the overhead is reduced performance requirements and throughput becomes of the microprocessor dominated by the service become extreme or the time of the adapter and the hardware assists for the transmission time. microprocessor become Faster processors too complex and costly. place a greater burden Future Directions on the system bus and Several characteristics I/O interface, which distinguish future necessitates a simpler anticipated system design bus protocol. This might from current systems (such consist of eliminating as the VAX 6000 and VAX costly functionality 9000 systems). such as byte masking and interlocks. However, a o Increased host processor simpler interface to the power I/O adapter will require o Simplified bus design considerable change to the port protocol to ensure its o Increased I/O bandwidth efficiency. requirements Digital Technical Journal Vol. 3 No. 3 Summer 1991 19 Design of the DEC LANcontroller 400 Adapter The characteristics needed length validation, and in future adapters are as maintaining counters follows: data. Improving packet o Greater throughput. 
This filtering by host means more connections software would eliminate to a slower medium, the reason for placing such as a single adapter this function on the supporting multiple adapter in the first Ethernet connections. place. Or it means a faster Filtering in host medium. Additionally, software is considerably configurations using more difficult than Ethernets as point-to- in the adapter. The point links will be more difficulty comes common, thus implying from the need to deal a heavier load on each with extreme user Ethernet. configurations. The o Simpler host interface. DEMNA is bounded by This is necessitated limiting the number by the simpler bus of users and node protocol. Bus overhead addresses. The extreme should be minimized, cases must still be done which includes the by host software. elimination of such functionality as page Acknowledgements table access for virtual The authors would like to address translation. acknowledge the following Also, the bus transfer members of the DEMNA design size used by the adapter team: Barbara Aichinger, should be compatible Keith Bilafer, Mark with the basic data size Cacciapouti, Don Dossa, of the system to avoid Linda Duffell, Bernie cache thrashing and Hall, Jeff Huber, Helen unnecessary read-modify- McGreal, Jonathan Mooty, write transactions. Dave O'Keefe, David Oliver, o Reduced latency. The Brian Parr, Art Singer, adapter should minimize Andy Stewart, Fred Templin, its contribution to Vicky Triolo, Ed Tulloch, transmit and receive and Don Villani. latency. This may mean reducing some of the General References functions done by an intelligent adapter DEC LANcontroller 400 on receive, in order Programmer's Guide to speed delivery to (Maynard: Digital Equipment the host after packet Corporation, Order No. reception is complete. EK-DEMNA-PG-001, 1990). These functions include packet filtering, handling of maintenance operations packets, 20 Digital Technical Journal Vol. 3 No. 
3 Summer 1991 Design of the DEC LANcontroller 400 Adapter D. Boggs et al., "Measured Capacity of an Ethernet: Myths and Reality," DEC LANcontroller 400 Proceedings of SIGCOMM Console User's Guide '88 (ACM SIGCOMM, 1988): (Maynard: Digital Equipment 222-234. Corporation, Order No. The Ethernet: A Local EK-DEMNA-UG-001, 1990). Area Network, Data Link D. Mirchandani and Layer and Physical Layer P. Biswas, "Ethernet Specifications, Version Performance of Remote 2.0 (Digital Equipment DECwindows Applications," Corporation, Intel Digital Technical Journal, Corporation, and Xerox vol. 2, no. 3 (Summer Corporation, Order No. 1990): 84-94. AA-K759B-TK, 1982). Digital Technical Journal Vol. 3 No. 3 Summer 1991 21 ============================================================================= Copyright 1991 Digital Equipment Corporation. Forwarding and copying of this article is permitted for personal and educational purposes without fee provided that Digital Equipment Corporation's copyright is retained with the article and that the content is not modified. This article is not to be distributed for commercial advantage. Abstracting with credit of Digital Equipment Corporation's authorship is permitted. All rights reserved. =============================================================================
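
Illustrative sketch (not part of the original article). The interval accounting described under Network Operation, in which a routine is called for each packet seen on the wire and a second routine folds interval statistics into cumulative data at three-second intervals, might be modeled roughly as follows. This is a minimal sketch in Python, not the DEMNA firmware: the class name NetworkMonitor, the methods record_packet and end_interval, and the choice to rank top nodes by byte count are all assumptions made for illustration, and the per-protocol statistics are omitted for brevity.

```python
from collections import defaultdict


class NetworkMonitor:
    """Hypothetical model of per-node traffic accounting in fixed intervals."""

    def __init__(self):
        # node address -> [packets, bytes] seen during the current interval
        self.interval = defaultdict(lambda: [0, 0])
        # node address -> [packets, bytes] summed over all closed intervals
        self.cumulative = defaultdict(lambda: [0, 0])

    def record_packet(self, source, destination, length):
        """Called once per packet seen on the wire (assumed interface)."""
        for node in (source, destination):
            counts = self.interval[node]
            counts[0] += 1
            counts[1] += length

    def end_interval(self, top_n=3):
        """Fold interval counts into the totals, report top nodes, then reset."""
        for node, (packets, nbytes) in self.interval.items():
            total = self.cumulative[node]
            total[0] += packets
            total[1] += nbytes
        # Rank nodes by bytes moved during the interval that just ended
        # (the ranking criterion is an assumption, not taken from the article).
        ranked = sorted(self.interval.items(),
                        key=lambda item: item[1][1], reverse=True)
        top_nodes = [(node, packets, nbytes)
                     for node, (packets, nbytes) in ranked[:top_n]]
        # Clear interval data to prepare for the next monitoring period.
        self.interval.clear()
        return top_nodes
```

In this sketch the caller is expected to invoke end_interval once per monitoring period (three seconds in the article); keeping the interval and cumulative tables separate means the per-packet path only ever touches the small interval table.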