This has to be somewhat vague for trade secret reasons, but during the mid-1990s Polaroid gathered extensive data on network problems as part of a total system reliability tracking effort. The systems were medical printing systems, and data was gathered for video capture units, print servers, and printers. The networks were hospital networks, all with wired Ethernet connections. Several hundred devices were involved, and measurements were taken at a couple hundred hospitals over an extended period. Some of the units were stationary, and some were mobile.
The network traffic is noticeably different from ATNA-Syslog, so these results would need some adjustment for that purpose. The network traffic for printing was DICOM, "lpr", and "ftp" traffic. The observations were made over a period of several years, and more than 10 million transactions were recorded. Transactions were almost entirely print transactions, with a very few print status queries. The observations were:
- Network problems were entirely insignificant.
- Stationary systems reported no network problems. TCP/IP handled whatever happened just fine. (I should note that a down server or printer was not considered a network problem if this was detected before the print data was sent. The sending systems would queue their prints and wait for the server or printer to be restored to service. Down servers or printers were considered a different kind of reliability problem.)
- Mobile systems detected network problems at about one per 250,000 transactions. These were all the kind of failure that I described earlier. The application acks did help identify problems. At a problem rate of 4 ppm we decided it was reasonable to deal with these problems by just re-printing whenever a possible problem was observed. Four extra prints per million is an insignificant problem and not worth extensive engineering work.
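The arithmetic behind the 4 ppm figure can be checked with a minimal sketch. The function names are mine, and the one-million-transaction volume is an illustrative assumption, since the mobile share of the 10 million recorded transactions is not stated above.

```python
def failures_per_million(transactions_per_failure: int) -> float:
    """Convert 'one failure per N transactions' into parts per million."""
    return 1_000_000 / transactions_per_failure


def expected_reprints(transactions: int, transactions_per_failure: int) -> float:
    """Expected number of re-prints if every detected failure triggers one re-print."""
    return transactions / transactions_per_failure


# One failure per 250,000 transactions, as observed for mobile systems.
mobile_ppm = failures_per_million(250_000)

# Hypothetical volume: one million mobile print transactions (assumed, not measured).
reprints = expected_reprints(1_000_000, 250_000)
```

At that rate, a re-print policy costs only a handful of extra prints per million transactions, which is the basis for not doing further engineering work.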
ATNA Syslog is a different network use pattern, and the motivation for directed attacks is different. Hospital network characteristics may have changed since the 1990s, and this data is not a measurement of cross-enterprise data traffic. This data is relevant in characterizing the internal characteristics of the hospital environment, and can be taken into consideration when doing reliability design, FMEA, and similar analysis for ATNA product design and installation design.
ATNA + SYSLOG is good enough. I'm aware that there are calls for healthcare to move away from SYSLOG and invent its own protocol with an application-level acknowledgement.
Posted by: Cameron | February 02, 2012 at 08:08 AM