The 'Security Digest' Archives (TM)


ARCHIVE: TCP-IP Distribution List - Archives (1986)
DOCUMENT: TCP-IP Distribution List for December 1986 (186 messages, 106384 bytes)
SOURCE: http://securitydigest.org/exec/display?f=tcp-ip/archive/1986/12.txt&t=text/plain
NOTICE: securitydigest.org recognises the rights of all third-party works.

START OF DOCUMENT

-----------[000000][next][prev][last][first]----------------------------------------------------
Date:      Mon, 1 Dec 86 21:25:58 pst
From:      David Cheriton <cheriton@pescadero.stanford.edu>
To:        tcp-ip@sri-nic.ARPA
Subject:   VMTP Checksum Discussion
Dear TCP-IPers:
  I have been working on a request-response transport protocol (VMTP) for
possible adoption in the Internet.  This work is being done as part of my
participation in the IAB End-to-end task force, chaired by Bob Braden.
  Bob has asked me to include this mailing list in on some discussion of
design choices for checksums. Here is my side of the story.

                  VMTP Checksum Controversy

There appear to be three basic issues re the VMTP checksum, or more generally
a checksum for the next generation of transport protocols, namely:
1) What size should the checksum field be?
2) What algorithm should be used?
3) Where in the packet should the checksum be located?
I spent some time talking with John Gill and Marty Hellman  of the Stanford
Information Systems Lab. as well as thinking about it and some experimentation.
(I also got some good feedback from others in my group, including Steve
Deering, Michael Stumm, Ross Finlayson.)

1) Checksum Size

The checksum is a probabilistic device, clearly.  There are an enormous
number of packets that map to the same checksum for either 16 or 32 bits of
checksum.
Let R be the packet rate (on average) in packets per second.
Let P be the per-packet probability of a corruption that goes undetected
by the lower levels.
As long as P is non-zero, there is some time range T at which
T*R*P becomes quite large, i.e. we have to expect some packets
that are erroneous being delivered by the IP level. I'll refer to these
as nasty packets.
If we view the corruption as randomly assigning the packet a new checksum
from the space of possible checksums, it is clear that, with a 16-bit
checksum, there is a 1 in 65,536 chance that the corrupted packet will have
the same checksum.  With a 32-bit checksum, it is one in 4 billion.
The key ingredient we expect to dramatically change over the next 5 years
is the packet rate, for channels, switching nodes and gateways.
It is reasonable to expect a significant increase in traffic, number of
gateways as well as packet rate per gateway. Just to play with some numbers,
let us assume that by 1991 there are 500 gateways, each handling 200 packets
per sec., with probability 10**-7 per packet of errors undetected by lower
levels (i.e. the combination of all hardware and software from one endpoint
to another). That means there are 36 packets per hour with undetected errors
at the lower levels.
Assuming a random mapping onto checksums, with a 16-bit checksum one might
expect a packet whose error is not detected by the transport-level checksum
every 75 days or so.  Maybe this is OK?  However, suppose these numbers are
low?  For instance, we occasionally get bursts of packets at 2000/sec on
our Ethernet.  The future parameters and use of communication could easily
push the time between undetected errors down by an order of magnitude
(7 days?).

Of course, one could argue that such a packet may still fail the addressing
checks, the sequence number checks, etc. in the protocol.  However,
our measurements of VMTP indicate 1/2 of packets are minimal size and 1/4
are maximal size (for the Ethernet at least).  With small packets,
the probability that a random modification hits the data rather than the
header is .5, whereas for large packets it is .95.  Thus, this argument
reduces the risk by far less than a factor of two.  Moreover, it is
disquieting to fall back on these random checks to help us when we expect
that the "real" error detection mechanism is going to fail.

One might also argue that my assumed error rate per packet is too high.
However, the probability of correct operation is the PRODUCT of
the probabilities for the sequence of components involved in a packet
transmission and delivery - so this error rate could get large fast.
Moreover, one requires a more careful analysis to get a tighter bound.
Can anyone offer a significant improvement in bounds without making
some disconcerting assumptions??

In the same breath, one could argue that my analysis is far too optimistic.
For instance, if the checksum algorithm uniformly maps all possible
packets onto all possible checksums, then the fact that normal packets
may be dramatically skewed to particular types, such as ASCII text
packets to and from an important server, may mean that far fewer
corruptions will be detected.  In particular, a systematic hardware error
that corrupts a packet into a nasty packet (with correct checksum) is
far more likely to encounter the same packet (or similar enough)
than if random packets were being generated.

This "analysis" deals with pleasant, sunny day expected behavior.
A more frightening situation is dealing with burst errors from failing
components.  For instance, suppose a gateway or network interface fails
by curdling random portions of packets.  We could see 200 undetected
(below transport level) errors per second. Again, assuming random
mods, with a 16-bit checksum we can expect to get a corrupted packet
that passes the transport-level checksum every 5.5 minutes,
versus every 8.2 months with a 32-bit checksum.
If we consider a military situation where components may get wounded in action,
it is easy to conceive of several things generating garbage at the worst
possible time, given a very high exposure using a 16-bit checksum.
The 32-bit checksum appears to give a needed margin of safety,
especially since a variety of factors could cause one to effectively lose a
factor of 2 or more.  (As John Gill asked, are you sure 32 bits is enough?!)

Overall, after studying this issue in brief and thinking thru the above,
I am even more convinced we need to get to at least a 32-bit checksum.
The increased expected packet rate is the key thing that is making 16-bit
checksums unacceptable.

2) Checksum algorithm

There appears to be a deficiency of theory behind checksums that are suitable
for software implementation.  Calculating 32-bit CRC's is infeasible - either
too slow or requires an enormous lookup table.  The ideal algorithm is:
i) very fast to calculate and check in software.
ii) can be shown to detect common types of packet corruption.
iii) can be shown to be unupdateable "thru" an encryption scheme,
  i.e. an intruder cannot update the (encrypted) checksum field to compensate
   for modifications to other encrypted data.

Lots of schemes can be proposed but the more complex, the more difficult it is
to analyze with respect to (ii) and (iii).
The current TCP algorithm, assuming we extend it to a 32-bit version,
does not detect appending or inserting 0's in a packet,
or the reordering of 16-bit (or 32-bit) units. These seem like not unlikely
failures in a large internetwork with gateways of different byte orders,
flakey hardware, etc.

Another candidate is "add-and-rotate" as used by RDP, and suggested
(without guarantee) by John Gill. It also has the property that any single
32-bit word mod cannot leave the checksum calculated correctly (also true of
the TCP algorithm). 
The "rotate" provides some detection of zeros and reordering.  However,
there are clearly occasions where it will fail to detect reordering of
blocks of 32 long words, or insertion of 32 long words of zeros.
(It also requires some assembly language for efficient implementation,
since most languages lack a rotate operation.)  Other properties are not
clear to me.

Another algorithm is the ISO transport algorithm, which is easily extended
to 32 bits or 64 bits, i.e. R = Sum(1..L) Wi and S = Sum(1..L)(L-i+1)Wi,
where Wi is the ith (32-bit) word and there are L words in the packet.
According to Fletcher's analysis (See ref. below.), this algorithm
detects all single bit errors and detects all but .001526 percent of all
possible errors, same as CRC.  It also detects all burst errors of length
C bits, where C is the number of bits in the checksum. Fletcher claims:
"it represents a very reasonable tradeoff between effectiveness and efficiency".

I did some basic performance tests on these algorithms.
The following gives the performance for the three algorithms for checksumming
a VMTP packet with 1024 bytes of data (assumed to be long word aligned).

Algorithm             MC 68010 (10 MHz)    MC 68020 (16 MHz)
add-and-rotate              916 usec                 320 usec.
ISO Transport (32-bit)      914 usec                 336 usec.
ISO (16-bit)               1245 usec                 446 usec.
TCP                         890 usec                 323 usec.

The TCP figure is really not too accurate - it lacks a test for carry on
each 32-bit add. The first ISO version uses 32-bit adds to develop the two sums
using 2's complement arithmetic (and then I just store the sum of the two
sums - a proper scheme here would require 64-bits for the checksum field.)
(Fletcher notes that the 1's complement version has slightly better properties
but is more expensive to compute.)
The second one uses 16-bit adds, which means a register swap is necessary as
well as doubling the number of adds required. The results here appear to
contradict Jon Crowcroft's comments on performance.  Perhaps his factor of
4 comes from a poor implementation??
The ISO 32-bit version and the TCP version are in (portable) non-asm C,
whereas the others are done in assembly language. An assembly version
of ISO 32-bit appears about 80 usecs faster on a 68020.

All these algorithms have the property of requiring only one
memory reference (to get each 32-bit word out of the packet) per word
in the packet, which is the minimal memory reference cost, given 32-bit
data paths.  The numbers seem to indicate that a small number of instructions
per long word (that do not increase # memory references) make no significant
difference in performance of the algorithm.  This is particularly true on
machines like the 68020, where the inner loop fits into the on-chip
instruction cache, and this is the future. For instance, the ISO and TCP
versions differ by essentially one addition, and thus by 13 usecs
(for 1108 register-register adds!)

Thus, I argue that there is no significant performance penalty to adding a
few non-memory reference instructions for a rotate or extra addition
to get better detection capabilities.

3) Location of the checksum

I propose to place the VMTP checksum into a trailer portion of the packet.
This allows a high-performance network interface, such as the one we are
developing here at Stanford, to add the checksum as the packet is transmitted
to the network, avoiding 2 extra memory references.  It does not appear to
add any additional software overhead without such an interface, unless one
is trying to use a relatively simple DMA net interface and DMA packets
on transmission and reception without copy.  Since this is in general not
feasible without scatter-gather, and with scatter-gather the trailer presents
no problem, I don't think this is a real disadvantage.

My vision of the future is of network interfaces that implement much or
all of the transport level, providing a service similar to the disk interface.
I.e. one does not generally worry in software about errors between main memory
and the disk.  Note that the error checking is still end-to-end at least
between network interfaces in the end machines.  Also, with request-response
protocols, the response is an application-level end-to-end acknowledgement of
the request.

For these interfaces, performance is determined by the number of memory
references.  The next generation of network interfaces must minimize
the number of memory references to transmit and receive packets or they
will be slow.  In fact, one of the big challenges (the only one?) is to
implement "intelligence" on the network interface without being killed
by the delay of extra copies (i.e. memory references.)
  Clearly, the value of the checksum is not known until the entire packet
has been scanned.  Putting the checksum at the end allows the checksum
to be calculated as part of transmitting the packet from the network
interface to the network.  The only other copy into which it might be
incorporated, between the host and the network interface, has several
problems.  First, the data is not necessarily packetized at that point,
and definitely not in our network interface design.  Second, the data rate
on that path can be so high (and must be in multiprocessors) that fitting
a checksum calculation into the copy is difficult.  Moreover, one may
want to do encryption at the same time.   For instance, we move data
at 320 megabits/sec. in our design for the VMEbus.  Much higher speeds
are planned.

Currently, we are dealing with very "1st" generation network interfaces
that are dealing with multi-megabit data rates.  I think the next generation
can be much more reliable, efficient, and powerful.  We need to design
the transport protocol such that it does not introduce unnecessary
difficulties for such interfaces.
Thus, I feel this minor modification to the protocol supports the right way
of doing network interfaces, and that it aids these interfaces in dealing
with an intrinsic problem, not one that will go away with a few more bells
and whistles on the interfaces.

One issue a trailer checksum raises is alignment relative to encryption.
I have tried to keep the header as a multiple of 64 bits.
With a null data segment and 32-bit checksum, the VMTP packet is still a
multiple of 64 bits.  If we force the data segment to also be a multiple
of 64 bits, this all remains true.  This is what I propose to do.

In summary, I recommend we go for a 32-bit checksum and use the ISO
checksum algorithm, modified to use 32-bit units.
The ISO checksum is a standard and has an article analyzing its
properties (to some extent). (If we also start the checksum
using a non-zero initial value, it should do well at detecting a packet
of all zeroes.) Finally, we place the checksum in a single 32-bit trailer
portion of the packet.

Comments?

David Cheriton

Ref:

J.G. Fletcher, An Arithmetic Checksum for Serial Transmission,
IEEE Trans. on Com. COM-30 (1) 247-251, Jan, 1982.
-----------[000002][next][prev][last][first]----------------------------------------------------
Date:      Tue, 2-Dec-86 04:54:02 EST
From:      juke@seismo.CSS.GOV@tutctl.UUCP
To:        mod.protocols.tcp-ip
Subject:   Symbolics-VAX/Sun -connection?

I'm mailing this for a friend of mine:

--------------------------------------------------------------------------

Information wanted

How does one implement applications that connect Symbolics and other
computers via ethernet using TCP/IP or Decnet protocols?

We need to join together numerical application software that runs on
VAXes and Suns with our interface experiments that are running on a Symbolics
workstation.  We need to transfer information about results and control
processes via ethernet.  Both TCP/IP and Decnet drivers are available on
both ends.

It seems that the connection described above is not impossible to implement,
but we don't want to reinvent the wheel.

		
		Jarmo Salmela
		Tampere University of Technology
		P.O.Box 527
		SF-33101 Tampere
		Finland

--------------------------------------------------------------------------

You can send your replies also to the following address:

		UUCP: ...!mcvax!tutctl!juke

-----------[000003][next][prev][last][first]----------------------------------------------------
Date:      Fri,  5-DEC-1986 10:46 EST
From:      RANDY MARCHANY  <MARCHANYRC%VTVAX3.BITNET@WISCVM.WISC.EDU>
To:        <tcp-ip@sri-nic.arpa>
Subject:   IBM VM and DEC VMS TCP-IP question
We have over 10 VAX systems and an IBM 3090 model 200. The majority of
the Vaxes run VMS 4.4 (although we have one running Ultrix) and the IBM
systems run VM/SP HPO 4.2 with CMS release 4. We are looking at some
software/hardware products that will let us establish a TCP/IP network
here at Va. Tech. We are planning to connect the 3090 to SURANET and
eventually we'll do the same for our Vax systems.
        I am looking at the WISCNET package and at Fibronics (formerly
Spartacus?) KNET 200. I have very little info on the Wollongong software
for the Vax systems. I would like to hear from anyone who has had
any experience with any of these packages. Any information on reliability,
vendor support, installation problems, your opinion of the products, etc.
would be greatly appreciated. You can respond to this list or to me
directly. My name and address are:
        Randy Marchany
        Vax Systems Manager
        Va. Tech Computing Center
        116 Burruss Hall
        Blacksburg, VA 24061            703-961-6327
BITNET: MARCHANYRC@VTVAX3, MARCHANYRC@VTVAX5, HAND2@VTVM1

Also, if there are any other products out there that will perform the
same functions, I'd like to hear about them. Thanks again.
-----------[000005][next][prev][last][first]----------------------------------------------------
Date:      Fri, 5-Dec-86 17:40:00 EST
From:      bjp%ulana@MITRE-BEDFORD.ARPA (Barbara J. Pease)
To:        mod.protocols.tcp-ip
Subject:   ULANA STATEMENT OF WORK AND SYSTEM SPECIFICATIONS NOW AVAILABLE


     This note is to announce that a version of the ULANA Statement of Work
(the SOW) and a new version of the ULANA System Specifications are now
available in the guest account at mitre-b-ulana.arpa.  The following
information should be enough to allow people to get a copy.

      1. use ftp
      2. user is guest
      3. password is anonymous
      4. pathname for sow is /usr/mitre/guest/ulana.sow
      5. pathname for the spec is /usr/mitre/guest/ulana.spec
      6. internet number for mitre-b-ulana.arpa is 192.12.120.30
      7. send any mail to bjp@mitre-b-ulana.arpa


					Sincerely,
					bj Pease

-----------[000006][next][prev][last][first]----------------------------------------------------
Date:      Sat, 06 Dec 86 18:19:44 -0500
From:      Craig Partridge <craig@loki.bbn.com>
To:        tcp-ip%sri-nic.arpa@SH.CS.NET
Subject:   4.2/4.3 TCP and long RTTs

    I'm in the midst of doing comparisons between an RDP implementation and
the 4.2/4.3 TCP implementations and have run into a problem which I'm hoping
someone else can shed light on.

    I'm running tests on two machines, a VAX 750 running 4.3 and a SUN
workstation running 4.2.  The two machines are on the same Ethernet
and use the same gateway.  If I set up an experiment to test behaviour over
paths with long network delays (for example, bouncing packets off
Goonhilly), the TCP connections are established and then typically
fail part way through the transfer.  I don't understand this because
the RDP connections work just fine, and typically complete in 1/4 the
time it takes for a TCP connection to send about 20% of the data and
faint.  The experiment generally involves passing 50-100 segments of anywhere
from 64 to 1024 bytes to the protocols to send.  This is on weekends so
the delays aren't that long.

    The question I'm trying to answer is whether the problem is in the
RDP implementation (what anti-social things could it be doing to maintain
that connection?), or the TCP implementation (what might it be doing wrong
to die where another implementation succeeds?).  If I can, I'd like to
discourage invective.  I'm simply trying to figure out why this is happening
so I can identify and fix the problem and do a comparison between the two
implementations/protocols.  (And soon -- hair pulling over this problem
is beginning to threaten the health of my scalp and beard).

    General information on the RDP implementation:  it will retransmit
up to 10 times and calculates the round-trip time based on the first
packet sent with the caveat that it ignores round-trip times of segments
with sequence numbers lower than those of segments whose round-trip time
has already been computed (this feature is an experiment which may not
stay).  The maximum RTT is 2 minutes, the minimum is 2 seconds.

Craig
-----------[000008][next][prev][last][first]----------------------------------------------------
Date:      Sat, 6-Dec-86 22:20:09 EST
From:      hedrick@TOPAZ.RUTGERS.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re:  4.2/4.3 TCP and long RTTs

I suspect that what you are seeing is the fact that Sun TCP (to be
fair, most 4.2 based TCP's) doesn't perform well with bad connections.
A number of sites have replaced Sun's TCP modules with their 4.3
equivalents.  This makes a dramatic difference in dealing with
difficult cases.

-----------[000009][next][prev][last][first]----------------------------------------------------
Date:      Sun, 07 Dec 86 09:15:07 -0500
From:      Craig Partridge <craig@loki.bbn.com>
To:        hedrick%topaz.rutgers.edu@sh.cs.net
Cc:        tcp-ip%sri-nic.arpa@loki.bbn.com
Subject:   clarification

> I suspect that what you are seeing is the fact that Sun TCP (to be
> fair, most 4.2 based TCP's) doesn't perform well with bad connections.
> A number of sites have replaced Sun's TCP modules with their 4.3
> equivalents.  This makes a dramatic difference in dealing with
> difficult cases.

    Re-reading my note I realize it isn't clear.  I'm seeing the behaviour
under both 4.2 and 4.3 (though I haven't tested the 4.3 implementation
as extensively).  I did the port to 4.3 for precisely the reasons you
mention and now find myself stymied.

Craig
-----------[000010][next][prev][last][first]----------------------------------------------------
Date:      Sun, 7-Dec-86 08:44:03 EST
From:      van@LBL-CSAM.ARPA.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re: 4.2/4.3 TCP and long RTTs

What you observe is probably poor tcp behavior, not antisocial rdp
behavior.  If the link is lossy or the mean round trip time is 
greater than 15 seconds, the 4.3bsd tcp throughput degrades rapidly.
For long transfers, a link that gives 2.7KB/s throughput with a
1% loss rate, gives 0.07KB/s throughput with a 10% loss rate.  (As
appalling as this looks, 4.2bsd, TOPS-20 tcp, and some other major
implementations that I've measured, get worse faster.  The 4.3
behavior was the best of everything I looked at.)

I know some of the reasons for the degradation.  As one might
expect, the failure seems to be due to the cumulative effect of
a number of small things.  Here's a list, in roughly the order
that they might bear on your experiment.

1. There is a kernel bug that causes IP fragments to be generated
   and ip fragments have only a 7.5s TTL.

In the distribution 4.3bsd, there is a bug in the routine
in_localaddr that makes it say all addresses are "local".  In
most cases, this makes tcp use a 1k mss which results in a lot of
ip fragmentation.  On high loss or long delay circuits, a lot of
the tcp traffic gets timed out and discarded at the destination's
ip level. 

The bug fix is to change the line:
	if (net == subnetsarelocal ? ia->ia_net : ia->ia_subnet)
in netinet/in.c to
	if (net == (subnetsarelocal ? ia->ia_net : ia->ia_subnet))

I also changed IPFRAGTTL in ip.h to two minutes (from 15 to 240)
because we have more memory than net bandwidth.


2. The retransmit timer is clamped at 30s.

The 4.3 tcp was put together before the arpanet went to hell and
has some optimistic assumptions about time.  Since the retransmit
timer is set to 2 * RTT, an RTT > 15s is treated as 15s.  (Last
week, the mean daytime rtt from LBL to UCB was 17s.) On a circuit
with 2min rtt, most packets would be transmitted four times and
the protocol pipelining would be effectively turned off (if 4.3
is retransmitting, it only sends one segment rather than filling
the window).  When running in this mode, you're very sensitive to
loss since each dropped packet or ack effectively uses up 4 of
your 12 retries. 

I would at least change TCPTV_MAX in netinet/tcp_timer.h to a
more realistic value, say 5 minutes (remembering to adjust
related timers like MSL proportionally).  I changed the
TCPT_RANGESET macro to ignore the maximum value because I
couldn't see any justification for a clamp.


3. It takes a long time for tcp to learn the rtt.  

I've harped on this before.  With the default 4k socket buffers
and a 512 byte mss, 4.3 tcp will only try to measure the rtt of
every 8th packet.  It will get a measurement only if that packet
and its 7 predecessors are transmitted and acked without error. 
Based on trpt trace data, tcp gets the rtt of only one in every
80 packets on a link with a 5% drop rate.  Then, because of the
gross filtering suggested in rfc793, only 10% of the new
measurement is used.  For a 15s rtt, this means it takes at least
400 packets to get the estimate from the default 3s to 7.5s
(where you stop doing unnecessary retransmits for segments with
average delay) and 1700 packets to get the estimate to 14s (where
you stop unnecessary retransmits because of variance in the
delay).  Also, if the minimum delay is greater than 6s
(2*TCPTV_SRTTDFLT), tcp can never learn the rtt because there
will always be a retransmit canceling the measurement. 

There are several things we want to try to improve this
situation.  I won't suggest anything until we've done some
experiments.  But, the problem becomes easier to live with if
you pick a larger value for TCPTV_SRTTDFLT, say 6s, and improve
the transient response in the srtt filter (lower TCP_ALPHA to,
say, .7).


4. The retransmit backoff is wimpy.

Given that most of the links are congested and exhibit a lot of
variance in delay, you would like the retransmit timer to back
off pretty aggressively, particularly given the lousy rtt
estimates.  4.3 backs off linearly most of the time.  The actual
intervals, in units of 2*rtt, are:
  1  1  2  4  6  8  10  15  30  30  30 ...
While this is only linear up to 10, the 30s clamp on timers
means you never back off as far as 10 if the mean rtt is >1.5s.
The effect of this slow backoff is to use up a lot of your 
potential retries early in a service interruption.  E.g., a
2 minute outage when you think the rtt is 3s will cost you 9
of your 12 retries.  If the outage happens while you were
trying to retransmit, you probably won't survive it.

This is another area where we want to do some experiments.  It
seems to me that you want to back off aggressively early on, say
 1 4 8 16 ...
for the first part of the table.  It also seems like you want
to go linear or constant at some point; waiting 8192*rtt for the
12th retry has to be pointless.  The dynamic range depends to
some extent on how good your rtt estimator is and on how robust
the retransmit part of your tcp code is.  Also, based on some
modelling of gateway congestion that I did recently, you don't 
want the retransmit time to be deterministic.  Our first cut
here will probably look a lot like the backoff on an ethernet.


5. "keepalive" ignores rtt.

If you are setting SO_KEEPALIVE on any of your sockets, the 
connection will be aborted if there's no inbound packet for
6 minutes (TCPTV_MAXIDLE).  With a 2m rtt, that could happen
in the worst case with one dropped packet followed by one
dropped ack.  ("Sendmail" sets keepalive and we were having
a lot of problems with this when we first brought up 4.3.)

A fix is to multiply by t_srtt when setting the keepalive
timer and divide t_idle by t_srtt when comparing against
MAXIDLE.


6. The initial retransmit of a dropped segment happens, at
   best, after 3*rtt rather than 2*rtt.

If the delay is large compared to the window, the steady state
traffic looks like a burst of acks interleaved with data, an ~rtt
delay, a burst of acks interleaved with data and repeat.  4.3
doesn't time individual segments.  It starts a 2*rtt timer for
the first segment, then, when the first segment is acked,
restarts the timer at 2*rtt to time the next segment.  Since the
2nd segment went out at approximately the same time as the first
and since the ack for the first segment took rtt to come back,
the retransmit time for the 2nd segment is 3*rtt.  In the usual
internet case of 4k windows and an mss of 512, the probability of
a loss taking 3*rtt to detect is 7/8. 

The situation is actually worse than this on lossy circuits.
Because segments are not individually timed, all retransmits
will be timed 2*rtt from the last successful transfer (i.e.,
the last ack that moved snd_una).  This tends to add the time
taken by previous retransmissions into the retransmission time
of the current segment, increasing the mean rexmit time
and, thus, lowering the average throughput.  On a link with
a 5% loss rate, for long transfers, I've measured the mean time
to retransmit a segment as ~10*rtt.

The preceding may not be clear without a picture (it sure took
me a long time to figure out what was going on) but I'll try to
give an example.  Say that the window is 4 segments, the rtt is
R, you want to ship segments A-G and segments B and D are going
to get dropped.  At time zero you spit out A B C D.  At time R you
get back the ack for A, set the retransmit timer to go off at 3R
("now" + 2*rtt), and spit out E.  At 3R the timer goes off and you
retransmit B.  At 4R you get back an ack for C, set the retransmit
timer to go off at 6R and transmit F G. At 6R the timer goes off,
you retransmit D.  [D should have been retransmitted at 2R.]  Even
if we count the retransmit of B as delaying everything by 2R (in
what is essentially a congestion control measure), there is an
extra 2R added to D's retransmit because its retransmit time is
slaved to B's ack.  Also note that the average throughput has
gone from 8 packets in 2R (if no loss) to 8 packets in 7R, a
factor of four degradation.

The obvious fix here is to time each segment.  Unfortunately,
this would add 14 bytes to a tcpcb which would then no longer fit
in an mbuf.  So, we're still trying to decide what to do.  It's
(barely) possible to live within the space limitations by, say,
timing the first and last segments and assuming the segments were
generated at a uniform rate. 


7. the retransmit policy could be better.

In the preceding example, you might have wondered why F G were
shipped after the ack for C rather than D.  If I'd changed the
example so that C was dropped rather than D, C D E F would have
been shipped when the ack for B came in (unnecessarily resending
D and E).  In either case the behavior is "wrong".  The reason it
happens is because an ack after a retransmit is treated the same
way as a normal ack.  I.e., because of data that might be in
transit you ignore what the ack tells you to send next and just
use it to open the window.  But, because the ack after a
retransmit comes 3*rtt after the last new data was injected, the
two sides are essentially in sync and the ack usually does tell
you what to send next. 

It's pretty clear what the retransmit policy should be.  We
haven't even started looking into the details of implementing
that policy in tcp_input.c & tcp_output.c.  If a grad student
would like a real interesting project ...

------------
There's more but you're probably as tired of reading as I am of
writing.  If none of this helps and if you have any Sun-3s handy,
I can probably send you a copy of my tcp monitor (as long as our
lawyers don't find out).  This is something like "etherfind"
except it prints out timestamps and all the tcp protocol info.
You'll have to agree to post anything interesting you find out
though...

Good luck.

  - Van

-----------[000012][next][prev][last][first]----------------------------------------------------
Date:      8 Dec 1986 08:01-EST
From:      CERF@a.isi.edu
To:        craig@loki.bbn.com
Cc:        tcp-ip%sri-nic.arpa@SH.CS.NET
Subject:   Re: 4.2/4.3 TCP and long RTTs
Craig,

can you obtain traces of the TCP level exchanges to see whether the
death of the TCP connection is retransmission time-out related?


Vint
-----------[000013][next][prev][last][first]----------------------------------------------------
Date:      Mon, 8-Dec-86 08:35:40 EST
From:      walsh@HARVARD.HARVARD.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re:  4.2/4.3 TCP and long RTTs

By having a maximum RTT of 2+ minutes, your RDP connection will stay open
at times when the Berkeley UNIX system will arbitrarily close the connection
after 30 seconds (rather than just informing the application about the
problem).  You are also seeing the benefits of EACKs.

bob

-----------[000015][next][prev][last][first]----------------------------------------------------
Date:      Mon, 8 Dec 86 11:00:16 CST
From:      Linda Crosby <lcrosby@ALMSA-1.ARPA>
To:        TCP-IP@SRI-NIC.ARPA
Cc:        COOL@ALMSA-1.ARPA
Subject:   REPLYS TO TCP/IP - ETHERNET QUERY

Here is the consolidation of the meaningful replies to our query to
the TCP/IP mailing list.

(And before anyone asks..Mr. Savacool originated the query,
I was the 'pipeline' since I am on the TCP-IP mailing list and 
Mr. Savacool is not.)

Thank you all,

Linda J. Crosby
lcrosby@almsa-1

------------------------------------------

The original question was :

>     I am in the process of establishing a TCP/IP network that will
>run on Broadband ethernet.  The prototype installation will consist of
>4 mainframe computers (VAX 780) and 8 or 16 users using IBM-PC clones.  The
>highwater mark could be as many as 8 mainframe computers and 400 IBM-PC
>clone users.
>
>     I would like to know if 400 users, as described above, running the
>standard suite of DOD protocols (TCP/IP, TELNET, FTP, SMTP) can be
>supported on a single ethernet ?  How many connections would be
>reasonable before the network begins to degrade ?  Is anyone familiar
>with a broadband ethernet, similar to what we are proposing, that is
>configured with 400 connections ?  Is this reasonable ?
>
>     I would like to hear from anyone who has covered this ground
>before.  A successful installation elsewhere would be a big
>confidence builder.

     I would like to thank all those who were kind enough to answer.
The replies ranged from several people who felt compelled to point out
that I didn't know what I was talking about when I talked about
ethernet (broadband or baseband) to many people who shared some genuine
insight about their own experiences.

     To those who still think that ethernet is only baseband I suggest
you contact ChipCom Corp,  SYTEK,  DEC  or any of the other vendors now
selling the ethernet on broadband products that were approved by the
IEEE 802.3 committee on September 19, 1985.

     Why do I want to do ethernet on broadband and not just regular
baseband ethernet ?  Simple: we have a large investment in a
broadband cabling plant and it is not reasonable for us to lay new
cable; it is much more reasonable for us to hang transceivers off of
our existing cable to provide the same function.

     The following are the messages that I found addressed my original
question.  Once again thanks for the many replies.  We are going to do
some of this soon (government procurements take at least 6
months)  so I will have some real world experience within the year.

Bob Savacool
cool@almsa-1

[   The opinions expressed herein are my own.  I have no affiliation
with any company or any products mentioned. ]

--------------------------------------------------------------------
///////////////////////////////////////////////////////////////////
--------------------------------------------------------------------

Resent From: Cpt Brian Boyter  ut-ngp!boyter <boyter@ngp.utexas.edu>
From:  BOWDENPE%VTVM1.BITNET@wiscvm.wisc.edu

We have several large departmental Ethernets which we would like to
interconnect.  At the same time, we would like to provide Ethernet
capability throughout our campus.  One option which we are exploring
is to run Ethernet on our existing broadband cable system, which currently
supports video and Sytek LocalNet 20.
Because of the size of the resulting Ethernet, and because multiple protocols
will be used (TCP/IP, Lat-11, DDCMP), we believe we will also need to have
some bridges placed at critical locations.  It appears that there are several
broadband Ethernet, and Ethernet bridge products on the market.  Among the
vendors are Chipcom, Ungermann Bass, Applitek, DEC, and CMC.

I'd be interested in hearing other folks' experiences with these products.
Particularly helpful would be any recommendations or "gotchas" in configuring
such a network.

-------------------------------

From: John Lekashman <lekash@AMES-NASB.ARPA>

It's somewhat reasonable.  It all depends on what these users are going
to be doing.  If you intend to bring up remote file servers on vaxes
for pc's, and some sort of network file system, then you will probably
lose when the count of PC's gets above about 20.  If you intend to
use the pc's as terminals and login to the vaxes for random computational
needs, then the count can probably go into several hundred.  
This is also true if the primary traffic is going to mail between
users.

One approach that you can use that is extensible upward for a long time
is to properly subnet into groups.  This does mean some additional
hardware, and choosing the correct software.  For example, you could
group 25 to 50 pc's that are likely to have traffic between them
on one cable, along with an associated larger machine (incidentally,
a vax is only a mini, or at best a 'super-mini') and feed this
into a packet switch on the same cable to another cable that N other
such things exist on.  (In your case N becomes 8)  In this way,
high traffic flows along each of the separate ethernets, and lower
density of flow from group to group.  Vaxes with appropriate 
software, (eg 4.3 unix from berkeley is one such OS that we use here
for such a purpose) can function as a subnet gateway.  If you need
it, I can point you at several vendors who will sell packet switches,
about 10K each.  These can switch at about 3 megabits currently,
with 1000 byte packets, so it's not a terrible performance loss.

----------------------------------

From: Bob Napier <rwn@ORNL-MSR.ARPA>

We are assisting Lowry AFB (Denver) in evaluating/installing a broadband
network of up to 500 nodes for Office Automation.

I will be happy to provide you information on our experiences, although we
are just awaiting bids to an RFP.

-----------------------------------

From: Charles Hedrick <hedrick%topaz.rutgers.arpa@ALMSA-1.ARPA>

We have 3 DEC-20's, 3 Pyramids, and a 785 on one Ethernet.  I don't
think you will have any problem.  It's real hard to saturate
an Ethernet, unless you are using diskless machines, which do their
paging over the network.

-----------------------------------

From:           Jeffrey C Honig <$JCH%CLVM.BITNET@wiscvm.wisc.edu>

I'm planning on doing something similar but, at least initially, on a
smaller scale.  I'm planning on making extensive use of LANBRIDGE 100's
to separate traffic into groups with a backbone connecting them.  That
could increase the traffic you could handle.

I'm planning on using Chipcom modems on the broadband, how about you?
I've researched the matter fairly carefully and have a paper that
describes my research and plans.  I can mail you a copy if you are
interested.

-----------------------------------

From: Dan Lynch <LYNCH%a.isi-venera.arpa@ALMSA-1.ARPA>

 The answer to your question is not easy.  I could easily work
up a scenario that would bog down -- like using the Ether
for remote file sharing (ala diskless workstations).  But if every host
is just using the Ether for mail, some random FTPs and some Telnetting
then it could work well.  The real question is what are those PC
users doing?  Using them as terminal emulators to get to the VAXen?  
Or as real hosts and only sending mail and spreadsheets around?

Anyway,  You should contact Charles Hedrick at Rutgers (Hedrick@Rutgers)
as he has a huge installation and knows a lot.
Can you give me your US Mail address so I can send you a brochure on
the upcoming TCP/IP Interoperability conference?  (March 87).

-------------------------------------

From: Dick Karpinski <dick@ccb.ucsf.edu>

No problem.  Each pair of nodes can use about 1 megabit/sec but unless
they're gonna really pass a bunch of data around, I'd expect to see
loadings like 2% and 4% for such a small net.  On baseband you get to
use like 70-80% before it degrades much.  It's 2-4 times worse on the
5 megabit ethernet on broadband, but you still have a safety factor 
of 4-8 before serious degradation.

But why use broadband at all??  The connections are more expensive and
less speedy and bigger too.  Buffered repeaters and filtered repeaters
like the DEC LANBridge handle length problems (and traffic levels too
with the filtering) at costs like $4k and $7K respectively.

We have about 30 VAXen and Suns etc and just about can't see the traffic.

------------------------------------

From: Bill Nowicki <nowicki@sun.com>

	I am in the process of establishing a TCP/IP network that will
	run on Broadband ethernet.

In a way, "Broadband ethernet" is an oxymoron. Ethernet is
baseband.  There are several companies that build systems that are
compatible with Ethernet transceiver specs, but use broadband
signalling instead of baseband signalling.  Although this is an analog
issue that is mysterious to software types like me, the fact
that you modulate and demodulate should not help collision problems, so
you have the usual Ethernet length restrictions.  Of course you can
exceed the length restrictions, with possible collision problems.

	The prototype installation will consist of 4 mainframe
	computers (VAX 780) and 8 or 16 users using IBM-PC clones.

Your definition of "mainframe" is interesting, since the workstation on
my desk is twice as fast, and is the middle of our line.  At any
rate, we have many Ethernets with up to about 100 Sun-3 machines.  It is
interesting that bandwidth is not the first limitation.  The main
reason you don't want more than about 100 machines is that one faulty
machine can bring the whole network down.  The probability that someone
shorts the cable, or starts to continuously broadcast, or at least has
bad collision detection circuitry, becomes pretty close to one with
over 100 nodes.

Of course this might be because Ethernet networks just evolve, while
broadband networks are usually "planned".  We make each floor of each
building its own Ethernet, with additional Ethernets for labs.  You can
make a Sun into a gateway just by sliding in another Ethernet
controller.  It also helps to use transceiver multiplexor boxes such as
the ones made by TCL, to reduce the number of actual taps, (less likely
to short the cable). So you probably can put 400 PCs onto a single net,
but do you want to?


--------------------------------------------------------------------
///////////////////////////////////////////////////////////////////
--------------------------------------------------------------------

-----------[000016][next][prev][last][first]----------------------------------------------------
Date:      8 Dec 1986 10:14-EST
From:      CLYNN@g.bbn.com
To:        craig@loki.bbn.com
Cc:        tcp-ip%sri-nic.arpa@SH.CS.NET
Subject:   Re: 4.2/4.3 TCP and long RTTs
Craig,
	I would suggest that you try limiting unacknowledged data on the TCP
connection to an amount which will not cause fragmentation over the paths
that you are using.  I suspect that the problem you are seeing is that
packets are being generated which require fragmentation.  The usual
fragmentation algorithm dumps n consecutive packets into the net; if
any of the n get lost, reassembly cannot occur and retransmission is
required.  It was formerly the case that some interfaces could not receive
back to back packets from the net; it may still be; if so, a fragmented
packet will "never" get through.  Also, I suspect that the IP sequence
number in retransmitted TCP packets is different from the original;
consequently, fragments from different retransmissions cannot be recombined.
Does your RDP implementation specify the IP sequence number?  Can you tell
if your host is receiving fragments and having to discard them?  Do you
get any ICMP fragment reassembly time exceeded messages?  Can you find out
what the initial time-to-live value is for the TCP connection and the RDP?
Let us know what you find.

Charlie
-----------[000017][next][prev][last][first]----------------------------------------------------
Date:      Mon, 8-Dec-86 11:57:21 EST
From:      braden@ISI.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re:  4.2/4.3 TCP and long RTTs


Craig:

The amount of your hair pulling must be small compared to the time integral
of hair pulled by our UCL friends over the years.  Quite simply, their
conclusion was that most TCP implementations have design problems
that make them behave poorly over paths which have very long delay and
moderate to high loss.  SATNET sometimes (often?) exhibits that behaviour.
Recently, the ARPANET+core_gateway system has also exhibited that 
behaviour, and many TCP's have not been up to it... lots of broken
connections, etc.

I suggest that the cause of this situation is a performance/robustness
tradeoff inherent to TCP implementations.  Most of the
currently-available TCPs have been implemented and tested in an LAN
environment, to provide optimal performance in a low-delay, low-error
situation.  On the other hand, when we wrote the original experimental
implementations of TCP, we found the little beasties to be amazingly
robust; they would tenaciously hold on for minutes (or hours!)
retransmitting until a path came back, and would get the data through in
spite of terrible bugs.  But we were writing and testing them for
equally-experimental gateway implementations and frequently testing to
UCL, and did not demand high throughput or low delay.

It would certainly be interesting to understand exactly how these TCPs
have failed. I suspect it is a combination of a Zhang-catastrophe (RTT
measurement diverging towards infinity due to high loss rate) with an
implementation-imposed upper bound on retransmission time before the
connection breaks.  On the other hand, the answer may be that selective
retransmission is really absolutely essential to deal with the
long-delay, lossy situation.  I would like to get someone interested in
running some experiments on this (maybe you just did??) Would it be
possible for you to disable just the selective retransmission feature of
RDP and try again?

Bob Braden 

 

-----------[000018][next][prev][last][first]----------------------------------------------------
Date:      Mon, 8-Dec-86 12:00:16 EST
From:      lcrosby@ALMSA-1.ARPA.UUCP
To:        mod.protocols.tcp-ip
Subject:   REPLYS TO TCP/IP - ETHERNET QUERY


Here is the consolidation of the meaningful replies to our query to
the TCP/IP mailing list.

(And before anyone asks..Mr. Savacool originated the query,
I was the 'pipeline' since I am on the TCP-IP mailing list and 
Mr. Savacool is not.)

Thank you all,

Linda J. Crosby
lcrosby@almsa-1

------------------------------------------

The original question was :

>     I am in the process of establishing a TCP/IP network that will
>run on Broadband ethernet.  The prototype installation will consist of
>4 mainframe computers (VAX 780) and 8 or 16 users using IBM-PC clones.  The
>highwater mark could be as many as 8 mainframe computers and 400 IBM-PC
>clone users.
>
>     I would like to know if 400 users, as described above, running the
>standard suite of DOD protocols (TCP/IP, TELNET, FTP, SMTP) can be
>supported on a single ethernet ?  How many connections would be
>reasonable before the network begins to degrade ?  Is anyone familiar
>with a broadband ethernet, similar to what we are proposing, that is
>configured with 400 connections ?  Is this reasonable ?
>
>     I would like to hear from anyone who has covered this ground
>before.  A successful installation elsewhere would be a big
>confidence builder.

     I would like to thank all those who were kind enough to answer.
The replies ranged from several people who felt compelled to point out
that I didn't know what I was talking about when I talked about
ethernet (broadband or baseband) to many people who shared some genuine
insight about their own experiences.

     To those who still think that ethernet is only baseband I suggest
you contact ChipCom Corp,  SYTEK,  DEC  or any of the other vendors now
selling the ethernet on broadband products that were approved by the
IEEE 802.3 committee on September 19, 1985.

     Why do I want to do ethernet on broadband and not just regular
baseband ethernet ?  Simple: we have a large investment in a
broadband cabling plant and it is not reasonable for us to lay new
cable; it is much more reasonable for us to hang transceivers off of
our existing cable to provide the same function.

     The following are the messages that I found addressed my original
question.  Once again thanks for the many replies.  We are going to do
some of this soon (government procurements take at least 6
months)  so I will have some real world experience within the year.

Bob Savacool
cool@almsa-1

[   The opinions expressed herein are my own.  I have no affiliation
with any company or any products mentioned. ]

--------------------------------------------------------------------
///////////////////////////////////////////////////////////////////
--------------------------------------------------------------------

Resent From: Cpt Brian Boyter  ut-ngp!boyter <boyter@ngp.utexas.edu>
From:  BOWDENPE%VTVM1.BITNET@wiscvm.wisc.edu

We have several large departmental Ethernets which we would like to
interconnect.  At the same time, we would like to provide Ethernet
capability throughout our campus.  One option which we are exploring
is to run Ethernet on our existing broadband cable system, which currently
supports video and Sytek LocalNet 20.
Because of the size of the resulting Ethernet, and because multiple protocols
will be used (TCP/IP, Lat-11, DDCMP), we believe we will also need to have
some bridges placed at critical locations.  It appears that there are several
broadband Ethernet, and Ethernet bridge products on the market.  Among the
vendors are Chipcom, Ungermann Bass, Applitek, DEC, and CMC.

I'd be interested in hearing other folks' experiences with these products.
Particularly helpful would be any recommendations or "gotchas" in configuring
such a network.

-------------------------------

From: John Lekashman <lekash@AMES-NASB.ARPA>

It's somewhat reasonable.  It all depends on what these users are going
to be doing.  If you intend to bring up remote file servers on vaxes
for pc's, and some sort of network file system, then you will probably
lose when the count of PC's gets above about 20.  If you intend to
use the pc's as terminals and login to the vaxes for random computational
needs, then the count can probably go into several hundred.  
This is also true if the primary traffic is going to mail between
users.

One approach that you can use that is extensible upward for a long time
is to properly subnet into groups.  This does mean some additional
hardware, and choosing the correct software.  For example, you could
group 25 to 50 pc's that are likely to have traffic between them
on one cable, along with an associated larger machine (incidentally,
a vax is only a mini, or at best a 'super-mini') and feed this
into a packet switch on the same cable to another cable that N other
such things exist on.  (In your case N becomes 8)  In this way,
high traffic flows along each of the separate ethernets, and lower
density of flow from group to group.  Vaxes with appropriate 
software, (eg 4.3 unix from berkeley is one such OS that we use here
for such a purpose) can function as a subnet gateway.  If you need
it, I can point you at several vendors who will sell packet switches,
about 10K each.  These can switch at about 3 megabits currently,
with 1000 byte packets, so it's not a terrible performance loss.

----------------------------------

From: Bob Napier <rwn@ORNL-MSR.ARPA>

We are assisting Lowry AFB (Denver) in evaluating/installing a broadband
network of up to 500 nodes for Office Automation.

I will be happy to provide you information on our experiences, although we
are just awaiting bids to  a RFP.

-----------------------------------

From: Charles Hedrick <hedrick%topaz.rutgers.arpa@ALMSA-1.ARPA>

We have 3 DEC-20's, 3 Pyramids, and a 785 on one Ethernet.  I don't
think you will have any problem.  It's real hard to saturate
an Ethernet, unless you are using diskless machines, which do their
paging over the network.

-----------------------------------

From:           Jeffrey C Honig <$JCH%CLVM.BITNET@wiscvm.wisc.edu>

I'm planning on doing something similar but, at least initially, on a
smaller scale.  I'm planning on making extensive use of LANBRIDGE 100's
to separate traffic into groups with a backbone connecting them.  That
could increase the traffic you could handle.

I'm planning on using Chipcom modems on the broadband; how about you?
I've researched the matter fairly carefully and have a paper that
describes my research and plans.  I can mail you a copy if you are
interested.

-----------------------------------

From: Dan Lynch <LYNCH%a.isi-venera.arpa@ALMSA-1.ARPA>

 The answer to your question is not easy.  I could easily work
up a scenario that would bog down -- like using the Ether
for remote file sharing (ala diskless workstations).  But if every host
is just using the Ether for mail, some random FTPs and some Telnetting
then it could work well.  The real question is what are those PC
users doing?  Using them as terminal emulators to get to the VAXen?  
Or as real hosts and only sending mail and spreadsheets around?

Anyway,  You should contact Charles Hedrick at Rutgers (Hedrick@Rutgers)
as he has a huge installation and knows a lot.
Can you give me your US Mail address so I can send you a brochure on
the upcoming TCP/IP Interoperability conference?  (March 87).

-------------------------------------

From: Dick Karpinski <dick@ccb.ucsf.edu>

No problem.  Each pair of nodes can use about 1 megabit/sec but unless
they're gonna really pass a bunch of data around, I'd expect to see
loadings like 2% and 4% for such a small net.  On baseband you get to
use like 70-80% before it degrades much.  It's 2-4 times worse on the
5 megabit ethernet on broadband, but you still have a safety factor 
of 4-8 before serious degradation.

But why use broadband at all??  The connections are more expensive and
less speedy and bigger too.  Buffered repeaters and filtered repeaters
like the DEC LANBridge handle length problems (and traffic levels too
with the filtering) at costs like $4k and $7K respectively.

We have about 30 VAXen and Suns etc and just about can't see the traffic.

------------------------------------

From: Bill Nowicki <nowicki@sun.com>

	I am in the process of establishing a TCP/IP network that will
	run on Broadband ethernet.

In a way, "Broadband ethernet" is an oxymoron. Ethernet is
baseband.  There are several companies that build systems that are
compatible with Ethernet transceiver specs, but use broadband
signalling instead of baseband signalling.  Although this is an analog
issue that is mysterious to software types like me, the fact
that you modulate and demodulate should not help collision problems, so
you have the usual Ethernet length restrictions.  Of course you can
exceed the length restrictions, with possible collision problems.

	The prototype installation will consist of 4 mainframe
	computers (VAX 780) and 8 or 16 users using IBM-PC clones.

Your definition of "mainframe" is interesting, since the workstation on
my desk is twice as fast, and is the middle of our line.  At any
rate, we have many Ethernets with up to about 100 Sun-3 machines.  It is
interesting that bandwidth is not the first limitation.  The main
reason you don't want more than about 100 machines is that one faulty
machine can bring the whole network down.  The probability that someone
shorts the cable, or starts to continuously broadcast, or at least has
bad collision detection circuitry, becomes pretty close to one with
over 100 nodes.

Of course this might be because Ethernet networks just evolve, while
broadband networks are usually "planned".  We make each floor of each
building its own Ethernet, with additional Ethernets for labs.  You can
make a Sun into a gateway just by sliding in another Ethernet
controller.  It also helps to use transceiver multiplexor boxes such as
the ones made by TCL, to reduce the number of actual taps (less likely
to short the cable). So you probably can put 400 PCs onto a single net,
but do you want to?


--------------------------------------------------------------------
///////////////////////////////////////////////////////////////////
--------------------------------------------------------------------

-----------[000019][next][prev][last][first]----------------------------------------------------
Date:      Mon, 8-Dec-86 13:15:31 EST
From:      karels%okeeffe@UCBVAX.BERKELEY.EDU (Mike Karels)
To:        mod.protocols.tcp-ip
Subject:   Re: 4.2/4.3 TCP and long RTTs

Bob,
The timeout for TCP on 4.2/3 is rather longer than 30 sec.
The actual time depends on the round-trip time, as the limit
is on the number of retransmissions.  On 4.3, the timeout
is at least 108 sec. with short RTT's.  The limit on 4.2
was nearer 45 sec.  As Van Jacobson says in his message,
the keepalive time will have to be adjusted for long RTT's
as well, but I doubt that Craig is using the keepalive timer.
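The distinction Mike draws — a limit on the retransmission count rather than on wall-clock time — can be sketched numerically. In the sketch below, the constants (an initial timeout of twice the RTT, doubling backoff, a cap, a fixed retry budget) are illustrative assumptions, not the actual 4.2 or 4.3 values:

```c
#include <assert.h>

/* Illustrative sketch: when the limit is a retransmission COUNT, the
 * wall-clock time before the connection drops scales with the RTT.
 * All constants here (initial timeout of 2*rtt, doubling backoff,
 * a cap, a fixed retry budget) are assumptions for illustration,
 * not the actual 4.2/4.3 values. */
static double time_to_drop(double rtt, int max_rexmt, double rto_cap)
{
    double rto = 2.0 * rtt;         /* first timeout ~ TCP_BETA * srtt */
    double total = 0.0;

    for (int i = 0; i < max_rexmt; i++) {
        total += rto;               /* wait out this retransmission */
        rto *= 2.0;                 /* exponential backoff */
        if (rto > rto_cap)
            rto = rto_cap;
    }
    return total;
}
```

With the same retry budget, a half-second RTT gives up in a handful of seconds while a ten-second RTT holds on for minutes — which is why any quoted timeout figure only applies for short RTTs.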

How can the TCP "just" inform the application about the problem?
Unless there's a control channel to the application that allows
the passage of status data, a send call must return an error.
After that error, the application can't tell how much of the data,
if any, was transmitted.  "Reliable byte stream with possible gaps
in case of error" isn't very satisfying.

		Mike

-----------[000020][next][prev][last][first]----------------------------------------------------
Date:      Mon, 8-Dec-86 13:32:44 EST
From:      walsh@HARVARD.HARVARD.EDU (Bob Walsh)
To:        mod.protocols.tcp-ip
Subject:   Re: 4.2/4.3 TCP and long RTTs

Mike,
	How can the TCP "just" inform the application about the problem?
	Unless there's a control channel to the application that allows
	the passage of status data, a send call must return an error.
	After that error, the application can't tell how much of the data,
	if any, was transmitted.  "Reliable byte stream with possible gaps
	in case of error" isn't very satisfying.

Informing the application that the networking system is having trouble
getting acknowledgements does not mean that the networking system has
given up on sending that data.  My intent was to point out that such a
decision should be left up to the application, which may in turn defer
the decision to the user.  This was one of the qualities of the BBN
TCP/IP software, as you know.  It is one of the reasons Bob Gilligan used
the BBN software for his demos.

Whether it is 30, 45 or 108 seconds doesn't matter.  What does matter is
that Craig is preserving the RDP connection for a longer time under such
circumstances and therefore is less likely to see the connection fail.

There is also the point of extended acknowledgements.

Bob Walsh

-----------[000022][next][prev][last][first]----------------------------------------------------
Date:      Mon, 8-Dec-86 13:44:47 EST
From:      van@LBL-CSAM.ARPA.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re: 4.2/4.3 TCP and long RTTs

I've been told that my 6th & 7th points (4.3bsd retransmit timers
need some work) were incomprehensible.  That's what you get
when you reply to messages at 3am Sunday morning.  Since the
retransmit timer behavior results in the biggest performance
loss (the other problems affect congestion & stability more than
performance), I'll take a crack at explaining it better.

Attached is a picture of the problem.  It is taken directly from
a trace but the window size has been reduced from 8*MSS to 4*MSS
to simplify the drawing.  Time runs down the page.  The time axis
has tick marks at multiples of the round trip time, R.  The sender
is on the left, the receiver on the right.  Seven segments are
sent, labeled A through G.  Two segments, B and D, get lost or 
damaged in transit.  A lower case letter is used for a receiver's
ack (e.g., "g" is the ack for all bytes up to and including the
last byte of segment "G").  A list of all the segments successfully
received so far is in square brackets at the point where each
ack is generated.  Holes in the sequence space are indicated by "-".

All the traffic goes one direction (this was an ftp).  4.3 almost
always sends MSS byte segments and all these were of size MSS (512B).
Because of the 4.3 delayed ack code, the receiver almost always
reports a full size window (4KB in 4.3, 2KB in this example) in an
ack.  All these acks report a 4 MSS (2KB) window.

All sends are timed.  The retransmit timer is set to 2 times the
smoothed round trip time (TCP_BETA * t_srtt).  The timer is set
on each ack that's not a duplicate of a previous ack (i.e., that
changes the "sent but unacknowledged" pointer, snd_una).  If the
timer times out, the segment starting at snd_una is retransmitted
and the timer is restarted at 2*srtt.  Exactly one segment is
retransmitted.  Periodic retransmissions of that segment continue
until it is acked.  When the segment is acked, the retransmit
timer is set to 2*srtt and "normal" behavior resumes (see rfc793
if you're not sure what normal behavior is). 
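The timer rules in the paragraph above can be compressed into a model. This is an illustrative sketch of the behavior described, not the actual 4.3bsd kernel code; only the names snd_una, srtt, and TCP_BETA come from the discussion:

```c
/* Compressed model of the retransmit-timer behavior described above.
 * This is NOT the 4.3bsd kernel code; only the names snd_una, srtt,
 * and TCP_BETA are taken from the discussion. */
#define TCP_BETA 2.0

struct tcb {
    unsigned snd_una;           /* oldest unacknowledged sequence number */
    double   srtt;              /* smoothed round-trip time */
    double   rexmt_deadline;    /* when the retransmit timer fires */
};

/* An ack that advances snd_una restarts the timer from "now" -- so
 * the deadline is unrelated to when the oldest unacked segment was
 * originally sent.  Duplicate acks are discarded. */
static void ack_in(struct tcb *tp, unsigned ack, double now)
{
    if (ack > tp->snd_una) {
        tp->snd_una = ack;
        tp->rexmt_deadline = now + TCP_BETA * tp->srtt;
    }
}

/* On timeout: retransmit exactly one segment (the one at snd_una)
 * and restart the timer. */
static unsigned timer_fires(struct tcb *tp, double now)
{
    tp->rexmt_deadline = now + TCP_BETA * tp->srtt;
    return tp->snd_una;
}
```

Replaying the trace below against this model shows the problem: the ack of A arriving at 1R pushes the deadline out to 3R, even though B went out near time 0 — hence the idle gap from 2R to 3R in the figure.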

	  0-| A				(set timer to 2R, send enough
	    | B\			 packets to fill window (4))
	    | C\\
	    | D\\\
	    |  \\*\
	    |   *\ a [A]		(ack A)
	    |     X
	    |    / a [A - C]		(save C but can only ack through A)
	    |   / /
	    |  / /
	 1R-| E /			(A ack received, set timer to 3R,
	    |  \			 ack opens window by 1 so send E)
	    | - \			(duplicate A ack discarded)
	    |    \
	    |     \
	    |      a [A - C - E]	(save E but can only ack through A)
	    |     /
	    |    /
	    |   /
	    |  /
	 2R-| -				(duplicate A ack discarded)
	    | 
	    |
	    |
	    |
	    |
	    |
	    |
	    |
	    |
	 3R-| B				(timer goes off, rexmit first
	    |  \			 unacked segment (B), timer set to 5R)
	    |   \
	    |    \
	    |     \
	    |      c [A B C - E]	("B" fills in sequence space up
	    |     /			 through "C", ack C)
	    |    /
	    |   /
	    |  /
	 4R-| F				(ack of C opens window for 2 more
	    | G\			 segments, timer set to 6R)
	    |  \\
	    |   \\
	    |    \\
	    |     \c [A B C - E F]	(missing D, can only ack through C)
	    |     /c [A B C - E F G]
	    |    //
	    |   //
	    |  //
	 5R-| -/			(duplicate acks for C discarded)
	    | -
	    |
	    |
	    |
	    |
	    |
	    |
	    |
	    |
	 6R-| D				(timer goes off, rexmit first
	    |  \			 unacked segment (D), timer set 8R)
	    |   \
	    |    \
	    |     \
	    |      g [A B C D E F G]	(sequence space complete, ack G)
	    |     /
	    |    /
	    |   /
	    |  /
	 7R-| 


There are two problems here: the gap between 2R & 3R and the fact
that we don't send D at 4R.  The idle time from 2-3 (and from
5-6) happens because our timer is always 2*R from the last useful
ack and is essentially unrelated to when a segment is originally
sent (The code wasn't intended to work this way and on low delay
circuits it works correctly.) We should really be retransmitting
B 2*R from its first transmission (i.e., 1 line after the 2R tick
mark).  It's not too hard to show analytically that this (the
current 4.3 algorithm) "feeds forward" (e.g., the recovery for D
is moved later in time and is more likely to conflict with F,G
recovery) which is why throughput degrades much faster than
linearly with increasing loss rate. 

You can view the late transmission of D two ways.  It could be
another example of the timer problem.  I.e., we should have
retransmitted D 3 lines after the 2R tick.  We held off sending
it then because we thought the network might be congested and we
wanted to send a minimum amount of data until we got back an
indication (the ack) that the congestion had cleared up.  But we
certainly should have sent D at 4R when we got the "c" ack. 

Or you can say that when we get the "c" ack after the
retransmission of B, no packets have been injected into the
network for 2*R.  The ack tells you pretty clearly that the
receiver is missing D. (Either point of view will do the "right"
thing in this case but treating a retransmit ack specially buys
you a bit in one other case). 

If the two problems are corrected, the total time drops from 7R
to 4R (2R is the total time if no packets are lost).  If we don't
do the send-1-packet-on-rexmit congestion control, the total time
drops to 3R, the minimum possible if one or more packets are
dropped. 
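The two timer policies can be contrasted directly. In this hypothetical sketch, rexmt_deadline_43 models the current behavior (timer runs from the last useful ack) and rexmt_deadline_fixed models the suggested one (timer runs from the oldest unacked segment's first transmission); both function names are invented for illustration:

```c
/* Hypothetical contrast of the two timer policies (function names are
 * invented for illustration; neither is actual 4.3bsd code). */
#define TCP_BETA 2.0

/* Current 4.3 behavior: deadline runs from the last useful ack. */
static double rexmt_deadline_43(double last_ack_time, double srtt)
{
    return last_ack_time + TCP_BETA * srtt;
}

/* Suggested behavior: deadline runs from the oldest unacked
 * segment's first transmission. */
static double rexmt_deadline_fixed(double first_sent_time, double srtt)
{
    return first_sent_time + TCP_BETA * srtt;
}
```

With srtt = R, an ack arriving at 1R puts the 4.3 deadline at 3R, while timing from B's original send near time 0 would retransmit at 2R — one line after the 2R tick mark in the figure.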

Also, this partly illustrates why I thought Craig's measurements
demonstrated a problem in TCP rather than the superiority of RDP.
Even with EACKs, it takes RDP 3R to send the data if the same two
packets are lost, exactly the same time it takes TCP.  I think I
can show that EACKs aren't a big win until the drop rate is >50%,
if TCP is working as well as it can (that's not to say RDP isn't
a win for other reasons). 

  - Van

-----------[000023][next][prev][last][first]----------------------------------------------------
Date:      Mon, 8-Dec-86 20:50:59 EST
From:      BUDDENBERGRA@A.ISI.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   DDN implementation

Pardon me if I should post this to another board, but...
What is the status of DDN implementation as replacement for Autodin?
I'm concerned because we (USCG) are about to buy into a bunch
of ports that are called 'Autodin' up front.  Are they likely to
be DDN under the hood?

Where do we find this information on a continuing and updated basis?
tks
-------

-----------[000025][next][prev][last][first]----------------------------------------------------
Date:      Tue, 9-Dec-86 17:15:16 EST
From:      roden%husc4@HARVARD.HARVARD.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   (none)


	We have installed a product called FTP-VMS, a TCP/IP Ethernet file
transfer package from a company called Process Software in Amherst, MA
(413) 549-6994, on a trial basis.

	According to the product literature, it

		o  operates with DEC Ethernet/802.3 Hardware
		o  implements TCP/IP FTP Networking Standard
		o  shares Ethernet Hardware with DECnet
		o  works with TELNET-VMS (another of their products)

	I am interested to know if anyone else uses this product, and 
if so, what are their experiences?   Please send me mail directly. 

	Thank you.

				Peter Roden, VMS Systems Manager
				Harvard University Science Center

Bitnet:    roden@harvsc3		
UUCP:      ihnp4!wjhiz!HUSC3!roden
internet:  roden%husc4@harvard.harvard.edu
voice:     (617) 495-1270		

-----------[000027][next][prev][last][first]----------------------------------------------------
Date:      Wed, 10-Dec-86 17:12:10 EST
From:      Charles_Russell_Severance@UM.CC.UMICH.EDU
To:        mod.protocols.tcp-ip
Subject:   (none)

We at Michigan State University are implementing a campus
wide Ethernet over broadband cable.  I would like to ask two
questions:
 
1.   We have installed a Fibronics K-320 on our IBM mainframe
     to do TELNET and FTP.   We can't seem to find a satisfactory
     program to run in our PC compatibles with 3COM cards.  We
     are looking for people who are using the K-320 with 3COM
     cards doing 3270 emulation.  We would like to find sites
     who are satisfied with a TELNET program on the PC when
     communicating with a K-320.
 
2.   We are looking for possible network analyzers to use when
     debugging network problems.  The only one that we currently
     have any information on is one from EXCELAN which is $9,500
     for a card and some PC software.  We have access to both
     a SUN and AT-compatible equipment.
 
BITNET:     20095CRS@MSU
PHONE:      517-353-2984
ADDRESS:    Charles Severance
            Michigan State University
            301 Computer Center
            East Lansing, MI 48824
 
Thank you in advance.

-----------[000028][next][prev][last][first]----------------------------------------------------
Date:      Fri, 12-Dec-86 00:07:53 EST
From:      gds@EDDIE.MIT.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re: 4.2/4.3 TCP and long RTTs

> Mike,
> 	How can the TCP "just" inform the application about the problem?
> 	Unless there's a control channel to the application that allows
> 	the passage of status data, a send call must return an error.
> 	After that error, the application can't tell how much of the data,
> 	if any, was transmitted.  "Reliable byte stream with possible gaps
> 	in case of error" isn't very satisfying.
> 
> Informing the application that the networking system is having trouble
> getting acknowledgements does not mean that the networking system has
> given up on sending that data.  My intent was to point out that such a
> decision should be left up to the application, which may in turn defer
> the decision to the user.  This was one of the qualities of the BBN
> TCP/IP software, as you know.  It is one of the reasons Bob Gilligan used
> the BBN software for his demos.
> 

There are some Unix applications, like the 4.2 version of telnet,
which do a close() if they get something like ETIMEDOUT, which can
occur if a TCP timer has gone off.  In the 4.2 BBN TCP/IP, the timer
going off does not cause the connection to be closed.  Instead, a
routine called advise_user lets the higher layers know about the
problem and lets them deal with it.  I was rather surprised to see
that 4.2 telnet was giving up when the actual connection was not
remotely closed.  With a quick patch to telnet to prevent it from
closing for errors when a TCP timer has gone off, you can maintain
telnet connections for hours.  The reconstitution protocols required
that connections remain open for long periods of time during network
dynamics.

I haven't looked at 4.3 telnet or TCP to see if this is fixed or
handled differently.
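The quick patch described above amounts to classifying send errors as advisory or fatal. A hedged sketch of that policy (illustrative only; not the actual 4.2 telnet patch):

```c
#include <errno.h>

/* Illustrative policy sketch (not the actual 4.2 telnet patch):
 * treat a TCP-timer error as advisory rather than fatal, so the
 * connection is only torn down when the peer is really gone. */
static int should_close(int err)
{
    switch (err) {
    case ETIMEDOUT:         /* retransmit timer gave up; the path may
                             * recover, so keep the connection open */
        return 0;
    default:                /* e.g. ECONNRESET: remote end is gone */
        return 1;
    }
}
```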

--gregbo

-----------[000029][next][prev][last][first]----------------------------------------------------
Date:      Fri, 12-Dec-86 02:21:18 EST
From:      rhc@hplb.CSNET.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re:  4.2/4.3 TCP and long RTTs

I would like to reinforce Bob's concern at the Internet layer. As an
ex-UCL person I am also concerned at the current tradeoff in TCP
implementations towards LAN-specific high performance and away from
Internet-general robustness.

The maximum packet size on SATNET is 256 bytes. If you are sending
large (>576) TCP packets then the gateway to the ARPANET will
fragment, then the ARPANET/SATNET gateway will fragment each of these
again. The result is a large number of fragments bursting onto SATNET,
pinging off Goonhilly (why do you have to use the busiest European
earth station anyway?) then each fragment tries to get back across the
ARPANET. At the SATNET/ARPANET gateway we have the ARPANET flow
control problem. It is quite likely that every packet sent from the
host has turned into at least 4 and possibly 6 packets for the return!
Have you considered your reassembly timeout?
I have seen packets hang around in these gateways for 16 minutes (yes,
that's right, MINUTES).
After a little while the gateway queues will fill up and the TCP will
time out because the IP reassembly is not able to get enough fragments
to build enough packets to keep the TCP happy.
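The packet multiplication described above follows from IP fragmentation arithmetic. A sketch, assuming a 20-byte IP header and the RFC 791 rule that every fragment except the last carries a multiple of 8 data bytes:

```c
/* Fragment count for an IP datagram crossing a smaller-MTU net.
 * Assumes a 20-byte IP header and the RFC 791 rule that each
 * non-final fragment carries a multiple of 8 data bytes. */
static int fragments(int datagram_len, int mtu)
{
    int data     = datagram_len - 20;       /* payload bytes to carry */
    int per_frag = ((mtu - 20) / 8) * 8;    /* data bytes per fragment */

    return (data + per_frag - 1) / per_frag;   /* ceiling division */
}
```

A 576-byte datagram becomes 3 fragments over SATNET's 256-byte maximum; a 1024-byte one becomes 5. If the ARPANET gateway has already split a large packet, each piece fragments again, which is how one host packet can turn into the 4 to 6 packets cited above.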
Now if the TCP was able to use the same IP packet number for all
retransmissions then the situation will improve - and I suspect that
this is where RDP is winning!

Just because the TCP connection is failing does not mean that the TCP
layer is the only one that is broken.

Happy satellites,
Robert.

-----------[000030][next][prev][last][first]----------------------------------------------------
Date:      Fri, 12-Dec-86 14:57:00 EST
From:      stanonik@NPRDC.ARPA.UUCP
To:        mod.protocols.tcp-ip
Subject:   4.3bsd/telnet break

We've had some users, who connect to us via tacs,
complain that programs now mysteriously crash.  One hunch
is that we're receiving telnet breaks (iac brk), which the
4.3bsd telnet server turns into the dump core signal (SIGQUIT).
So, the question is, "Can a telnet break be sent from a tac
and, if so, how?".  I haven't found much information on tacs
(any suggestions, I've tried rfcs and nic tacnews), but tacs
do use "@" as an escape character (which has caused grief
when users try to send mail to say, user@colorado.edu).  Is
there a tac escape sequence for sending the telnet break;
eg, @b?

Thanks,

Ron Stanonik
stanonik@nprdc.arpa

-----------[000031][next][prev][last][first]----------------------------------------------------
Date:      Fri, 12-Dec-86 15:40:35 EST
From:      ahill@CC7.BBN.COM.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re: 4.3bsd/telnet break

Ron,
	Yes.  There is a TAC escape sequence that will cause 
breaks to be transmitted to a host.  The sequence is @s b.  There
are probably others but that one works.  The default command intercept
is @ and it can be transmitted through the TAC by doubling the character,
i.e., two @'s.  If there is need, you can change the intercept.

Alan

-----------[000033][next][prev][last][first]----------------------------------------------------
Date:      Fri, 12-Dec-86 17:00:00 EST
From:      KNapier@DDN2.UUCP.UUCP
To:        mod.protocols.tcp-ip
Subject:   Need info on PC/IP

I am placing this request for a co-worker of mine.

He would like to know where he can get information on PC/IP.  The
type of information would be of a general nature. After a review of
the available information a more specific request may be made.

Thanks in advance.

//Ken//KNapier@DDN2

-----------[000035][next][prev][last][first]----------------------------------------------------
Date:      12 Dec 1986 17:49-EST
From:      CERF@A.ISI.EDU
To:        stanonik@NPRDC.ARPA
Cc:        tcp-ip@SRI-NIC.ARPA
Subject:   Re: 4.3bsd/telnet break
There is a command of the form @s b for send break. You might
try that to see if you can duplicate the problem at will.

vint
-----------[000036][next][prev][last][first]----------------------------------------------------
Date:      Sat, 13-Dec-86 14:15:29 EST
From:      OLE@SRI-NIC.ARPA (Ole Jorgen Jacobsen)
To:        mod.protocols.tcp-ip
Subject:   Re: Need info on PC/IP


Perhaps it is time once again to remind you all that information about
TCP/IP implementations for a number of different machines can be found
in [SRI-NIC.ARPA]NETINFO:VENDORS-GUIDE.DOC. This file is updated frequently
and a printed version is also available from the NIC.

Ole
-------

-----------[000037][next][prev][last][first]----------------------------------------------------
Date:      Sat, 13-Dec-86 18:22:39 EST
From:      ROMKEY@XX.LCS.MIT.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re: Need info on PC/IP

Anyone who has questions about PC/IP can direct them to me; I can
answer questions about the MIT version, the FTP Software version
and most of the academic spinoffs. 'Course, I might be somewhat
prejudiced in the matter...

John Romkey			FTP Software, Inc.
(617) 864-1711			PO Box 150
UUCP: romkey@mit-vax.UUCP	Kendall Square Branch
ARPA: romkey@xx.lcs.mit.edu	Boston, MA 02142
-------

-----------[000038][next][prev][last][first]----------------------------------------------------
Date:      Sun, 14 Dec 86 11:52:37 -0500
From:      Craig Partridge <craig@loki.bbn.com>
To:        tcp-ip%sri-nic.arpa@SH.CS.NET
Subject:   TCP RTT woes revisited

    This weekend I had time to start processing Van Jacobson's suggested
fixes/modifications.  Things started working very well after the first fix
which made TCP choose better fragment sizes and increased the time to live
for IP fragments.

    The subsequent testing also revealed some interesting results.  (These
are preliminary and subject to reappraisal).

    (1) EACKs appear to make a huge difference in use of the network.
    After seeing signs this was the case, I ran the simple test of
    pushing 50,000 data packets through a software loopback that
    dropped 4% of the packets.
    
    With EACKs there were 1,930 retransmissions, of which 1 received
    packet was a duplicate (note that some of the retransmissions were
    also dropped).

    Without EACKs there were 12,462 retransmissions, of which 9,344
    received packets were duplicates.

    12,462 retransmissions is, of course, bad news, and comes from
    the fact that this RDP sends up to four packets in parallel.
    Typically the four get put into the send queue in the same
    tick of the timer, so when the first gets retransmitted,
    all four do.  The moral seems to be use EACKs even though
    they aren't required for a conforming implementation.

    (2) Lixia Zhang's suggestion that one use the RTT of the SYN to
    compute the initial timeout estimate appears to work very well.

    (3) EACKs may make it possible to all but stomp out RTT feedback
    (those unfortunate cases where a dropped packet leads to an
    RTT = (the number of retries * SRTT) + SRTT being used to compute
    a new SRTT).  I've been experimenting with discarding RTTs for
    out-of-order acks.  This is best explained by example.  If packets 1, 2, 3
    and 4 are sent, and the first ack is an EACK for 3, the implementation
    uses the RTT for 3 to recompute the SRTT, but will discard the RTTs
    for 1 and 2 when they are eventually acked (or EACKed).  The
    argument in favor of this scheme is that the acks for 1 and 2
    probably represent either (a) RTTs for packets that were dropped,
    and thus including them would lead to feedback, or (b) RTTs that reflect
    an earlier (and slower) state of the network (3 was sent after 1 and 2),
    and using them would make the SRTT a less good prediction of the
    RTT of the next packet.  Note that (b) would be more convincing
    if it weren't the case that 1, 2, 3 and 4 were probably sent within
    a few milliseconds of each other.
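
    [The discard rule in (3) can be sketched as follows.  This uses the
    classic smoothed estimator SRTT = a*SRTT + (1-a)*RTT and drops the
    sample for any packet acked after a later-sent packet has already
    been acked; the smoothing gain and data structures are illustrative
    assumptions, not the actual implementation.]

```python
# Sketch of point (3): keep per-packet send times, update SRTT only for
# acks that arrive "in order" (no later-sent packet already acked).
ALPHA = 0.875  # classic smoothing gain; an illustrative choice

class RttEstimator:
    def __init__(self, initial_srtt):
        self.srtt = initial_srtt        # e.g. seeded from the SYN RTT, per (2)
        self.send_time = {}             # seq -> time the packet was sent
        self.highest_acked = -1         # highest sequence number acked so far

    def on_send(self, seq, now):
        self.send_time[seq] = now

    def on_ack(self, seq, now):
        sent = self.send_time.pop(seq, None)
        if sent is None:
            return
        if seq > self.highest_acked:
            # In-order ack: use its RTT sample.
            self.srtt = ALPHA * self.srtt + (1 - ALPHA) * (now - sent)
            self.highest_acked = seq
        # else: packet 1 or 2 acked after an EACK for 3 -- discard the sample.

# Packets 1..4 sent together; EACK for 3 arrives first, then acks for 1 and 2.
est = RttEstimator(initial_srtt=1.0)
for seq in (1, 2, 3, 4):
    est.on_send(seq, now=0.0)
est.on_ack(3, now=1.0)   # in-order sample: used to recompute SRTT
srtt_after_3 = est.srtt
est.on_ack(1, now=5.0)   # late, out of order: sample discarded
est.on_ack(2, now=5.0)   # likewise discarded
assert est.srtt == srtt_after_3
```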

    Watching 5 trial runs of 100 64-byte data packets bounced off Goonhilly
    this algorithm kept the SRTT within the observed range of real RTTs
    (as opposed to RTTs for packets that were dropped and had to be
    retransmitted).

    Using EACKs but taking the RTT for every packet, (again doing 5 trial
    runs) several cases of RTT-feedback were seen.  In one case the SRTT
    soared to ~35 seconds when a few packets were dropped in a short period.
    Since the implementation uses Mills' suggested changes, which make
    lowering the SRTT take longer than raising it, the SRTT took some
    time to recover.

People may be wondering about observed throughput.  How fast does RDP
run vis-a-vis TCP?  That turns out to be very difficult to answer.
Identical tests run in parallel or one right after another give
throughput rates that vary by factors of 2 or more.  As a result it
is difficult to get throughput numbers that demonstrably show differences
which reflect more than random variation.   After running tests for 7
weekends (and millions of packets) I have some theories, but those keep
changing as different tests are run.

Craig

P.S.  Those millions of packets are almost all over a software loopback.
The contribution to network congestion has been small.
-----------[000039][next][prev][last][first]----------------------------------------------------
Date:      Sun, 14-Dec-86 14:28:26 EST
From:      farber@HUEY.UDEL.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re: Need info on PC/IP

There is also a mailing list
pcip@huey.udel.edu devoted to this subject

Dave

-----------[000041][next][prev][last][first]----------------------------------------------------
Date:      Sun, 14-Dec-86 19:39:04 EST
From:      minshall%opal.Berkeley.EDU@UCBVAX.BERKELEY.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   (none)

Regarding 3270 access from a PC, I think that John Romkey at FTP Software, Inc.
(near Boston or MIT or somewhere) is porting tn3270 to run on various
vendor's hardware (probably 3COM included).  Also, Ungermann-Bass is porting
it to run on THEIR smart PC board.  Also, UC Berkeley has a version that runs
on the UB smart board.

Any and all of these SHOULD talk to a whatever-it-is-you-have.

Greg Minshall

-----------[000042][next][prev][last][first]----------------------------------------------------
Date:      Sun 14 Dec 86 23:00:14-EST
From:      Dennis G. Perry <PERRY@VAX.DARPA.MIL>
To:        tcp-ip@SRI-NIC.ARPA
Cc:        perry@VAX.DARPA.MIL
Subject:   [Ken Pogran <pogran@ccq.bbn.com>: Recent ARPANET Performance]
The message below indicates what I am sure some of you have suspected:
the Arpanet has had a relapse.

dennis
                ---------------

Received: from ccq.bbn.com by vax.darpa.mil (4.12/4.7)
	id AA15979; Sun, 14 Dec 86 12:44:18 est
Message-Id: <8612141744.AA15979@vax.darpa.mil>
Date: Sun, 14 Dec 86 12:32:30 EST
From: Ken Pogran <pogran@ccq.bbn.com>
Subject: Recent ARPANET Performance
To: Perry@vax.darpa.mil
Cc: CStein@ccw.bbn.com, McKenzie@j.bbn.com, Mayersohn@alexander.bbn.com,
        FSerr@alexander.bbn.com, BDlugos@ccy.bbn.com, Bartlett@cct.bbn.com,
        pogran@ccq.bbn.com

Dennis,

Over the past few weeks, users may have noticed some degradation
of ARPANET performance over the level attained in early November.
BBN Communications Corporation believes that this change in
network performance correlates with outages in certain key
network lines; in particular, lines between CMU and DCEC and,
most recently and most significantly, TEXAS and BRAGG.  The
effect of these line outages, which are not uncommon events,
demonstrates that the ARPANET is still "on the edge", particularly
where cross-country bandwidth is concerned.

The primary measure that we have of the degradation in network
performance is the increase in congestion- or performance-related
"traps" or exception reports made by the C/30 packet switches in
the network to the Network Operations Center.  We reported on
November 12 that, following several changes made in the network,
the number of traps diminished by an order of magnitude.
Recently, the number of traps has increased substantially,
indicating degraded performance, although the number of traps has
not risen to the extremely high levels seen earlier this fall.

The DDN PMO and BBNCC are taking action to correct the present
line outage problems.

Regards,
 Ken Pogran
 BBN Communications Corporation


The following data is provided by Bob Pyle of the BBNCC
Network Analysis Department:

-----------------------------

From:    rpyle@cc5.bbn.com
Date:    10 Dec 86 13:49:26 EST (Wed)
Subject: Recent ARPANET performance

We can divide time since mid-October into four periods to see the
effect of various trunking problems on ARPANET performance:

  Oct 27 - Nov 6	19.2 kb bottleneck at USC still in effect
  Nov 7 - Nov 24	"Good" period
  Nov 25 - Dec 5	CMU-DCEC line out of service
  Dec 8 - Dec 9		CMU-DCEC and TEXAS-BRAGG lines out of service

Each of these periods is characterized by a more-or-less constant
production of performance-related traps with fairly abrupt transitions
between them (the last period is of course too short as yet to talk
about in statistical terms).  The most numerous trap is the 63 trap,
long wait for all8.  The table below shows the average number of 63
traps per day and the average total number of traps per day on the
ARPANET for the four periods (workdays only, no weekends, no
holidays).

 Time Period |  63 traps   |	total traps
---------------------------------------------
  10/27-11/6 |	72190	   |	101115
	     |		   |
  11/7-11/24 |	 4186	   |	  9954
	     |		   |
  11/25-12/5 |	11990	   |	 23681
	     |		   |
  12/8-12/9  |	34602	   |	 68352

-------
-------
-----------[000043][next][prev][last][first]----------------------------------------------------
Date:      Mon, 15 Dec 86 02:19:25 est
From:      hedrick@topaz.rutgers.edu (Charles Hedrick)
To:        tcp-ip@sri-nic.arpa
Subject:   Arpanet outage
A few days ago, the East Coast vanished.  We talked to NOC about it.
I didn't follow everything they said, so I could be misinterpreting
it.  But what I thought I heard was that the Arpanet had fractured
into a number of small, self-contained pieces.  This was due to a
single physical cable break.  NOC itself was cut off from most of the
rest of the network, as were we (using an IMP in New York City).  One
would have thought that the Arpanet had enough redundancy that a
single cable break would not isolate an IMP, and even if it did, that
it would not do this for both NYC and Boston.  I had hoped that
someone more in the know would give us a more detailed report of what
happened.  Is this going to happen?

-----------[000044][next][prev][last][first]----------------------------------------------------
Date:      Mon 15 Dec 86 05:42:30-EST
From:      Dennis G. Perry <PERRY@VAX.DARPA.MIL>
To:        hedrick@TOPAZ.RUTGERS.EDU
Cc:        tcp-ip@SRI-NIC.ARPA, perry@VAX.DARPA.MIL
Subject:   Re: Arpanet outage
I will follow up on your comment and report back.  Please see my previous
report of yesterday on the Arpanet relapse.  That doesn't seem to explain
what you are talking about.

dennis
-------
-----------[000045][next][prev][last][first]----------------------------------------------------
Date:      Mon, 15 Dec 86 13:47:40 -0500
From:      Craig Partridge <craig@loki.bbn.com>
To:        walsh@harvard.harvard.edu
Cc:        tcp-ip%sri-nic.arpa@SH.CS.NET
Subject:   Re: TCP RTT woes revisited

> Have you thought of using a separate variable to measure the RTT of each
> packet so that you can update your smoothed RTT using the EACKs?

    That's precisely what I'm doing.  Then the out-of-order rule is used
to discard RTTs that seem likely to cause SRTT explosion.

> When I last did RDP work, RDP and TCP were roughly the same speed.  Maybe
> RDP was a bit quicker even in the LAN environment.  The reason RDP did
> not dominate TCP was that the machines I was using were VAXes and the
> RDP checksumming algorithm did not run as fast as it would on a machine
> with a different byte ordering (like the 68K based workstations).

    Certainly the RDP checksum on the VAX is a real problem.  On the
SUN the checksum I use is 40% faster than the TCP checksum;  on the
VAX the checksum is about 3 times *slower* than the TCP checksum. (You
probably wrote a better one, I haven't compared them).  And over a
perfect network, the checksum performance seems to dictate speed.
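
[For reference, the TCP checksum being compared against here is the
standard 16-bit ones'-complement sum, which can be computed without
regard to byte order.  A minimal sketch of that algorithm (RFC 1071
style; this is the TCP/IP checksum, not the RDP checksum under
discussion):]

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement sum of 16-bit words (RFC 1071 style)."""
    if len(data) % 2:               # pad odd-length data with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:              # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# Words 0x0001 and 0xF203 sum to 0xF204; the complement is 0x0DFB.
assert internet_checksum(b"\x00\x01\xf2\x03") == 0x0DFB
# Verifying: a packet that carries its own checksum sums back to zero.
data = b"\x00\x01\xf2\x03"
c = internet_checksum(data)
assert internet_checksum(data + c.to_bytes(2, "big")) == 0
```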

    But once there is any packet loss on the network the data handling
costs seem to become rather insignificant, and the big issue (I believe)
is retransmission mechanisms.   Unfortunately, once the network drops
packets, there seems to be a very wide variation in throughput from
test to test and it gets hard to say anything definitive.  There's also
the problem of, when you get a definitive answer, is it a real difference,
or merely demonstrating an odd quirk of the particular RDP or TCP
implementation?   (I.e. am I asking the right question?) One quickly
develops a healthy respect for TCP.

Craig
-----------[000046][next][prev][last][first]----------------------------------------------------
Date:      Mon, 15-Dec-86 10:46:48 EST
From:      malis@CCS.BBN.COM (Andrew Malis)
To:        mod.protocols.tcp-ip
Subject:   Re: Arpanet outage

Charles,

What happened is as follows:

At 1:11 AM EST on Friday, AT&T suffered a fiber optics cable
break between Newark NJ and White Plains NY.  They happen to have
routed seven ARPANET trunks through that one fiber optics cable.
When the cable was cut, all seven trunks were lost, and the PSNs
in the northeast were cut off from the rest of the network.
Service was restored by AT&T at 12:12.

The MILNET also suffered some trunk outages, but has more
redundancy, so it was not partitioned.

Regards,
Andy Malis

-----------[000048][next][prev][last][first]----------------------------------------------------
Date:      Mon, 15-Dec-86 13:04:49 EST
From:      walsh@HARVARD.HARVARD.EDU (Bob Walsh)
To:        mod.protocols.tcp-ip
Subject:   Re:  TCP RTT woes revisited

Craig,

Have you thought of using a separate variable to measure the RTT of each
packet so that you can update your smoothed RTT using the EACKs?

When I last did RDP work, RDP and TCP were roughly the same speed.  Maybe
RDP was a bit quicker even in the LAN environment.  The reason RDP did
not dominate TCP was that the machines I was using were VAXes and the
RDP checksumming algorithm did not run as fast as it would on a machine
with a different byte ordering (like the 68K based workstations).

bob

-----------[000050][next][prev][last][first]----------------------------------------------------
Date:      Mon, 15-Dec-86 16:27:07 EST
From:      robert@SPAM.ISTC.SRI.COM (Robert Allen)
To:        mod.protocols.tcp-ip
Subject:   Protocol Development on SUN 2 and 3 computers.


Pardon me for posting with a non-TCP/IP related subject, I have
no good excuse...


	I'm wondering if anyone has attempted to develop other
protocols on Sun computers using the "open-architecture".  From
initial inspection of the Sun document "Network Implementation"
it appears that one can provide different protocol routines at
various layers, and make use of the kernel hooks built into the
system, thus providing socket-type interfaces for protocols other
than the currently supported TCP/IP and UDP (I knew I could make
this letter pertinent).

	Specifically, I would like to know: a) anyone has tried
this with other protocols, and if so then which protocols, b) which
layers are supported in this open architecture, and c) what problems
were encountered if any.

	Any comments, questions, pointers, etc. are appreciated.



						Robert Allen
						robert@spam.istc.sri.com
						OR
						robert@sri-spam.ARPA

-----------[000051][next][prev][last][first]----------------------------------------------------
Date:      Mon, 15 Dec 86 19:57:34 pst
From:      John B. Nagle <jbn@glacier.stanford.edu>
To:        TCP-IP@SRI-NIC
Subject:   Regarding extended ACKs under congested conditions

       A proper TCP should, under heavily congested conditions reported
to it via ICMP Source Quench messages, throttle back to one packet outstanding
at a time, and become a stop-and-wait protocol.  At that point, the need for
extended ACKs vanishes.  However, if one is operating over an uncongested
but error-prone link, such as a packet radio link without link-level error
control, something like extended ACKs would not be a bad idea.  

       Clearly fragmenting in the presence of errors introduces terrible
problems.  Some of these problems are alleviated if the IP sequence number
is retained over retransmissions, so that fragments generated from multiple
retransmissions of the same TCP segment can be reassembled.  I still argue,
though, that nothing should send a packet bigger than 576 bytes without
some external reason to know that it won't be fragmented.  The IP standard,
of course, says that all nets must handle 576 byte datagrams.  There's still
some antiquated hardware around that can't, but it's a minuscule percentage
of the net today.

       But none of these matters address the real issue that is killing the
net, which is, of course, the combination of badly-behaved hosts and gateways
that can't defend themselves against overload.  But we've been through this
before, and I've written enough on that subject previously.  Still, it's 
embarrassing that it hasn't been fixed yet.

				John Nagle
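
[The throttling rule above can be sketched as a trivial window clamp: on
a Source Quench the sender drops to one outstanding segment, i.e. becomes
stop-and-wait.  The gradual reopening policy below is an illustrative
assumption; the note itself only calls for the clamp.]

```python
# Sketch: clamp outstanding segments to 1 on an ICMP Source Quench.
class QuenchWindow:
    def __init__(self, max_outstanding=8):
        self.max_outstanding = max_outstanding
        self.allowed = max_outstanding      # segments that may be in flight

    def on_source_quench(self):
        self.allowed = 1                    # become a stop-and-wait protocol

    def on_ack(self):
        if self.allowed < self.max_outstanding:
            self.allowed += 1               # ease back toward the full window

w = QuenchWindow()
assert w.allowed == 8
w.on_source_quench()
assert w.allowed == 1      # stop-and-wait: one packet outstanding at a time
w.on_ack()
assert w.allowed == 2      # recovers gradually as acks come in
```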
-----------[000052][next][prev][last][first]----------------------------------------------------
Date:      Mon, 15-Dec-86 18:07:57 EST
From:      mills@HUEY.UDEL.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re:  TCP RTT woes revisited

Craig and Bob,

Keeping roundtrip-delay samples on a per-packet basis really does help
(the fuzzballs have been doing that for several years), as does initializing
the estimator with the SYN/ACK exchange. Another thing, first pointed out
by Jack Haverty of BBN, is the behavior when the window first opens
after it has previously closed. If the ACK carrying the window update is
lost, performance can lose big. This may be one reason TP-4 uses a different
"active ACK" policy. While at it, consider the receiver policy and when
to generate ACKs (delayed or not). Silly implementations that always send
2-3 ACKs for every received packet might actually win under warmonger
conditions.

Dave
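
[The window-reopen hazard Dave mentions is the classic motivation for a
persist timer: if the ACK carrying the window-open update is lost, the
sender must probe rather than wait forever.  A sketch of that idea; the
probe interval and class shape are arbitrary assumptions, not any
particular implementation.]

```python
# Sketch: persist-timer logic for a closed receive window.  If the ACK
# reopening the window is lost, the sender deadlocks unless it probes.
class Sender:
    def __init__(self):
        self.peer_window = 0        # receiver has closed its window
        self.persist_timer = None

    def on_window_closed(self):
        self.persist_timer = 5.0    # schedule a tiny window probe (arbitrary)

    def on_window_update(self, window):
        self.peer_window = window
        self.persist_timer = None if window > 0 else 5.0

    def probe_due(self):
        # When the timer fires, send a 1-byte segment to elicit a fresh
        # ACK carrying the receiver's current window.
        return self.persist_timer is not None

s = Sender()
s.on_window_closed()
assert s.probe_due()            # window-update ACK lost?  probe, don't hang
s.on_window_update(4096)        # a fresh ACK arrives with an open window
assert not s.probe_due()
```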

-----------[000055][next][prev][last][first]----------------------------------------------------
Date:      Mon, 15-Dec-86 19:38:44 EST
From:      mtasman@CCT.BBN.COM.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re: 4.3 telnet/break


     I believe that some versions of the TAC will transmit an "IAC BREAK" when
a user hits the "BREAK" key.

     This may explain why 4.3 telnet servers occasionally receive "IAC BREAK"
on TAC connections.

-----------[000057][next][prev][last][first]----------------------------------------------------
Date:      Mon, 15-Dec-86 20:39:58 EST
From:      LYNCH@A.ISI.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re: Arpanet outage

Andy,  How on earth does it come to happen that 7 "trunks" are
"routed" through one fiber optics cable?  My idea of a cable 
encompasses both some dedicated bandwidth and some physical
isolation.  How can a network planner ever be sure of "redundancy"
if the providers of moving bits do these kinds of things to their
customers?

Dan
-------

-----------[000059][next][prev][last][first]----------------------------------------------------
Date:      Mon, 15-Dec-86 22:57:34 EST
From:      jbn@GLACIER.STANFORD.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   Regarding extended ACKs under congested conditions


       A proper TCP should, under heavily congested conditions reported
to it via ICMP Source Quench messages, throttle back to one packet outstanding
at a time, and become a stop-and-wait protocol.  At that point, the need for
extended ACks vanishes.  However, if one is operating over an uncongested
but error-prone link, such as a packet radio link without link-level error
control, something like extended ACKs would not be a bad idea.  

       Clearly fragmenting in the presence of errors introduces terrible
problems.  Some of these problems are alleviated if the IP sequence number
is retained over retransmissions, so that fragments generated from multiple
retransmissions of the same TCP segment can be reassembled.  I still argue,
though, that nothing should send a packet bigger than 576 bytes without
some external reason to know that it won't be fragmented.  The IP standard,
of course, says that all nets must handle 576 byte datagrams.  There's still
some antiquated hardware around that can't, but it's a minuscule percentage
of the net today.
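The arithmetic behind the 576-byte rule can be sketched as follows (a hypothetical helper, not code from any stack; assumes a 20-byte IP header and the rule that fragment data lengths are multiples of 8): a datagram of 576 bytes or less crosses a 576-byte-MTU net in one piece, while anything larger multiplies its exposure to loss, since losing any one fragment loses the whole datagram.

```c
#include <assert.h>

/* Hypothetical helper: how many fragments an IP datagram becomes
 * when sent over a link with the given MTU.  Assumes a 20-byte
 * header; fragment data lengths must be multiples of 8 bytes. */
static int frag_count(int datagram_len, int mtu)
{
    int payload  = datagram_len - 20;     /* data bytes to carry     */
    int per_frag = (mtu - 20) & ~7;       /* data bytes per fragment */

    if (datagram_len <= mtu)
        return 1;                         /* fits: no fragmentation  */
    return (payload + per_frag - 1) / per_frag;
}
```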

       But none of these matters addresses the real issue that is killing the
net, which is, of course, the combination of badly-behaved hosts and gateways
that can't defend themselves against overload.  But we've been through this
before, and I've written enough on that subject previously.  Still, it's 
embarrassing that it hasn't been fixed yet.

				John Nagle

-----------[000060][next][prev][last][first]----------------------------------------------------
Date:      Tue, 16-Dec-86 04:00:58 EST
From:      mtxuucp@seismo.CSS.GOV@elsie.UUCP (mt Xinu uucp login)
To:        mod.protocols.tcp-ip
Subject:   Submission for mod-protocols-tcp-ip

Path: elsie!cvl!mimsy!mark
From: mark@mimsy.UUCP (Mark Weiser)
Newsgroups: mod.protocols.tcp-ip
Subject: Re: Protocol Development on SUN 2 and 3 computers.
Message-ID: <4760@mimsy.UUCP>
Date: 16 Dec 86 07:39:39 GMT
References: <8612152127.AA05241@spam.istc.sri.com>
Reply-To: mark@mimsy.UUCP (Mark Weiser)
Organization: Computer Sci. Dept, U of Maryland, College Park, MD
Lines: 22

In article <8612152127.AA05241@spam.istc.sri.com> robert@SPAM.ISTC.SRI.COM (Robert Allen) writes:
>
>	I'm wondering if anyone has attempted to develop other
>protocols on Sun computers using the "open-architecture".  From
>initial inspection of the Sun document "Network Implementation"
>it appears that one can provide different protocol routines at
>various layers, and make use of the kernel hooks built into the
>system, thus providing socket-type interfaces for protocols other
>than the currently supported TCP/IP and UDP ...
>
Actually, Sun's open network architecture is just Berkeley 4.2bsd's
architecture, which is not so open if you try to do a really different
protocol, as we did with XNS.  4.3bsd Unix is much more amenable to
different protocols, and we have (twice now) stripped out Sun's
networking code and replaced it with the 4.3 code, in order to
include 4.3's XNS support.

-mark
-- 
Spoken: Mark Weiser 	ARPA:	mark@mimsy.cs.umd	Phone: +1-301-454-7817
CSNet:	mark@mimsy 	UUCP:	{seismo,allegra}!mimsy!mark
USPS: Computer Science Dept., University of Maryland, College Park, MD 20742

-----------[000061][next][prev][last][first]----------------------------------------------------
Date:      Tue, 16-Dec-86 04:17:00 EST
From:      VTTTELTY@FINFUN.BITNET.UUCP
To:        mod.protocols.tcp-ip
Subject:   Ethernet driver for INTEL SYS310


ETHERNET DRIVER NEEDED FOR INTEL SYS310/RMX86/iSBC552

I have implemented an IP-level gateway to be used between Ethernet type
LANs. The gateways communicate through an X.25 network.

The software runs on the Intel SYS310 (with RMX86) and
the LAN-interface card is Intel's iSBC552 (with INA961 software).

The problem is that the INA961 software understands only IEEE 802.3
conforming packets, while most other devices on the LAN talk Ethernet.
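For what it's worth, the two framings can share a wire because the 16-bit field after the two 6-byte addresses is a length (at most 1500) in IEEE 802.3 but a type code in Ethernet, and the assigned Ethernet types all lie above that range (0x0600 and up). A receiver-side sketch (hypothetical helper, not part of INA961):

```c
#include <assert.h>

enum frame_kind { FRAME_8023, FRAME_ETHERNET };

/* Hypothetical helper: classify a frame by the 16-bit field that
 * follows the destination and source addresses.  Values up to 1500
 * are an 802.3 length; Ethernet type codes start at 0x0600. */
static enum frame_kind classify(unsigned type_or_len)
{
    return (type_or_len <= 1500) ? FRAME_8023 : FRAME_ETHERNET;
}
```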

I am interested in knowing whether anyone else has had the same problem,
and whether anyone has found a good solution.
My current solution is to use a VAX (VMS) as a gateway between Ethernet and
IEEE 802.3.


Santtu Maki
Telecommunications laboratory
Technical Research Centre of Finland
Otakaari 7B
SF-02150 Espoo, Finland

VTTTELTY@FINFUN.BITNET

-----------[000063][next][prev][last][first]----------------------------------------------------
Date:      16 Dec 1986 08:04-EST
From:      CERF@A.ISI.EDU
To:        LYNCH@A.ISI.EDU
Cc:        malis@CCS.BBN.COM, hedrick@TOPAZ.RUTGERS.EDU, tcp-ip@SRI-NIC.ARPA
Subject:   Re: Arpanet outage
Dan,

the physical diversity problem is real and not easy to solve. First, any
particular supplier of long-haul service may not have a great deal of
physical diversity available. One would have to go to different vendors
to achieve it. The procurement of communication services by DoD does not
always (ever?) rate physical diversity at a high enough level of priority
to justify paying different amounts of money for the "same" service to
achieve diversity. I had a similar problem at MCI and had to specifically
work with the voice network engineering and operations staff to make them
understand how critical physical diversity was to a packet network.

With fiber, this is particularly a problem because the economics of fiber
make it very attractive to dig a trench and put lots of fiber in the one
trench rather than digging many different trenches.

Vint
-----------[000064][next][prev][last][first]----------------------------------------------------
Date:      Tue, 16-Dec-86 08:09:50 EST
From:      haverty@CCV.BBN.COM.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re: Arpanet outage


Dan,

It's misleading to think that you are ordering a "trunk" from a
communications supplier.  What you are buying is a plug at one
site through which you can pass bits, which appear by some magic
at the plug you have bought at the other site.  Assuming that
there is a physical wire between the two with any particular
characteristics other than what is specified in the service
offering (e.g., BER, speed, conditioning) is a dangerous
practice.

The nice network maps we all draw are topological, not physical.
We've often deduced physical characteristics from observed
behavior, and seen this kind of thing in many networks.  I
remember one in particular that had a microwave "sweeper" on a
tower, which swept a beam in a circle to hit N other microwave
stations around the horizon; the observed effect of this was a
propagation delay of about 100 msec., which is far too short for
any normal satellite trunk, and far too long for any normal
terrestrial circuit.  I also remember a backhoe in a farmer's
field in Illinois which dug up N of our carefully redundantized
trunks with a single flip of the scoop.

I think in most cases even if you figure out something about the
physical implementation, there is no guarantee that it will be
the same next week.   Vendors do offer some options that you can
specify, usually at extra cost, like a guaranteed terrestrial
routing to control delay; I think you can also specify separate
physical routes for different circuits in some cases.

Jack

-----------[000066][next][prev][last][first]----------------------------------------------------
Date:      Tue, 16-Dec-86 08:30:23 EST
From:      steve@BRILLIG.UMD.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re:  Protocol Development on SUN 2 and 3 computers.


   The Sun networking implementation is very close to being identical to
the standard 4.2BSD implementation.  Unfortunately, that makes development
of other protocols (unless they live on top of IP, in which case it's
not bad to do) more troublesome than you might expect as, if memory
serves, the networking implementation manual (also lifted from a 4.2BSD
manual) is incorrect (or, perhaps, misleading) in terms of talking
about its protocol independence.  There are all sorts of nasty AF_INET
dependencies lurking about in there, everywhere from the device drivers
to the network interface and routing code to NFS.  It is possible to
track all these dependencies down -- Chris Torek and James O'Toole
did it here when they did their Xerox NS implementation for 4.2BSD --
but it probably won't be a whole lot of fun to do.

   Back under Sun Unix 2.0 I hacked some XNS support into the kernel.
The way I did it was to remark, "gee, the interface between the network
code and the rest of the kernel isn't so bad" and stuff the whole of the
4.3BSD beta networking code into the kernel, throwing the Sun/4.2BSD code
out.  Depending on what you're doing, that may be a win.  I believe
that it was for me, as I didn't have to write a NS implementation if
I worked it that way.  Furthermore, changing the INET-dependent code
is probably not particularly hard, but you'll have to muck with the
innards of almost every kernel module in /sys/net*, and that could
be both tedious and frustrating.  Finally, the 4.3BSD networking
implementation is very much improved over the 4.2BSD one in the area
of TCP/IP, so you get a better TCP/IP in the bargain.

   Oh yes, and of course it looks easier to stuff an entirely new
(non-INET, non-NS) protocol into 4.3BSD than it does into 4.2BSD.
There can't be too many dependencies still lurking about, 'cause
the NS support works.

   Hope this is of use to you.

	-Steve

Spoken: Steve Miller 	ARPA:	steve@mimsy.umd.edu	Phone: +1-301-454-4251
CSNet:	steve@mimsy.umd.edu 	UUCP:	{seismo,allegra}!mimsy!steve
USPS: Computer Science Dept., University of Maryland, College Park, MD 20742

-----------[000067][next][prev][last][first]----------------------------------------------------
Date:      Tue, 16-Dec-86 09:13:04 EST
From:      pogran@CCQ.BBN.COM (Ken Pogran)
To:        mod.protocols.tcp-ip
Subject:   Re: Arpanet outage

Dan,

Your question to Andy, "How on earth does it come to happen that
7 'trunks' are 'routed' through one fiber optics cable" is more
properly addressed to the common carriers whose circuits the
ARPANET uses, rather than to the packet switching folks.

Here we are in the world of circuits leased from common carriers,
where economies of scale (for the carriers!) imply very high
degrees of multiplexing.  As the customer of a common carrier,
you specify the end points that you'd like for the circuit, and
the carrier routes it as he sees fit.  This is a personal
opinion, and not a BBNCC official position, but I think it's safe
to say that without spending a lot of extra money, and citing
critical national defense needs, it's going to be hard to get a
carrier to promise -- and achieve!  -- diverse physical routings
for a given set of leased circuits.  I would also venture the
opinion that there are lots of places in the U. S. where there's
only one physical transmission system coming into the area that
can provide the 56 Kb/s Digital Data Service that the ARPANET (and
MILNET, and ...) uses.

An implication of this is that almost any wide-area network
(doesn't matter whose, or what the technology is) is going to be
somewhat more vulnerable to having nodes isolated than its
logical map would suggest.

In fairness to the common carriers (are there any in the
audience?), the higher the degree of multiplexing, the more
well-protected the carrier's facilities are, and the more
attention is paid to issues of automatic backup (carriers call
this "protection") and longer-term rerouting of circuits when
there's an outage (carriers call this "restoration").  So an
outage of the type that's been discussed ought to be a very
low-probability event.  Kind of like widespread power failures ...

Hope this discussion helps.

Ken Pogran

-----------[000070][next][prev][last][first]----------------------------------------------------
Date:      Tue, 16-Dec-86 12:16:23 EST
From:      malis@CCS.BBN.COM (Andrew Malis)
To:        mod.protocols.tcp-ip
Subject:   Re: Arpanet outage

Dan,

One additional point - I believe most (if not all) of the
affected trunks have been in service since before AT&T started
using fiber optics.  AT&T has obviously been rerouting their
existing circuits as cheaper transmission paths become available.
For some of the older ARPANET/MILNET trunks, I'm sure they've
seen the complete transition from wires to microwave to fiber
(and who knows what else).

Andy

-----------[000072][next][prev][last][first]----------------------------------------------------
Date:      Tue, 16-Dec-86 13:01:55 EST
From:      melohn@SUN.COM.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re:  Protocol Development on SUN 2 and 3 computers.

Speaking as an admittedly biased source, the Sun Datacomm group has
managed to implement OSI (MAP/TOP) with a minimum of kernel changes
using the basic protosw, ioctl facility, and even the routing table
routines from the standard SunOS. We also have implementations of X.25,
SNA, and DECnet which all use the ifnet structure to layer different
protocol instances on top of different datalinks (HDLC, SDLC, 802.x).
As you might expect, the farther your protocol is from the TCP/IP
model, the less useful the standard networking code will be.

If your goal is XNS under Unix, it makes sense to use the 4.3
implementation.  As a general platform for protocol/network
development, we believe SunOS offers most of the facilities you
need.

-----------[000074][next][prev][last][first]----------------------------------------------------
Date:      16 Dec 1986 2058-PST (Tuesday)
From:      Keith Lantz <lantz@gregorio.stanford.edu>
To:        tcp-ip@sri-nic.ARPA
Subject:   Re:  Protocol Development on SUN 2 and 3 computers.
Folks might also be interested to know that protocol development in
Berkeley UNIX has been rather easy for years at CMU and Stanford, who
jointly developed what is referred to as the "packet filter".  A paper
on the packet filter, by Jeff Mogul, Mike Accetta, and Rick Rashid was
just presented at the Conference on Practical Software Development
Environments.  Perhaps the first thing to know is that it provides for
application-level protocol development, rather than kernel hacking.
For example, that's how our ``UNIX server'' for the V-System is
implemented.

We have been beating on Berkeley for several years to include same with
the BSD distributions, with little success.  Rumor has it that it IS
included in the 4.3 distribution, but as unsupported software.  I am
not offering to support it myself, but if you're sufficiently
interested and vocal enough, who knows who might respond...

Keith

Following is the man page for the 4.3 version of the packet filter.
The 4.2 version differs somewhat.  




ENET(4)             UNIX Programmer's Manual              ENET(4)



NAME
     enet - ethernet packet filter

SYNOPSIS
     pseudo-device enetfilter 64

DESCRIPTION
     The packet filter provides a raw interface to Ethernets and
     similar network data link layers.  Packets received that are
     not used by the kernel (i.e., to support the IP, ARP, and, on
     some systems, XNS protocols) are available through this
     mechanism.  The packet filter appears as a set of character
     special files, one per hardware interface.  Each enet file
     may be opened multiple times, allowing each interface to be
     used by many processes.  The total number of open ethernet
     files is limited to the value given in the kernel configura-
     tion; the example given in the SYNOPSIS above sets the limit
     to 64.

     The minor device numbers are associated with interfaces when
     the system is booted.  Minor device 0 is associated with the
     first Ethernet interface ``attached'', minor device 1 with
     the second, and so forth.  (These character special files
     are, for historical reasons, given the names /dev/enet0,
     /dev/eneta0, /dev/enetb0, etc.)

     Associated with each open instance of an enet file is a
     user-settable packet filter which is used to deliver incom-
     ing ethernet packets to the appropriate process.  Whenever a
     packet is received from the net, successive packet filters
     from the list of filters for all open enet files are applied
     to the packet.  When a filter accepts the packet, it is
     placed on the packet input queue of the associated file.  If
     no filters accept the packet, it is discarded.  The format
     of a packet filter is described below.
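The delivery rule above amounts to a linear scan over the open files' filters; a user-level sketch of the idea (hypothetical C types; the real filters are interpreted command lists, not C functions):

```c
#include <assert.h>

/* User-level sketch of the dispatch described above.  Each open enet
 * file contributes one predicate; the packet goes to the first that
 * accepts it, and is discarded (-1) if none do. */
typedef int (*filter_fn)(const unsigned char *pkt, int len);

static int match_type_0800(const unsigned char *pkt, int len)
{
    return len >= 14 && pkt[12] == 0x08 && pkt[13] == 0x00;
}

static int match_any(const unsigned char *pkt, int len)
{
    (void)pkt;
    return len > 0;
}

static int dispatch(filter_fn *filters, int nfilters,
                    const unsigned char *pkt, int len)
{
    int i;
    for (i = 0; i < nfilters; i++)
        if (filters[i](pkt, len))
            return i;              /* queued on file i's input queue */
    return -1;                     /* no filter accepted: discarded  */
}
```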

     Reads from these files return the next packet from a queue
     of packets that have matched the filter.  If insufficient
     buffer space to store the entire packet is specified in the
     read, the packet will be truncated and the trailing contents
     lost.  Writes to these devices transmit packets on the net-
     work, with each write generating exactly one packet.

     The packet filter currently supports a variety of different
     ``Ethernet'' data-link levels:

     3mb Ethernet   packets consist of 4 or more bytes with the
                    first byte specifying the source ethernet
                    address, the second byte specifying the des-
                    tination ethernet  address, and the next two
                    bytes specifying the packet type.  (Actually,
                    on the network the source and destination



Printed 9/6/86           8 October 1985                         1
                    addresses are in the opposite order.)

     byte-swapping 3mb Ethernet
                    packets consist of 4 or more bytes with the
                    first byte specifying the source ethernet
                    address, the second byte specifying the des-
                    tination ethernet address, and the next two
                    bytes specifying the packet type.  Each short
                    word (pair of bytes) is swapped from the net-
                    work byte order; this device type is only
                    provided as a concession to backwards-
                    compatibility.

     10mb Ethernet  packets consist of 14 or more bytes with the
                    first six bytes specifying the destination
                    ethernet address, the next six bytes the
                    source ethernet address, and the next two
                    bytes specifying the packet type.

     The remaining words are interpreted according to the packet
     type.  Note that 16-bit and 32-bit quantities may have to be
     byte-swapped (and possibly short-swapped) to be intelligible
     on a Vax.
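The 10mb layout above can also be written out as a C struct (an illustration only, not a declaration from <sys/enet.h>; the driver itself treats headers as uninterpreted data):

```c
#include <assert.h>
#include <stddef.h>

/* Illustration of the 10mb Ethernet header described above: 6-byte
 * destination, 6-byte source, 2-byte type; 14 bytes, no padding.
 * Not a declaration from <sys/enet.h>. */
struct ether10_header {
    unsigned char  dst[6];         /* destination ethernet address    */
    unsigned char  src[6];         /* source ethernet address         */
    unsigned short type;           /* packet type, network byte order */
} __attribute__((packed));
```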

     The packet filter mechanism does not know anything about the
     data portion of the packets it sends and receives.  The user
     must supply the headers for transmitted packets (although
     the system makes sure that the source address is correct)
     and the headers of received packets are delivered to the
     user.  The packet filters treat the entire packet, including
     headers, as uninterpreted data.

IOCTL CALLS
     In addition to FIONREAD, ten special ioctl calls may be
     applied to an open enet file.  The first two set and fetch
     parameters for the file and are of the form:

          #include <sys/types.h>
          #include <sys/enet.h>
          ioctl(fildes, code, param)
          struct eniocb *param;

     where param is defined in <sys/enet.h> as:

          struct eniocb
          {
                 u_char  en_addr;
                 u_char  en_maxfilters;
                 u_char  en_maxwaiting;
                 u_char  en_maxpriority;
                 long    en_rtout;
          };



     with the applicable codes being:

     EIOCGETP
          Fetch the parameters for this file.

     EIOCSETP
          Set the parameters for this file.

     The maximum filter length parameter en_maxfilters indicates
     the maximum possible packet filter command list length (see
     EIOCSETF below).  The maximum input wait queue size
     parameter en_maxwaiting indicates the maximum number of packets
     which may be queued for an ethernet file at one time (see
     EIOCSETW below).  The maximum priority parameter
     en_maxpriority indicates the highest filter priority which
     may be set for the file (see EIOCSETF below).  The en_addr
     field is no longer maintained by the driver; see EIOCDEVP
     below.

     The read timeout parameter en_rtout specifies the number of
     clock ticks to wait before timing out on a read request and
     returning an EOF.  This parameter is initialized to zero by
     open(2), indicating no timeout. If it is negative, then read
     requests will return an EOF immediately if there are no
     packets in the input queue.  (Note that all parameters
     except for the read timeout are read-only and are ignored
     when changed.)

     A different ioctl is used to get device parameters of the
     ethernet underlying the minor device.  It is of the form:

          #include <sys/types.h>
          #include <sys/enet.h>
          ioctl(fildes, EIOCDEVP, param)

     where param is defined in <sys/enet.h> as:

          struct endevp {
                 u_char   end_dev_type;
                 u_char   end_addr_len;
                 u_short  end_hdr_len;
                 u_short  end_MTU;
                 u_char   end_addr[EN_MAX_ADDR_LEN];
                 u_char   end_broadaddr[EN_MAX_ADDR_LEN];
          };

     The fields are:

     end_dev_type   Specifies the device type; currently one of
                    ENDT_3MB, ENDT_BS3MB or ENDT_10MB.

     end_addr_len   Specifies the address length in bytes (e.g.,



                    1 or 6).

     end_hdr_len    Specifies the total header length in bytes
                    (e.g., 4 or 14).

     end_MTU        Specifies the maximum packet size, including
                    header, in bytes.

     end_addr       The address of this interface; aligned so
                    that the low order byte of the address is the
                    first byte in the array.

     end_broadaddr  The hardware destination address for broad-
                    casts on this network.

     The next two calls enable and disable the input packet sig-
     nal mechanism for the file and are of the form:

          #include <sys/types.h>
          #include <sys/enet.h>
          ioctl(fildes, code, signp)
          u_int *signp;

     where signp is a pointer to a word containing the number of
     the signal to be sent when an input packet arrives and with
     the applicable codes being:

     EIOCENBS
          Enable the specified signal when an input packet is
          received for this file.  If the ENHOLDSIG flag (see
          EIOCMBIS below) is not set, further signals are
          automatically disabled whenever a signal is sent to
          prevent nesting and hence must be specifically re-
          enabled after processing.  When a signal number of 0 is
          supplied, this call is equivalent to EIOCINHS.

     EIOCINHS
          Disable any signal when an input packet is received for
          this file (the signp parameter is ignored).  This is
          the default when the file is first opened.

     The next two calls set and clear ``mode bits'' for the
     file and are of the form:

          #include <sys/types.h>
          #include <sys/enet.h>
          ioctl(fildes, code, bits)
          u_short *bits;

     where bits is a short word bit-mask specifying which bits to
     set or clear.  Currently, the only bit mask recognized is
     ENHOLDSIG, which (if clear) means that the driver should



     disable the effect of EIOCENBS once it has delivered a sig-
     nal.  Setting this bit means that you need use EIOCENBS only
     once.  (For historical reasons, the default is that ENHOLD-
     SIG is set.) The applicable codes are:

     EIOCMBIS
          Sets the specified mode bits

     EIOCMBIC
          Clears the specified mode bits

     Another ioctl call is used to set the maximum size of the
     packet input queue for an open enet file.  It is of the
     form:

          #include <sys/types.h>
          #include <sys/enet.h>
          ioctl(fildes, EIOCSETW, maxwaitingp)
          u_int *maxwaitingp;

     where maxwaitingp is a pointer to a word containing the
     input queue size to be set.  If this is greater than the
     maximum allowable size (see EIOCGETP above), it is set to the
     maximum, and if it is zero, it is set to a default value.
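
     The clamping rule just described can be stated directly in C.
     This is an illustrative model of the documented behavior; the
     function name and the sample limits in the checks are invented.

```c
#include <assert.h>

/* Model of how the driver interprets the value passed with EIOCSETW:
   values above the maximum allowable size clamp to the maximum, and
   zero selects a default. */
unsigned effective_queue_size(unsigned requested, unsigned max, unsigned dflt)
{
    if (requested == 0)
        return dflt;      /* zero means "use the default" */
    if (requested > max)
        return max;       /* clamp to the maximum allowable size */
    return requested;
}
```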

     Another ioctl call flushes the queue of incoming packets.
     It is of the form:

          #include <sys/types.h>
          #include <sys/enet.h>
          ioctl(fildes, EIOCFLUSH, 0)

     The final ioctl call is used to set the packet filter for an
     open enet file.  It is of the form:

          #include <sys/types.h>
          #include <sys/enet.h>
          ioctl(fildes, EIOCSETF, filter)
          struct enfilter *filter;

     where enfilter is defined in <sys/enet.h> as:

          struct enfilter
          {
                 u_char   enf_Priority;
                 u_char   enf_FilterLen;
                 u_short  enf_Filter[ENMAXFILTERS];
          };

     A packet filter consists of a priority, the filter command
     list length (in shortwords), and the filter command list
     itself.  Each filter command list specifies a sequence of
     actions which operate on an internal stack.  Each shortword
     of the command list specifies an action from the set {
     ENF_PUSHLIT, ENF_PUSHZERO, ENF_PUSHWORD+N } which respec-
     tively push the next shortword of the command list, zero, or
     shortword N of the incoming packet on the stack, and a
     binary operator from the set { ENF_EQ, ENF_NEQ, ENF_LT,
     ENF_LE, ENF_GT, ENF_GE, ENF_AND, ENF_OR, ENF_XOR } which
     then operates on the top two elements of the stack and
     replaces them with its result.  When both an action and
     operator are specified in the same shortword, the action is
     performed followed by the operation.

     The binary operator can also be from the set { ENF_COR,
     ENF_CAND, ENF_CNOR, ENF_CNAND }.  These are ``short-
     circuit'' operators, in that they terminate the execution of
     the filter immediately if the condition they are checking
     for is found, and continue otherwise.  All pop two elements
     from the stack and compare them for equality; ENF_CAND
     returns false if the result is false; ENF_COR returns true
     if the result is true; ENF_CNAND returns true if the result
     is false; ENF_CNOR returns false if the result is true.
     Unlike the other binary operators, these four do not leave a
     result on the stack, even if they continue.

     The short-circuit operators should be used when possible, to
     reduce the amount of time spent evaluating filters.  When
     they are used, you should also arrange the order of the
     tests so that the filter will succeed or fail as soon as
     possible; for example, checking the Socket field of a Pup
     packet is more likely to indicate failure than the packet
     type field.

     The special action ENF_NOPUSH and the special operator
     ENF_NOP can be used to only perform the binary operation or
     to only push a value on the stack.  Since both are (con-
     veniently) defined to be zero, indicating only an action
     actually specifies the action followed by ENF_NOP, and indi-
     cating only an operation actually specifies ENF_NOPUSH fol-
     lowed by the operation.

     After executing the filter command list, a non-zero value
     (true) left on top of the stack (or an empty stack) causes
     the incoming packet to be accepted for the corresponding
     enet file and a zero value (false) causes the packet to be
     passed through the next packet filter.  (If the filter exits
     as the result of a short-circuit operator, the top-of-stack
     value is ignored.) Specifying an undefined operation or
     action in the command list or performing an illegal opera-
     tion or action (such as pushing a shortword offset past the
     end of the packet or executing a binary operator with fewer
     than two shortwords on the stack) causes a filter to reject
     the packet.

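     As an illustration, the stack machine described above can be
     modeled in user space.  The opcode encodings below are invented
     for this sketch (the real values are in <sys/enet.h>), and only
     the push actions and two of the binary operators are modeled.

```c
/* User-space model of the packet-filter stack machine.  Encodings
   are invented: the operator lives in the high nibble, the action in
   the remaining bits, so action and operator can be OR'd together. */
#include <assert.h>

#define ENF_NOPUSH   0x0000   /* action: push nothing (zero, as noted)  */
#define ENF_PUSHLIT  0x0100   /* action: push next command shortword    */
#define ENF_PUSHZERO 0x0200   /* action: push zero                      */
#define ENF_PUSHWORD 0x0800   /* action: ENF_PUSHWORD+N pushes word N   */
#define ENF_NOP      0x0000   /* operator: none (zero, as noted)        */
#define ENF_EQ       0x1000   /* operator: pop two, push equality       */
#define ENF_AND      0x2000   /* operator: pop two, push bitwise and    */

/* Returns 1 if the packet is accepted, 0 if it is rejected. */
int filter_accepts(const unsigned short *cmd, int cmdlen,
                   const unsigned short *pkt, int pktwords)
{
    unsigned short stack[32];
    int sp = 0;
    int i;

    for (i = 0; i < cmdlen; i++) {
        unsigned short action = cmd[i] & 0x0FFF;
        unsigned short op = cmd[i] & 0xF000;

        if (sp >= 32)
            return 0;                       /* overflow: reject */
        if (action == ENF_PUSHLIT) {
            if (++i >= cmdlen)
                return 0;                   /* truncated list: reject */
            stack[sp++] = cmd[i];
        } else if (action == ENF_PUSHZERO) {
            stack[sp++] = 0;
        } else if (action >= ENF_PUSHWORD) {
            int n = action - ENF_PUSHWORD;
            if (n >= pktwords)
                return 0;                   /* offset past end: reject */
            stack[sp++] = pkt[n];
        }                                   /* ENF_NOPUSH: no push */

        if (op == ENF_EQ || op == ENF_AND) {
            unsigned short a, b;
            if (sp < 2)
                return 0;                   /* underflow: reject */
            b = stack[--sp];
            a = stack[--sp];
            stack[sp++] = (op == ENF_EQ) ? (a == b) : (a & b);
        }
    }
    return sp == 0 || stack[sp - 1] != 0;   /* empty stack accepts */
}
```

     The checks mirror the rules above: a true top-of-stack (or an
     empty stack) accepts, and an out-of-range word offset rejects.
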
     In an attempt to deal with the problem of overlapping and/or
     conflicting packet filters, the filters for each open enet
     file are ordered by the driver according to their priority
     (lowest priority is 0, highest is 255).  When processing
     incoming ethernet packets, filters are applied according to
     their priority (from highest to lowest) and for identical
     priority values according to their relative ``busyness''
     (the filter that has previously matched the most packets is
     checked first) until one or more filters accept the packet
     or all filters reject it and it is discarded.

     Filters at a priority of 2 or higher are called "high prior-
     ity" filters.  Once a packet is delivered to one of these
     "high priority" enet files, no further filters are examined,
     i.e. the packet is delivered only to the first enet file
     with a "high priority" filter which accepts the packet.  A
     packet may be delivered to more than one filter with a
     priority below 2; this might be useful, for example, in
     building replicated programs.  However, the use of low-
     priority filters imposes an additional cost on the system,
     as these filters each must be checked against all packets
     not accepted by a high-priority filter.

     The packet filter for an enet file is initialized with
     length 0 at priority 0 by open(2), and hence by default
     accepts all packets which no "high priority" filter is
     interested in.

     Priorities should be assigned so that, in general, the more
     packets a filter is expected to match, the higher its prior-
     ity.  This will prevent a lot of needless checking of pack-
     ets against filters that aren't likely to match them.
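
     The dispatch order described above -- higher priority first, and
     within a priority the filter that has previously matched the
     most packets first -- can be expressed as a comparison function.
     This is an illustrative sketch, not driver code; the structure
     and field names are invented.

```c
#include <assert.h>
#include <stdlib.h>

struct openfilter {
    unsigned char priority;       /* 0 (lowest) .. 255 (highest) */
    unsigned long matches;        /* packets previously matched  */
};

/* qsort(3) comparator yielding the order in which filters are tried. */
int filter_order(const void *a, const void *b)
{
    const struct openfilter *x = a;
    const struct openfilter *y = b;

    if (x->priority != y->priority)
        return (int)y->priority - (int)x->priority;   /* higher first */
    if (x->matches != y->matches)
        return y->matches > x->matches ? 1 : -1;      /* busier first */
    return 0;
}
```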

FILTER EXAMPLES
     The following filter would accept all incoming Pup packets
     on a 3mb ethernet with Pup types in the range 1-0100:

     struct enfilter f =
     {
         10, 19,                                 /* priority and length */
         ENF_PUSHWORD+1, ENF_PUSHLIT, 2,
                 ENF_EQ,                         /* packet type == PUP */
         ENF_PUSHWORD+3, ENF_PUSHLIT,
                 0xFF00, ENF_AND,                /* mask high byte */
         ENF_PUSHZERO, ENF_GT,                   /* PupType > 0 */
         ENF_PUSHWORD+3, ENF_PUSHLIT,
                 0xFF00, ENF_AND,                /* mask high byte */
         ENF_PUSHLIT, 0100, ENF_LE,              /* PupType <= 0100 */
         ENF_AND,                                /* 0 < PupType <= 0100 */
         ENF_AND                                 /* && packet type == PUP */
     };

     Note that shortwords, such as the packet type field, are
     byte-swapped and so the literals you compare them to must be
     byte-swapped. Also, although for this example the word
     offsets are constants, code that must run with either 3mb or
     10mb ethernets must use offsets that depend on the device
     type.
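
     Since literals must match the byte-swapped shortwords the driver
     delivers, it is clearer to compute them with a helper than to
     hand-swap constants.  A minimal sketch (htons(3N) and ntohs(3N)
     serve a similar purpose where available):

```c
#include <assert.h>

/* Exchange the two bytes of a shortword, e.g. 0x0002 -> 0x0200. */
unsigned short swab16(unsigned short v)
{
    return (unsigned short)((v << 8) | (v >> 8));
}
```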

     By taking advantage of the ability to specify both an action
     and operation in each word of the command list, the filter
     could be abbreviated to:

     struct enfilter f =
     {
         10, 14,                                     /* priority and length */
         ENF_PUSHWORD+1, ENF_PUSHLIT | ENF_EQ, 2,    /* packet type == PUP */
         ENF_PUSHWORD+3, ENF_PUSHLIT | ENF_AND,
                 0xFF00,                             /* mask high byte */
         ENF_PUSHZERO | ENF_GT,                      /* PupType > 0 */
         ENF_PUSHWORD+3, ENF_PUSHLIT | ENF_AND,
                 0xFF00,                             /* mask high byte */
         ENF_PUSHLIT | ENF_LE, 0100,                 /* PupType <= 0100 */
         ENF_AND,                                    /* 0 < PupType <= 0100 */
         ENF_AND                                     /* && packet type == PUP */
     };

     A different example shows the use of "short-circuit" opera-
     tors to create a more efficient filter.  This one accepts
     Pup packets (on a 3Mbit ethernet) with a Socket field of
     12345.  Note that we check the Socket field before the
     packet type field, since in most packets the Socket is not
     likely to match.

     struct enfilter f =
     {
         10, 9,                                      /* priority and length */
         ENF_PUSHWORD+7, ENF_PUSHLIT | ENF_CAND,
                 0,                                  /* High word of socket */
         ENF_PUSHWORD+8, ENF_PUSHLIT | ENF_CAND,
                 12345,                              /* Low word of socket */
         ENF_PUSHWORD+1, ENF_PUSHLIT | ENF_CAND,
                 2                                   /* packet type == Pup */
     };
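
     Putting the pieces together, building a filter for EIOCSETF
     might look as follows.  This is a hedged sketch: ENMAXFILTERS
     really comes from <sys/enet.h> (the value below is a
     placeholder), the struct definition is reproduced from the text
     above, and the helper function is invented.

```c
#include <assert.h>
#include <string.h>

#define ENMAXFILTERS 40                  /* placeholder value */

struct enfilter {                        /* as defined in <sys/enet.h> */
    unsigned char  enf_Priority;
    unsigned char  enf_FilterLen;
    unsigned short enf_Filter[ENMAXFILTERS];
};

/* Fill in a filter from an array of already-encoded command words;
   returns -1 if the command list is too long. */
int build_filter(struct enfilter *f, unsigned char prio,
                 const unsigned short *words, int n)
{
    if (n > ENMAXFILTERS)
        return -1;
    f->enf_Priority = prio;
    f->enf_FilterLen = (unsigned char)n;
    memcpy(f->enf_Filter, words, n * sizeof(unsigned short));
    return 0;
}
```

     On a real system the filled-in filter would then be installed
     on an open enet file with ioctl(fildes, EIOCSETF, &f).
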

SEE ALSO
     de(4), ec(4), en(4), il(4), enstat(8)

FILES
     /dev/enet{,a,b,c,...}0

BUGS
     The current implementation can only filter on words within
     the first "mbuf" of the packet; this is around 100 bytes (or
     50 words).

     Because packets are streams of bytes, yet the filters
     operate on short words, and standard network byte order is
     usually opposite from Vax byte order, the relational opera-
     tors ENF_LT, ENF_LE, ENF_GT, and ENF_GE are not all that
     useful.  Fortunately, they were not often used when the
     packets were treated as streams of shorts, so this is prob-
     ably not a severe problem.  If this becomes a severe prob-
     lem, a byte-swapping operator could be added.

     Many of the "features" of this driver are there for histori-
     cal reasons; the manual page could be a lot cleaner if these
     were left out.

HISTORY
     8-Oct-85  Jeffrey Mogul at Stanford University
          Revised to describe 4.3BSD version of driver.

     18-Oct-84  Jeffrey Mogul at Stanford University
          Added short-circuit operators, changed discussion of
          priorities to reflect new arrangement.

     18-Jan-84  Jeffrey Mogul at Stanford University
          Updated for 4.2BSD (device-independent) version,
          including documentation of all non-kernel ioctls.

     17-Nov-81  Mike Accetta (mja) at Carnegie-Mellon University
          Added mention of <sys/types.h> to include examples.

     29-Sep-81  Mike Accetta (mja) at Carnegie-Mellon University
          Changed to describe new EIOCSETW and EIOCFLUSH ioctl
          calls and the new multiple packet queuing features.

     12-Nov-80  Mike Accetta (mja) at Carnegie-Mellon University
          Added description of signal mechanism for input pack-
          ets.

     07-Mar-80  Mike Accetta (mja) at Carnegie-Mellon University
          Created.















Printed 9/6/86           8 October 1985                         9




-----------[000076][next][prev][last][first]----------------------------------------------------
Date:      16 Dec 1986 21:00:01 EST
From:      Dan Lynch <LYNCH@A.ISI.EDU>
To:        Ken Pogran <pogran@CCQ.BBN.COM>
Cc:        Dan Lynch <LYNCH@A.ISI.EDU>, Andrew Malis <malis@CCS.BBN.COM>, hedrick@TOPAZ.RUTGERS.EDU, tcp-ip@SRI-NIC.ARPA
Subject:   Re: Arpanet outage

Ken (and the others who have jumped into this),
Wow.  I guess this surfaced an issue that many of us had
taken for granted -- that those who are responsible for 
deploying the Arpanet and Milnet (and who know what else) have
been keeping the "diversity of routing" high enough to ensure
"reliability/survivability" of data links during even normal
times.  (There will always be a farmer in Illinois who digs before
asking.)  Anyway,  here's hoping we can benefit from this recent
minor debacle.  
One additional query of those in the know:  when the service was
restored did things just start to work again or did some manual
intervention get packets routing on their merry way?

Dan

PS.  I really like to have these "system level" discussions whenever
we are "lucky" enough to have serious disruptions of the underlying
technology.  They are rare events and we think we have designed
our methods to deal with them.  And we rarely have the guts to
blast ourselves out of the water to "test" them.
-------
-----------[000077][next][prev][last][first]----------------------------------------------------
Date:      Tue, 16-Dec-86 23:58:00 EST
From:      lantz@GREGORIO.STANFORD.EDU (Keith Lantz)
To:        mod.protocols.tcp-ip
Subject:   Re:  Protocol Development on SUN 2 and 3 computers.

Folks might also be interested to know that protocol development in
Berkeley UNIX has been rather easy for years at CMU and Stanford, who
jointly developed what is referred to as the "packet filter".  A paper
on the packet filter, by Jeff Mogul, Mike Accetta, and Rick Rashid was
just presented at the Conference on Practical Software Development
Environments.  Perhaps the first thing to know is that it provides for
application-level protocol development, rather than kernel hacking.
For example, that's how our ``UNIX server'' for the V-System is
implemented.

We have been beating on Berkeley for several years to include same with
the BSD distributions, with little success.  Rumor has it that it IS
included in the 4.3 distribution, but as unsupported software.  I am
not offering to support it myself, but if you're sufficiently
interested and vocal enough, who knows who might respond...

Keith

Following is the man page for the 4.3 version of the packet filter.
The 4.2 version differs somewhat.  




ENET(4)             UNIX Programmer's Manual              ENET(4)



NAME
     enet - ethernet packet filter

SYNOPSIS
     pseudo-device enetfilter 64

DESCRIPTION
     The packet filter provides a raw interface to Ethernets and
     similar network data link layers.  Packets received that are
     not used by the kernel (i.e., to support the IP, ARP, and,
     on some systems, XNS protocols) are available through this
     mechanism.  The packet filter appears as a set of character
     special files, one per hardware interface.  Each enet file
     may be opened multiple times, allowing each interface to be
     used by many processes.  The total number of open ethernet
     files is limited to the value given in the kernel configura-
     tion; the example given in the SYNOPSIS above sets the limit
     to 64.

     The minor device numbers are associated with interfaces when
     the system is booted.  Minor device 0 is associated with the
     first Ethernet interface ``attached'', minor device 1 with
     the second, and so forth.  (These character special files
     are, for historical reasons, given the names /dev/enet0,
     /dev/eneta0, /dev/enetb0, etc.)

     Associated with each open instance of an enet file is a
     user-settable packet filter which is used to deliver incom-
     ing ethernet packets to the appropriate process.  Whenever a
     packet is received from the net, successive packet filters
     from the list of filters for all open enet files are applied
     to the packet.  When a filter accepts the packet, it is
     placed on the packet input queue of the associated file.  If
     no filters accept the packet, it is discarded.  The format
     of a packet filter is described below.

     Reads from these files return the next packet from a queue
     of packets that have matched the filter.  If insufficient
     buffer space to store the entire packet is specified in the
     read, the packet will be truncated and the trailing contents
     lost.  Writes to these devices transmit packets on the net-
     work, with each write generating exactly one packet.

     The packet filter currently supports a variety of different
     ``Ethernet'' data-link levels:

     3mb Ethernet   packets consist of 4 or more bytes with the
                    first byte specifying the source ethernet
                    address, the second byte specifying the des-
                    tination ethernet  address, and the next two
                    bytes specifying the packet type.  (Actually,
                    on the network the source and destination
                    addresses are in the opposite order.)

     byte-swapping 3mb Ethernet
                    packets consist of 4 or more bytes with the
                    first byte specifying the source ethernet
                    address, the second byte specifying the des-
                    tination ethernet address, and the next two
                    bytes specifying the packet type.  Each short
                    word (pair of bytes) is swapped from the net-
                    work byte order; this device type is only
                    provided as a concession to backwards-
                    compatibility.

     10mb Ethernet  packets consist of 14 or more bytes with the
                    first six bytes specifying the destination
                    ethernet address, the next six bytes the
                    source ethernet address, and the next two
                    bytes specifying the packet type.

     The remaining words are interpreted according to the packet
     type.  Note that 16-bit and 32-bit quantities may have to be
     byteswapped (and possibly short-swapped) to be intelligible
     on a Vax.
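
     Because the 3mb and 10mb headers differ in length, the shortword
     offset of a given field depends on the device type: the
     packet-type field is shortword 1 on a 3mb ethernet (4-byte
     header) but shortword 6 on a 10mb ethernet (14-byte header).  A
     sketch of the derivation (the ENDT_* values below are
     placeholders for those in <sys/enet.h>):

```c
#include <assert.h>

#define ENDT_3MB   0    /* placeholder values */
#define ENDT_BS3MB 1
#define ENDT_10MB  2

/* Shortword offset of the packet-type field, derived from the header
   length: the type field is the last shortword of the header. */
int packet_type_word(int dev_type)
{
    int hdr_len = (dev_type == ENDT_10MB) ? 14 : 4;   /* bytes */
    return hdr_len / 2 - 1;                           /* shortwords */
}
```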

     The packet filter mechanism does not know anything about the
     data portion of the packets it sends and receives.  The user
     must supply the headers for transmitted packets (although
     the system makes sure that the source address is correct)
     and the headers of received packets are delivered to the
     user.  The packet filters treat the entire packet, including
     headers, as uninterpreted data.

IOCTL CALLS
     In addition to FIONREAD, ten special ioctl calls may be
     applied to an open enet file.  The first two set and fetch
     parameters for the file and are of the form:

          #include <sys/types.h>
          #include <sys/enet.h>
          ioctl(fildes, code, param)
          struct eniocb *param;

     where param is defined in <sys/enet.h> as:

          struct eniocb
          {
                 u_char  en_addr;
                 u_char  en_maxfilters;
                 u_char  en_maxwaiting;
                 u_char  en_maxpriority;
                 long    en_rtout;
          };

     with the applicable codes being:

     EIOCGETP
          Fetch the parameters for this file.

     EIOCSETP
          Set the parameters for this file.

     The maximum filter length parameter en_maxfilters indicates
     the maximum possible packet filter command list length (see
     EIOCSETF below).  The maximum input wait queue size
     parameter en_maxwaiting indicates the maximum number of packets
     which may be queued for an ethernet file at one time (see
     EIOCSETW below).  The maximum priority parameter
     en_maxpriority indicates the highest filter priority which
     may be set for the file (see EIOCSETF below).  The en_addr
     field is no longer maintained by the driver; see EIOCDEVP
     below.

     The read timeout parameter en_rtout specifies the number of
     clock ticks to wait before timing out on a read request and
     returning an EOF.  This parameter is initialized to zero by
     open(2), indicating no timeout. If it is negative, then read
     requests will return an EOF immediately if there are no
     packets in the input queue.  (Note that all parameters
     except for the read timeout are read-only and are ignored
     when changed.)

     A different ioctl is used to get device parameters of the
     ethernet underlying the minor device.  It is of the form:

          #include <sys/types.h>
          #include <sys/enet.h>
          ioctl(fildes, EIOCDEVP, param)

     where param is defined in <sys/enet.h> as:

          struct endevp {
                 u_char   end_dev_type;
                 u_char   end_addr_len;
                 u_short  end_hdr_len;
                 u_short  end_MTU;
                 u_char   end_addr[EN_MAX_ADDR_LEN];
                 u_char   end_broadaddr[EN_MAX_ADDR_LEN];
          };

     The fields are:

     end_dev_type   Specifies the device type; currently one of
                    ENDT_3MB, ENDT_BS3MB or ENDT_10MB.

     end_addr_len   Specifies the address length in bytes (e.g.,
                    1 or 6).

     end_hdr_len    Specifies the total header length in bytes
                    (e.g., 4 or 14).

     end_MTU        Specifies the maximum packet size, including
                    header, in bytes.

     end_addr       The address of this interface; aligned so
                    that the low order byte of the address is the
                    first byte in the array.

     end_broadaddr  The hardware destination address for broad-
                    casts on this network.
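
     The end_MTU and end_hdr_len fields together give the usable
     payload size for a packet.  An illustrative sketch (the struct
     is reproduced from the text above; EN_MAX_ADDR_LEN's value here
     is a placeholder, and the helper function is invented):

```c
#include <assert.h>

#define EN_MAX_ADDR_LEN 8            /* placeholder value */

struct endevp {                      /* as defined in <sys/enet.h> */
    unsigned char  end_dev_type;
    unsigned char  end_addr_len;
    unsigned short end_hdr_len;
    unsigned short end_MTU;
    unsigned char  end_addr[EN_MAX_ADDR_LEN];
    unsigned char  end_broadaddr[EN_MAX_ADDR_LEN];
};

/* Data bytes available per packet once the header is accounted for,
   since end_MTU includes the header. */
int max_payload(const struct endevp *p)
{
    return (int)p->end_MTU - (int)p->end_hdr_len;
}
```

     On a real system the struct would first be filled in with
     ioctl(fildes, EIOCDEVP, &p).
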

     The next two calls enable and disable the input packet sig-
     nal mechanism for the file and are of the form:

          #include <sys/types.h>
          #include <sys/enet.h>
          ioctl(fildes, code, signp)
          u_int *signp;

     where signp is a pointer to a word containing the number of
     the signal to be sent when an input packet arrives and with
     the applicable codes being:

     EIOCENBS
          Enable the specified signal when an input packet is
          received for this file.  If the ENHOLDSIG flag (see
          EIOCMBIS below) is not set, further signals are
          automatically disabled whenever a signal is sent to
          prevent nesting and hence must be specifically re-
          enabled after processing.  When a signal number of 0 is
          supplied, this call is equivalent to EIOCINHS.

     EIOCINHS
          Disable any signal when an input packet is received for
          this file (the signp parameter is ignored).  This is
          the default when the file is first opened.

     The next two calls set and clear ``mode bits'' for the
     file and are of the form:

          #include <sys/types.h>
          #include <sys/enet.h>
          ioctl(fildes, code, bits)
          u_short *bits;

     where bits is a short word bit-mask specifying which bits to
     set or clear.  Currently, the only bit mask recognized is
     ENHOLDSIG, which (if clear) means that the driver should
     disable the effect of EIOCENBS once it has delivered a sig-
     nal.  Setting this bit means that you need use EIOCENBS only
     once.  (For historical reasons, the default is that ENHOLD-
     SIG is set.) The applicable codes are:

     EIOCMBIS
          Sets the specified mode bits

     EIOCMBIC
          Clears the specified mode bits

     Another ioctl call is used to set the maximum size of the
     packet input queue for an open enet file.  It is of the
     form:

          #include <sys/types.h>
          #include <sys/enet.h>
          ioctl(fildes, EIOCSETW, maxwaitingp)
          u_int *maxwaitingp;

     where maxwaitingp is a pointer to a word containing the
     input queue size to be set.  If this is greater than the
     maximum allowable size (see EIOCGETP above), it is set to the
     maximum, and if it is zero, it is set to a default value.

     Another ioctl call flushes the queue of incoming packets.
     It is of the form:

          #include <sys/types.h>
          #include <sys/enet.h>
          ioctl(fildes, EIOCFLUSH, 0)

     The final ioctl call is used to set the packet filter for an
     open enet file.  It is of the form:

          #include <sys/types.h>
          #include <sys/enet.h>
          ioctl(fildes, EIOCSETF, filter)
          struct enfilter *filter;

     where enfilter is defined in <sys/enet.h> as:

          struct enfilter
          {
                 u_char   enf_Priority;
                 u_char   enf_FilterLen;
                 u_short  enf_Filter[ENMAXFILTERS];
          };

     A packet filter consists of a priority, the filter command
     list length (in shortwords), and the filter command list
     itself.  Each filter command list specifies a sequence of
     actions which operate on an internal stack.  Each shortword
     of the command list specifies an action from the set {
     ENF_PUSHLIT, ENF_PUSHZERO, ENF_PUSHWORD+N } which respec-
     tively push the next shortword of the command list, zero, or
     shortword N of the incoming packet on the stack, and a
     binary operator from the set { ENF_EQ, ENF_NEQ, ENF_LT,
     ENF_LE, ENF_GT, ENF_GE, ENF_AND, ENF_OR, ENF_XOR } which
     then operates on the top two elements of the stack and
     replaces them with its result.  When both an action and
     operator are specified in the same shortword, the action is
     performed followed by the operation.

     The binary operator can also be from the set { ENF_COR,
     ENF_CAND, ENF_CNOR, ENF_CNAND }.  These are ``short-
     circuit'' operators, in that they terminate the execution of
     the filter immediately if the condition they are checking
     for is found, and continue otherwise.  All pop two elements
     from the stack and compare them for equality; ENF_CAND
     returns false if the result is false; ENF_COR returns true
     if the result is true; ENF_CNAND returns true if the result
     is false; ENF_CNOR returns false if the result is true.
     Unlike the other binary operators, these four do not leave a
     result on the stack, even if they continue.

     The short-circuit operators should be used when possible, to
     reduce the amount of time spent evaluating filters.  When
     they are used, you should also arrange the order of the
     tests so that the filter will succeed or fail as soon as
     possible; for example, checking the Socket field of a Pup
     packet is more likely to indicate failure than the packet
     type field.

     The special action ENF_NOPUSH and the special operator
     ENF_NOP can be used to only perform the binary operation or
     to only push a value on the stack.  Since both are (con-
     veniently) defined to be zero, indicating only an action
     actually specifies the action followed by ENF_NOP, and indi-
     cating only an operation actually specifies ENF_NOPUSH fol-
     lowed by the operation.

     After executing the filter command list, a non-zero value
     (true) left on top of the stack (or an empty stack) causes
     the incoming packet to be accepted for the corresponding
     enet file and a zero value (false) causes the packet to be
     passed through the next packet filter.  (If the filter exits
     as the result of a short-circuit operator, the top-of-stack
     value is ignored.) Specifying an undefined operation or
     action in the command list or performing an illegal opera-
     tion or action (such as pushing a shortword offset past the
     end of the packet or executing a binary operator with fewer
     than two shortwords on the stack) causes a filter to reject
     the packet.

     In an attempt to deal with the problem of overlapping and/or
     conflicting packet filters, the filters for each open enet
     file are ordered by the driver according to their priority
     (lowest priority is 0, highest is 255).  When processing
     incoming ethernet packets, filters are applied according to
     their priority (from highest to lowest) and for identical
     priority values according to their relative ``busyness''
     (the filter that has previously matched the most packets is
     checked first) until one or more filters accept the packet
     or all filters reject it and it is discarded.

     Filters at a priority of 2 or higher are called "high prior-
     ity" filters.  Once a packet is delivered to one of these
     "high priority" enet files, no further filters are examined,
     i.e. the packet is delivered only to the first enet file
     with a "high priority" filter which accepts the packet.  A
     packet may be delivered to more than one filter with a
     priority below 2; this might be useful, for example, in
     building replicated programs.  However, the use of low-
     priority filters imposes an additional cost on the system,
     as these filters each must be checked against all packets
     not accepted by a high-priority filter.

     The packet filter for an enet file is initialized with
     length 0 at priority 0 by open(2), and hence by default
     accepts all packets which no "high priority" filter is
     interested in.

     Priorities should be assigned so that, in general, the more
     packets a filter is expected to match, the higher its prior-
     ity.  This will prevent a lot of needless checking of pack-
     ets against filters that aren't likely to match them.

FILTER EXAMPLES
     The following filter would accept all incoming Pup packets
     on a 3mb ethernet with Pup types in the range 1-0100:

     struct enfilter f =
     {
         10, 19,                                 /* priority and length */
         ENF_PUSHWORD+1, ENF_PUSHLIT, 2,
                 ENF_EQ,                         /* packet type == PUP */
         ENF_PUSHWORD+3, ENF_PUSHLIT,
                 0xFF00, ENF_AND,                /* mask high byte */
         ENF_PUSHZERO, ENF_GT,                   /* PupType > 0 */
         ENF_PUSHWORD+3, ENF_PUSHLIT,
                 0xFF00, ENF_AND,                /* mask high byte */
         ENF_PUSHLIT, 0100, ENF_LE,              /* PupType <= 0100 */
         ENF_AND,                                /* 0 < PupType <= 0100 */
         ENF_AND                                 /* && packet type == PUP */
     };




Printed 9/6/86           8 October 1985                         7






ENET(4)             UNIX Programmer's Manual              ENET(4)



     Note that shortwords, such as the packet type field, are
     byte-swapped and so the literals you compare them to must be
     byte-swapped. Also, although for this example the word
     offsets are constants, code that must run with either 3mb or
     10mb ethernets must use offsets that depend on the device
     type.

     By taking advantage of the ability to specify both an action
     and operation in each word of the command list, the filter
     could be abbreviated to:

     struct enfilter f =
     {
         10, 14,                                     /* priority and length */
         ENF_PUSHWORD+1, ENF_PUSHLIT | ENF_EQ, 2,    /* packet type == PUP */
         ENF_PUSHWORD+3, ENF_PUSHLIT | ENF_AND,
                 0xFF00,                             /* mask high byte */
         ENF_PUSHZERO | ENF_GT,                      /* PupType > 0 */
         ENF_PUSHWORD+3, ENF_PUSHLIT | ENF_AND,
                 0xFF00,                             /* mask high byte */
         ENF_PUSHLIT | ENF_LE, 0100,                 /* PupType <= 0100 */
         ENF_AND,                                    /* 0 < PupType <= 0100 */
         ENF_AND                                     /* && packet type == PUP */
     };

     A different example shows the use of "short-circuit" opera-
     tors to create a more efficient filter.  This one accepts
     Pup packets (on a 3Mbit ethernet) with a Socket field of
     12345.  Note that we check the Socket field before the
     packet type field, since in most packets the Socket is not
     likely to match.

     struct enfilter f =
     {
         10, 9,                                      /* priority and length */
         ENF_PUSHWORD+7, ENF_PUSHLIT | ENF_CAND,
                 0,                                  /* High word of socket */
         ENF_PUSHWORD+8, ENF_PUSHLIT | ENF_CAND,
                 12345,                              /* Low word of socket */
         ENF_PUSHWORD+1, ENF_PUSHLIT | ENF_CAND,
                 2                                   /* packet type == Pup */
     };

SEE ALSO
     de(4), ec(4), en(4), il(4), enstat(8)

FILES
     /dev/enet{,a,b,c,...}0

BUGS
     The current implementation can only filter on words within
     the first "mbuf" of the packet; this is around 100 bytes (or
     50 words).

     Because packets are streams of bytes, yet the filters
     operate on short words, and standard network byte order is
     usually opposite from Vax byte order, the relational opera-
     tors ENF_LT, ENF_LE, ENF_GT, and ENF_GE are not all that
     useful.  Fortunately, they were not often used when the
     packets were treated as streams of shorts, so this is prob-
     ably not a severe problem.  If this becomes a severe prob-
     lem, a byte-swapping operator could be added.

     Many of the "features" of this driver are there for histori-
     cal reasons; the manual page could be a lot cleaner if these
     were left out.

HISTORY
     8-Oct-85  Jeffrey Mogul at Stanford University
          Revised to describe 4.3BSD version of driver.

     18-Oct-84  Jeffrey Mogul at Stanford University
          Added short-circuit operators, changed discussion of
          priorities to reflect new arrangement.

     18-Jan-84  Jeffrey Mogul at Stanford University
          Updated for 4.2BSD (device-independent) version,
          including documentation of all non-kernel ioctls.

     17-Nov-81  Mike Accetta (mja) at Carnegie-Mellon University
          Added mention of <sys/types.h> to include examples.

     29-Sep-81  Mike Accetta (mja) at Carnegie-Mellon University
          Changed to describe new EIOCSETW and EIOCFLUSH ioctl
          calls and the new multiple packet queuing features.

     12-Nov-80  Mike Accetta (mja) at Carnegie-Mellon University
          Added description of signal mechanism for input pack-
          ets.

     07-Mar-80  Mike Accetta (mja) at Carnegie-Mellon University
          Created.


-----------[000078][next][prev][last][first]----------------------------------------------------
Date:      Wed, 17-Dec-86 09:34:12 EST
From:      malis@CCS.BBN.COM (Andrew Malis)
To:        mod.protocols.tcp-ip
Subject:   Re: Arpanet outage

Dan,

To answer your question: when service was restored, the PSNs
automatically brought the trunks back up and reconnected the
network together.  No manual intervention required.

Andy

P.S. Here's another good topic to rant and rave about:

Don't you hate it when hosts keep messages sitting in queues
for days, and conversations get out of sync?  Take, for example,
this message we all just received this morning:

Received: from SRI-NIC.ARPA by CCS.BBN.COM ; 17 Dec 86 08:48:38 EST
Received: from vax.darpa.mil by SRI-NIC.ARPA with TCP; Wed 17 Dec 86 00:03:07-PST
Received: by vax.darpa.mil (4.12/4.7)
	id AA19852; Mon, 15 Dec 86 05:42:39 est
Date: Mon 15 Dec 86 05:42:30-EST
From: Dennis G. Perry <PERRY@VAX.DARPA.MIL>
Subject: Re: Arpanet outage

It took about 42 hours for vax.darpa.mil to send it to
SRI-NIC.ARPA, and another 9 hours to make it to me.

-----------[000079][next][prev][last][first]----------------------------------------------------
Date:      Wed, 17 Dec 86  9:34:12 EST
From:      Andrew Malis <malis@ccs.bbn.com>
To:        Dan Lynch <LYNCH@a.isi.edu>
Cc:        Ken Pogran <pogran@ccq.bbn.com>, Andrew Malis <malis@ccs.bbn.com>, hedrick@topaz.rutgers.edu, tcp-ip@sri-nic.arpa
Subject:   Re: Arpanet outage
Dan,

To answer your question: when service was restored, the PSNs
automatically brought the trunks back up and reconnected the
network together.  No manual intervention required.

Andy

P.S. Here's another good topic to rant and rave about:

Don't you hate it when hosts keep messages sitting in queues
for days, and conversations get out of sync?  Take, for example,
this message we all just received this morning:

Received: from SRI-NIC.ARPA by CCS.BBN.COM ; 17 Dec 86 08:48:38 EST
Received: from vax.darpa.mil by SRI-NIC.ARPA with TCP; Wed 17 Dec 86 00:03:07-PST
Received: by vax.darpa.mil (4.12/4.7)
	id AA19852; Mon, 15 Dec 86 05:42:39 est
Date: Mon 15 Dec 86 05:42:30-EST
From: Dennis G. Perry <PERRY@VAX.DARPA.MIL>
Subject: Re: Arpanet outage

It took about 42 hours for vax.darpa.mil to send it to
SRI-NIC.ARPA, and another 9 hours to make it to me.
-----------[000080][next][prev][last][first]----------------------------------------------------
Date:      Wed, 17-Dec-86 10:08:58 EST
From:      cam@ACC-SB-UNIX.ARPA (Chris Markle)
To:        mod.protocols.tcp-ip
Subject:   Maintaining Statistics for TCP/IP Implementations


On the last day of the TCP/IP Implementor's Workshop held in August 86,
a gentleman from BBN spoke about network monitoring protocols. In this
discussion was mentioned a "list" of statistics that TCP/IP
implementations could maintain for internal use by the implementation
or for query by a network monitor device via some sort of network
monitoring protocol.  The BBN gent was asked if he would post this
list of statistics on this mailing list; he seemed to imply that he
would.

Did this information get posted and I missed it? If so, does anyone
know what msg number it was or what date it was posted? If not, would 
the folks at BBN be interested in posting it?

Also, if anyone else has notions on what sort of statistics would be
worthwhile for a TCP/IP (etc.) implementation to maintain, please send
mail to me directly and I will summarize the responses in a later 
posting to this mailing list.

Thanks in advance for any help in this matter.

Chris Markle (cam@acc-sb-unix) (301-290-8100)

-----------[000081][next][prev][last][first]----------------------------------------------------
Date:      17 Dec 1986 1419-PST (Wednesday)
From:      Keith Lantz <lantz@gregorio.stanford.edu>
To:        tcp-ip@sri-nic.ARPA
Cc:        
Subject:   Stanford/CMU packet filter
It turns out that the paper I referred to has not in fact been
published in the open literature.  Although I thought I saw a reference
to the effect that it was "to be presented", it must have read
"submitted to".  In any event, I'm sure the previously cited authors
would be delighted to honor any requests for the document.

Keith
-----------[000082][next][prev][last][first]----------------------------------------------------
Date:      Wed, 17-Dec-86 15:05:58 EST
From:      Okuno@SUMEX-AIM.STANFORD.EDU (Hiroshi "Gitchang" Okuno)
To:        mod.protocols.tcp-ip
Subject:   Need information on NFS

I am posting this request for my colleague.

He would like to know about activity on the Network File System, which
is a very important application, like Telnet, FTP, SMTP, and Finger.  He
knows that Sun has proposed its own NFS and that it is a de facto
standard.  Any information on NFS (common or proper noun) is welcome.

Thanks in advance.

- Gitchang -
-------

-----------[000083][next][prev][last][first]----------------------------------------------------
Date:      Wed, 17-Dec-86 16:36:22 EST
From:      bill@ucbvax.Berkeley.EDU@tifsie.UUCP
To:        mod.protocols.tcp-ip
Subject:   Submission for mod-protocols-tcp-ip

Path: tifsie!bill
From: bill@tifsie.UUCP (Bill Stoltz)
Newsgroups: mod.protocols.tcp-ip,comp.dcom.lans
Subject: VM Interface for TCP/IP
Keywords: VM, TCP/IP, network
Message-ID: <278@tifsie.UUCP>
Date: 17 Dec 86 21:39:07 GMT
Article-I.D.: tifsie.278
Posted: Wed Dec 17 15:39:07 1986
Organization: TI Process Automation Center, Dallas
Lines: 54
 
 
 
 
I just received my copy of "The IBM Software Catalog for VM Systems"
in the mail today.  The following is an excerpt from this catalog.
 
 
VM Interface Program for TCP/IP
-------------------------------
 
Program Number:		5798-DRG
 
Purpose:		This program offering provides the VM/SP user with
			the ability to participate in a network using the
			TCP/IP transmission protocol.  This includes the
			ability to transfer files, send mail, and log on 
			remotely to a VM host.  The Program uses a 
			System/370 channel, attached to a front-end Series/1
			with EDX or RPS, or a Device Access Control Unit.
			Sample programs are provided for these system
			configurations to permit interfacing to local area
			networks including X.25, ProNET, or Ethernet.
 
Manuals:		Availability Notice, GB13-7712.
 
 
I was wondering if anyone has had any experience
with this product and what likes and dislikes they have with this
product.
 
I have not talked with my IBM salesman yet, so I don't have any other
information (cost, availability, etc). 
 
If you mail any requests to me I will be glad to send you any additional
information when I get it.  If there is enough demand I will post a summary
to the net.
 
Thanks for any help.
 
Bill
 
 
 
-----------------
 
Bill Stoltz
Texas Instruments
Process Automation Center
P.O. Box 655012, M/S 3635
Dallas, TX 75243
 
UUCP: 	{uiucdcs!convex!smu, {rice, sun!texsun}!ti-csl}!tifsie!bill
DECNET: TIFSIE::bill
Voice:	(214) 995-2786

-----------[000084][next][prev][last][first]----------------------------------------------------
Date:      Wed, 17-Dec-86 17:19:00 EST
From:      lantz@GREGORIO.STANFORD.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   Stanford/CMU packet filter

It turns out that the paper I referred to has not in fact been
published in the open literature.  Although I thought I saw a reference
to the effect that it was "to be presented", it must have read
"submitted to".  In any event, I'm sure the previously cited authors
would be delighted to honor any requests for the document.

Keith

-----------[000085][next][prev][last][first]----------------------------------------------------
Date:      Wed, 17-Dec-86 21:50:59 EST
From:      LYNCH@A.ISI.EDU (Dan Lynch)
To:        mod.protocols.tcp-ip
Subject:   Re: Maintaining Statistics for TCP/IP Implementations

Chris,  It was Charles Lynn at BBN (CLynn@BBN.COM) who gave that network
statistics presentation.  He is leading the session on Network Management
in the March 87 TCP/IP Interoperability conference and like Diogenes is
still looking for the truth...  I remember his presentation and was
awestruck at the level of implementation detail he sought in order
to get a suitable baseline of statistics for monitoring the health of
the "network".  (All of the statistics he asked for do essentially
exist in any real implementation because of the retransmission 
requirements of TCP.)  

Dan
-------

-----------[000086][next][prev][last][first]----------------------------------------------------
Date:      17 Dec 1986 21:50:59 EST
From:      Dan Lynch <LYNCH@A.ISI.EDU>
To:        cam@ACC-SB-UNIX.ARPA (Chris Markle)
Cc:        tcp-ip@SRI-NIC.ARPA, LYNCH@A.ISI.EDU
Subject:   Re: Maintaining Statistics for TCP/IP Implementations
Chris,  It was Charles Lynn at BBN (CLynn@BBN.COM) who gave that network
statistics presentation.  He is leading the session on Network Management
in the March 87 TCP/IP Interoperability conference and like Diogenes is
still looking for the truth...  I remember his presentation and was
awestruck at the level of implementation detail he sought in order
to get a suitable baseline of statistics for monitoring the health of
the "network".  (All of the statistics he asked for do essentially
exist in any real implementation because of the retransmission 
requirements of TCP.)  

Dan
-------
-----------[000087][next][prev][last][first]----------------------------------------------------
Date:      Wed, 17-Dec-86 23:41:56 EST
From:      pdb@SEI.CMU.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   Help using ICMP under Ultrix ?

Can anyone give me some pointers about how to access ICMP under Ultrix/4.2BSD ?
I'm trying to write a program to listen for ICMP messages and gather statistics
about them.  I have been able to get Echo Reply messages, but I haven't been
able to get Echo Request, Host/Network unreachable, or routing redirects.

--Pat.

-----------[000088][next][prev][last][first]----------------------------------------------------
Date:      Thu, 18-Dec-86 10:25:00 EST
From:      sned@PEGASUS.SCRC.SYMBOLICS.COM.UUCP
To:        mod.protocols.tcp-ip
Subject:   Was: Protocol Development on SUN 2 and 3 computers.

From an equally biased source with a different viewpoint....

    Date: Tue, 16 Dec 86 10:01:55 PST
    From: melohn@Sun.COM (Bill Melohn)

    Speaking as an admittedly biased source, the Sun Datacomm group has
    managed to implement OSI (MAP/TOP) with a minimum of kernel changes
    using the basic protosw, ioctl facility, and even the routing table
    routines from the standard SunOS. We also have implementations of X.25,
    SNA, and DECnet which all use the ifnet structure to layer different
    protocol instances on top of different datalinks (HDLC, SDLC, 802.x).
    As you might expect, the farther your protocol is from the TCP/IP
    model, the less useful the standard networking code will be.

    If your goal is XNS under Unix, it makes sense to use the 4.3
    implementation.
I can't really disagree with that.
			As a general platform for protocol/network
    development, we believe SunOS offers most of the facilities you
    need.

If your goal is general protocol/network development or cross-system
integration, the Symbolics Lisp Machine's "Generic Network System" is
probably 10x more powerful.  Use UN*X to develop UN*X software,
particularly when there are no deep design issues involved.  Use
something far better if there are any hard or unresolved issues to be
solved.  I could easily justify my 10x claim, but this conversation
wasn't about that.

Steve Sneddon
Manager of Networks and Communications
Symbolics, Inc.

-----------[000089][next][prev][last][first]----------------------------------------------------
Date:      Thu, 18-Dec-86 14:12:59 EST
From:      braden@ISI.EDU (Bob Braden)
To:        mod.protocols.tcp-ip
Subject:   Re:  Need information on NFS

Gitchang,

The problem of Internet standard(s) for network file systems has been
receiving some attention, but probably less than it deserves.  Within the
formal Internet R&D structure, the issue falls within the scope of the
End-to-End Protocols task force, which has been considering what steps
need to be taken.

Sun's NFS is a "de facto standard" (I dislike that term, which appears to
be internally contradictory) for Unix systems.   Internet protocols 
must be designed to handle the entire spectrum of operating systems in
the world, not just Unix, and considerable work will be needed on NFS
to generalize it outside the Unix world.  It is unclear at this time
whether that generalization will result in anything useful to either 
Unix or any other systems.  In principle, there is a collaboration
between Sun and the End-to-End Protocols Taskforce to pursue this
question, but in practice little progress has been made.

If there were a set of people who could say, "we have some knowledge 
and/or experience in the network file system area, and we want to
devote some effort to the definition of an Internet standard
network file system", things would happen a lot faster (with or
without SUN's active participation).

Bob Braden
 

-----------[000090][next][prev][last][first]----------------------------------------------------
Date:      Thu, 18-Dec-86 17:56:39 EST
From:      robert@SPAM.ISTC.SRI.COM (Robert Allen)
To:        mod.protocols.tcp-ip
Subject:   Please put me on the tcp-ip mailing list.


	Robert Allen,

	robert@sri-spam.ARPA	OR
	robert@spam.istc.sri.com

-----------[000091][next][prev][last][first]----------------------------------------------------
Date:      Thu, 18-Dec-86 19:06:34 EST
From:      rick@SEISMO.CSS.GOV.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re:  Need information on NFS

Sun's NFS is NOT a good Unix Networked Filesystem. They broke
some Unix semantics in the name of generality to the non-unix world.
Sun claims it to be a non-unix specific design.

What do you see as the major problems? "Considerable work" doesn't sound
right.

It seems to run with MS-DOS and VMS as far as I know. So, it's not
too Unix-specific.

---rick

-----------[000092][next][prev][last][first]----------------------------------------------------
Date:      Thu, 18-Dec-86 20:03:02 EST
From:      ron@BRL.ARPA.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re:  Help using ICMP under Ultrix ?

In order to play around with ICMP messages on 4.2/Ultrix you need
to add the fixed-up RAW_ICMP protocol code; the user code is then a
fairly simple matter.  The code is in 4.3, or if you want you can
get new drivers for 4.2 from BRL-VGR via anonymous FTP and load the
file arch/ping.shar or arch/ping.tar.

-Ron

-----------[000093][next][prev][last][first]----------------------------------------------------
Date:      Fri, 19-Dec-86 00:12:38 EST
From:      mike@BRL.ARPA.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re:  Arpanet outage

Since nobody from DCA has spoken up yet, I'll add a few comments.
As the MILNET is being rebuilt using the "new" IMP packaging
(with link encryption capability), some (most?) of the data
circuits are being moved to DCTN (?Defense Computer Telecommunications
Network?), which is an ISDN-oriented base of somewhat switchable
circuit capabilities.  I believe DCTN has a phased implementation
plan, probably with automatic switching happening much later.

My general impression is that DCA and Army Signal Corps (now the
"Information Systems Command") both tend to do a good to excellent
job implementing systems designed around traditional concepts
such as point-to-point circuits, so DCTN is likely to be
a big win.  In addition, I suspect that routing of DCTN circuits
is likely to be carefully controlled to prevent excessive
bundling onto single transmission links, precisely for survivability.
(Blind faith here).

What we have seen of DCTN so far is a T1 line terminating in our
Post's Central Office at a D4 channel-bank, with a bunch (7?)
of 56k DDS links from there to the location of the MILNET IMP.
This gives much better signal quality than previous arrangements
where the DDS lines traveled over 5 miles of wire to the town CO.
It does not provide any additional reliability, as everything still
travels over the big black cable from our CO to the town CO.
This cable is especially attractive to heavy earthmoving equipment,
and is neutralized several times each year.  Presumably when the
T1 gets to the town CO, it terminates in something resembling
a circuit switch or patch panel or something (behind another D4
channel bank, of course), so that some alternate routing capability
exists at that point.  Of course, it might be that the T1 gets
zipped through a bunch of repeaters to some regional circuit
switch, extending our line of vulnerability a good long way.

Personally, I find the concept of layering a packet switching network
on top of a switchable circuit network rather amusing, but
quite realistic and practical.

More grist for the Rumor Mill, may it grind long and fine...
	Best,
	 -MIKE

-----------[000094][next][prev][last][first]----------------------------------------------------
Date:      Fri, 19-Dec-86 04:44:13 EST
From:      MKL@SRI-NIC.ARPA.UUCP
To:        mod.protocols.tcp-ip
Subject:   NFS comments

NFS is claimed to be a general network file system, but it really isn't.
As someone who is trying to implement an NFS server for a non-UNIX
system, I've got lots of problems.  Here are a few:

As far as I'm concerned, NFS has no security or authentication.  
If you want security you must specify exactly which hosts can
mount your filesystems and you must trust EVERY single user
on those hosts, since they can tell the server that they are
whoever they want to be.  This isn't really a problem with the
file protocol and may be considered a seperate issue,
but I wouldn't use a file protocol without security.

NFS is claimed to be idempotent, but it really isn't.  One example:
If you do a file rename and the request is retransmitted, you may
get back a success indication if the first request was received,
or you'll get back an error if it was the retransmission.

There are some fields that are very UNIX specific.  A userid field is
used to indicate user names for things like file authors.  This userid
is a number and it is assumed that there is a GLOBAL /etc/passwd file
so you can translate numbers to names.  This is completely bogus.  A
userid should be a string, not a number.  More could be said about the
groupid field.

NFS uses very large UDP packets to achieve acceptable performance.
This may indicate that the protocol is what really needs to be fixed.

There is no attempt at any ASCII conversion between normal systems
and UNIX.  This of course is the famous CRLF to newline problem which
makes sharing of text files between different systems almost useless.
Yes, you can write a program to do the conversions, but that ruins
the entire idea of file access since you must then do an entire file transfer.
Besides that, sharing binary files between different operating systems
is almost useless anyway.

From a document that lists the design goals of NFS, it appears that it
was only intended as a way to provide ACCESS to remote files.
It was not and is not a protocol to allow SHARING of the data
in those files between non-homogeneous systems.  For that reason
it is really quite useless as a way to share files between
different operating systems (and probably explains why the CRLF/newline
problem was left out).  It is too bad that they defined
a common data representation (XDR) to build the RPC protocol
with, but then left it out when dealing with file representation.

With that stated, I can probably say that NFS is a good protocol for
sharing files between homogeneous (UNIX-like) systems based on
non-homogeneous file servers.  This doesn't seem like a very
interesting or useful design goal though, and I still don't know
why I'm bothering to implement it.
-------

-----------[000095][next][prev][last][first]----------------------------------------------------
Date:      Fri, 19-Dec-86 10:27:45 EST
From:      Rudy.Nedved@H.CS.CMU.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re: Protocol Development on SUN 2 and 3 computers.

The ENET filter mechanism is nice but CMU is using the 
BSD sockets mechanism. We still have a few applications
using the old mechanism but under a compatibility flag
and plan to flush the stuff.

From the perspective of a network application hacker, the
only major thing that is missing from the BSD mechanism
is the ability to filter certain types of raw packets. It
should not be necessary for one to modify the operating
system in order to see all packets of a certain type, or
length or from a certain host or going to a certain host.

The minor problems we have been ignoring. I don't like the
fact that the operating system code design seems to believe
it knows when data should be flushed for an SMTP connection
since it knows how to do it for a telnet and ftp data
connection. Heck, I like to get my performance anywhere I
can and dang...computers should be smart for the novice but
should not take control away from the expert...I hate waiting
for kernel bug fixes when a little more application level
control could create a work-around for the application.

-Rudy

-----------[000096][next][prev][last][first]----------------------------------------------------
Date:      Fri, 19-Dec-86 13:07:45 EST
From:      schoff@csv.rpi.edu.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re:  NFS comments


	
	NFS is claimed to be a general network file system, but it really isn't.
	As someone who is trying to implement an NFS server for a non-UNIX
	system, I've got lots of problems.  Here are a few:
	
	There are some fields that are very UNIX specific.  A userid field is
	used to indicate user names for things like file authors.  This userid
	is a number and it is assumed that there is a GLOBAL /etc/passwd file
	so you can translate numbers to names.  This is completely bogus.  A
	userid should be a string, not a number.  More could be said about the
	groupid field.

I'd just like to comment on this aspect and let others comment on the
rest.  Back in 1982 when the new tacacs (TAC access) was being worked
there was some discussion on the "network id" (which broadly is what
NFS's ID is).  Independently of ANYTHING that SUN was tooling up for
it was determined that the "network id" would in fact be a number.  The
last I heard that was still the plan (and implementation).

Marty Schoffstall

-----------[000097][next][prev][last][first]----------------------------------------------------
Date:      Fri, 19-Dec-86 13:30:44 EST
From:      braden@ISI.EDU (Bob Braden)
To:        mod.protocols.tcp-ip
Subject:   Re:  Need information on NFS


	
	Sun's NFS is NOT a good Unix Networked Filesystem. They broke
	some Unix semantics in the name of generality to the non-unix world.
	Sun claims it to be a non-unix specific design.
	
Are we to read "not good" as "bad"?  If not, what do you mean by this
complaint?  If so, why should we standardize a protocol which is bad for
an important class of hosts?
	
	What do you see as the major problems? "Considerable work" doesn't sound
	right.

The problem most people have cited is NFS' authentication/permission
model, which is not only Unix-oriented but also perhaps inadequate.  This
is a hard and important issue.  In fact, it has been pointed out that the
assumption of globally-unique uid's and gid's is invalid in many sites even
among Unix systems.

Another problem is in the remote mount protocol.  SUN treats it as separate,
yet it seems that its functions ought to be included in any network file
system standard.

Another set of issues has to do with convincing ourselves that the
NFS primitives have sufficient generality to provide useful service with
the other file systems in the world besides Unix.  That probably means 
generalizing the existing primitives and adding a few more.  It also
means providing defined hooks for extensibility.

Finally, there is the issue of underlying layers.  NFS assumes two other
protocols, XDR and RPC.  It seems desirable to define NFS independently
of the lower layers, so different choices could be made in the future
for these protocols (after all, that is what layering is really for).
RPC, in particular, is highly doubtful in its present form as an Internet
standard, as its transport-protocol mechanism seems deficient.
	
	It seems to run with MS-DOS and VMS as far as I know. So, it's not
	too Unix-specific.
	
More information about the generality and completeness of these implementations
would be interesting and useful.  Could I do a remote mount from my SUN to
our VMS machine, for example, and access any VMS file?  Can the VMS machine
get at any Unix file (subject to permissions)?  How do permissions work?

Finally, I don't know how much time you have spent on protocol committees,
but every one of the existing Internet protocols represents several man-years
(or more) of concentrated effort, spread out over 2-5 years.  
	
Bob Braden
	

-----------[000098][next][prev][last][first]----------------------------------------------------
Date:      Fri, 19-Dec-86 13:32:03 EST
From:      hedrick@TOPAZ.RUTGERS.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re:  Need information on NFS

NFS has dealt with one set of machine dependencies:  The RPC mechanism
is well-defined, and works across machines.  So you can presumably
read directories and delete files.  But once you want to get or put
data, it assumes a Unix file model, in the sense that the file is
assumed to be flat (no way to say get the Nth record, or retrieve
based on a record key).  Furthermore, no translation of data is
defined.  This shows up in the PC implementation.  Unix uses LF as
a line terminator.  MS-DOS uses either CR or CR-LF [I don't remember
which].  So you can use your Unix directory to store your PC files,
but if you then go to edit them from Unix, the line terminators will
be odd.  There is of course a utility to change formats.  For many
purposes PC-NFS is just fine.  It lets you use Unix disks to augment
your PC's disks, and the formats aren't different enough to cause
real trouble.  But you'd like to see a real machine-independent
file system solve that problem.  Unfortunately, it isn't clear to
me how one would do it.  That's presumably why by and large it isn't
being done.  You can't have the server just change line terminators,
for several reasons:
  - binary files (e.g. executables) would likely get munged
  - you can't tell which files are text and which are binary
  - if you turn LF into CRLF, you change the number of bytes, and
	so random access by byte number isn't going to work

I think NFS is useful across a reasonable set of operating systems.
I'm glad Sun put the work they did into making it as
machine-independent as they did.  But I certainly don't think one
could claim it to be perfect.  

By the way, several notes have talked about NFS "violating Unix
semantics".  The most common example is that file locking doesn't work
across the network.  It does now, in Sun release 3.2.  I think it's
unfair to compare the first release of NFS, which we used in
production for 1.5 years across 2 different manufacturers' machines
(Sun and Pyramid), with System V release 3's network file system,
which still isn't very widely available.  (The other major omission is
that you can't use devices across the network.  Just disk files.  I'd
certainly like to see that fixed, but I can't say that in practice it
causes much of a problem.  I'll be interested to see whether that is
fixed in NFS by the time we have sVr3 in operation.)

-----------[000099][next][prev][last][first]----------------------------------------------------
Date:      Fri, 19-Dec-86 13:55:44 EST
From:      rick@SEISMO.CSS.GOV.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re:  Need information on NFS


	By the way, several notes have talked about NFS "violating Unix
	semantics".  The most common example is that file locking doesn't work
	across the network.  It does now, in Sun release 3.2. 

This is only partially true. The system V lockf() IS supported by
a lock daemon. However it does NOT support the 4.2bsd flock() file locking.

Now since everything we do here is 4.2bsd compatible, not system 5, I
maintain that file locking still doesn't work. We bought a 4.2bsd compatible
system from Sun. We don't care what System 5 features they add.

It still doesn't do forced append writes, nor permit accessing
devices.  That is something a good UNIX NFS would do.  Those are clear
violations of Unix semantics and not fixed (nor planned to be fixed, as
I understand it).

---rick

-----------[000100][next][prev][last][first]----------------------------------------------------
Date:      Fri, 19-Dec-86 15:47:01 EST
From:      grr@seismo.CSS.GOV@cbmvax.UUCP (George Robbins)
To:        mod.protocols.tcp-ip
Subject:   Heath WWV Clock now on Sale

I stopped by a local Heathkit outlet, and noticed that the Heath GC1000
WWV clock is on sale in the Holiday catalog.

The price is $249.95 for the assembled version with the RS232 interface
included, or $199.95 for the kit and $49.95 for the RS232 accessory.

When I asked the salesman how long the price was good for, he said probably
until the clock was closed out.  He didn't know for sure, or whether there
would be a follow-up product.

-----------[000101][next][prev][last][first]----------------------------------------------------
Date:      Fri, 19 Dec 86 21:23:40 pst
From:      John B. Nagle <jbn@glacier.stanford.edu>
To:        mod.protocols.tcp-ip
Subject:   Re: Maintaining Statistics for TCP/IP Implementations
     Much of what I learned about congestion in the Internet I learned by
instrumenting a TCP implementation.  The information that you need is
not necessarily the information that a typical implementation keeps.
Yet as it turns out, collecting this information is quite inexpensive.
Management of the exceptional cases is the crucial issue.

     During the life of a TCP connection, it is useful to maintain some
event counts, and at the conclusion of the connection, it is useful to
generate a log entry of some form, at least for connections that meet
some criteria.  

     When a packet is received, there are several possibilities as to
its disposition.  The most useful (not, unfortunately, always the most
common) case is that it contains new and acceptable data, an ACK that
acknowledges previously unacknowledged data, or a window update that
advances the window.  This case must of course be handled efficiently.
Packets which change the state of the connection are also useful, but
efficiency is less of an issue.  But packets which do none of these
things are redundant; they represent an error somewhere in the system.
It is immensely useful to count the useful packets over the life of a
connection.  My criterion was that if less than 95% of the packets
received over the life of a connection were useful, (allowing for at
least 5 non-useful packets on short sessions to handle startup issues),
then a log entry should be generated to indicate trouble. 
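The accounting above is tiny.  A sketch (in Python, illustrative names, not
the actual VAX implementation):

```python
# Sketch of per-connection accounting (illustrative names): count
# useful vs. total packets, and at connection close flag the
# connection if under 95% were useful, allowing at least 5 non-useful
# packets so short sessions aren't flagged for startup noise.
class ConnStats:
    def __init__(self):
        self.total = 0
        self.useful = 0

    def on_packet(self, useful: bool) -> None:
        self.total += 1
        if useful:
            self.useful += 1

    def should_log(self) -> bool:
        non_useful = self.total - self.useful
        if non_useful <= 5:              # startup allowance
            return False
        return self.useful < 0.95 * self.total
```

With this rule, a connection with 90 useful packets out of 100 gets logged;
one with 98 out of 100 does not.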

     Reading such a log is an edifying experience.  The most notable fact
about such a log is that certain machines are represented all out of 
proportion to the amount of traffic they generate.  One of course logs
the identities of the hosts involved in the connections.  A log entry here
corresponds to "dropping a trouble ticket" in a telephone central office;
it indicates something to be fixed.  Enough said.

     One also wants to keep a tally of retransmission attempts; again, if
the number of retransmitted packets is large over the life of the connection,
something is wrong and this should be noted.  Of course, if a connection
closes abnormally, one logs that fact for later analysis.

     It is also useful to log rejected packets.  Find all those places
in your TCP where you decide to drop a packet because it is "bad", and
make them calls to a routine that logs the packet with an error code.
One turns up all sorts of dirty laundry that way.

     The number of ICMP Source Quenches received is also quite useful;
again, large values compared to the volume of data traffic are significant.

     When I operated a VAX with such logging two years ago, there would
be five or six connections logged as bad when the network was operating
properly; there might be hundreds when something was wrong.  That's how
I managed to make a large network based on slow links work properly.

     It is worth thinking about how one might report such data in a standard
way to a network monitor node.  Something that generated one datagram per
"bad" TCP connection might be quite useful; some would of course get lost
but serialization would allow the network monitor to detect this, and
statistical techniques could be used to compensate for the lost data.
You do need to log a measure of the total data transmitted in each
direction on the connection, and log entries should also contain cumulative
information about the total amount of data and total number of connections
so that statistical computations can be made.

     One needs this information to manage a network.  With it, one can 
manage the network and make it perform well.  Without it, one can just 
grumble and make excuses.

				John Nagle
-----------[000102][next][prev][last][first]----------------------------------------------------
Date:      Fri, 19-Dec-86 18:44:46 EST
From:      mrose@NRTC-GREMLIN.ARPA.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re: Was: Protocol Development on SUN 2 and 3 computers.


    [ My apologies to everyone for continuing this flame, but I could
    not let it go past untouched... ]

>	 If your goal is general protocol/network development or
>	 cross-system integration, the Symbolics Lisp Machine's "Generic
>	 Network System" is probably 10x more powerful.  Use UN*X to
>	 develop UN*X software, particularly when there are no deep
>	 design issues involved.  Use something far better if there are
>	 any hard or unresolved issues to be solved.  I could easily
>	 justify my 10x claim, but this conversation wasn't about that.

    Use Lisp environments to develop Lisp environment software.  Don't
    preach to others about which system is more powerful.  While I
    admire the tight integration between the networking software and the
    rest of the lisp environment, I know a lot of people who wouldn't
    consider even touching a lisp environment to do protocol
    development.  It is a matter of personal taste and personal
    productivity.  There is no way you could justify your 10x claim in
    my environment (and yes, we do have people building distributed
    expert-systems on top of lisp environments here, so we have detailed
    experience in building things on top of the environment you
    mention).  

    While the original speaker may have been out of line in his claim
    that "we believe SunOS offers most of the facilities you need", your
    response was, owing to its language and posturing, even more out of
    line.  

    Please direct further outrages to me personally, rather than burden
    the list...  

Thanks,

/mtr

-----------[000103][next][prev][last][first]----------------------------------------------------
Date:      Fri, 19-Dec-86 21:43:15 EST
From:      bzs@BU-CS.BU.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   NFS


It seems unfair to cast aspersions at those who have pioneered Network
File Systems as if their implementations were somehow finished or
immutable. Praise should be given for how far the publication of their
efforts has advanced our thinking about the issues (and for the
credibility it lends to the idea that they are worth thinking about).

I can think of another major networking protocol which prides itself
on having been put into practice early in its design cycle and
corrected where need be (sometimes radically) based upon concrete use
rather than paper committee meetings. The name escapes me however.

Issues like "file organization" between heterogeneous systems have
been raised for years. I know of no protocol which attempts to solve
this in general (although a few special cases -do- go a long way.)

The last time someone raised this issue in my office, I asked him whether
the problem had been solved for magnetic tapes on his system yet.
If so, I proposed that I could adapt that solution to FTP (the case
in point at the time) easily enough. Needless to say he walked away
in a huff.

Some of these issues are HARD, very hard! I wouldn't go so far as to
say insoluble (I mean, people do seem to solve them manually) but I
think this difficulty should be considered before saying that XYZ does
not solve this. Proposals for solutions would be most welcome.

I think the problem is that people either want perfect and general
solutions or they throw up their hands entirely. My suspicion is that
the best solution will be the ability to code modules at an
application level to handle the various permutations of file access
methods between systems and let the libraries blossom out of the user
community. Extensibility seems to be the key need here. And practice.

As a more concrete example, why shouldn't FTP allow me to specify
input and output filter programs on both ends, provided as a library
by the systems? The same sort of thing should work for Network File
Systems, although the ability to type files and have these "daemons"
invoked automatically would probably be the right approach. Given a
few years of that I suspect the "standards" would begin to reveal
themselves.
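The filter idea can be sketched in a few lines (Python, purely
illustrative; "transfer_through_filter" is my name, not part of any
existing FTP):

```python
# Purely illustrative shape of the filter idea: pipe the transferred
# byte stream through a user-supplied conversion program on each end,
# chosen from a library provided by the systems.
import subprocess

def transfer_through_filter(data: bytes, filter_cmd: list) -> bytes:
    result = subprocess.run(filter_cmd, input=data,
                            capture_output=True, check=True)
    return result.stdout
```

A Unix-to-PC text filter might then be something as simple as a sed or tr
invocation, or a site-provided program from the proposed library.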

	-Barry Shein, Boston University

-----------[000105][next][prev][last][first]----------------------------------------------------
Date:      Sat, 20-Dec-86 02:57:26 EST
From:      tim@lll-crg.ARPA@hoptoad.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re:  Need information on NFS

I really question whether RPC can be considered OS independent.  It is way
too big for microcomputers.  An acquaintance involved with NFS (who has
never worked for Sun) gave me an estimate recently that a client side alone
would take 100K of machine code on a Mac.

I have enormous respect for Sun as both a hardware and a software developer.
But the fact is that it just doesn't make sense to develop a supposedly
machine-independent and OS-independent network file system on a single-OS
network with very powerful machines.  This has left them with an overly
elaborate protocol which is not suited to microcomputers.  PC-NFS is a
client side only, and a hypothetical Mac NFS would be the same way.  This
gives even less functionality than good old FTP; you can't even transfer a
file from a PC to a PC with it!  Client-only implementations fall far short
of the goals of the system.

To be truly OS-independent and machine-independent, NFS would have to be
redesigned from the ground up, and simultaneously developed on more than one
machine and OS.
-- 
Tim Maroney, Electronic Village Idiot
{ihnp4,sun,well,ptsfa,lll-crg,frog}!hoptoad!tim (uucp)
hoptoad!tim@lll-crg (arpa)

-----------[000106][next][prev][last][first]----------------------------------------------------
Date:      20 Dec 1986 03:39-EST
From:      CERF@A.ISI.EDU
To:        mike@BRL.ARPA
Cc:        Tcp-Ip@SRI-NIC.ARPA, pogran@CCQ.BBN.COM, malis@CCS.BBN.COM, hedrick@TOPAZ.RUTGERS.EDU
Subject:   Re:  Arpanet outage
Mike,

First, doing packet switching on top of circuit switching is actually
not a bad idea and I know of at least one packet vendor whose
switches internally have a built-in circuit switch to allow the
pt-pt topology of the network to be altered under the control of the
packet switches(!).

About the DCTN, if it is ISDN-like, won't it be entirely digital?
If so, I assume the only reason for the channel bank is to go from
D to A to the analog telephone and maybe on-post analog (?) PBX?

Doesn't the post have alternative, digital radio (?) links to, say,
the townCO or to the long-lines termination, in addition to the
buried cable? It doesn't seem prudent to put all the channels
on the buried cable...

Vint
-----------[000107][next][prev][last][first]----------------------------------------------------
Date:      Sat, 20-Dec-86 04:48:57 EST
From:      mrose@NRTC-GREMLIN.ARPA (Marshall Rose)
To:        mod.protocols.tcp-ip
Subject:   Re: NFS comments

In the ISO world, you might consider doing FTAM instead.  I think it meets
all of your objections, with the notable exception that it's going to be
a while before someone writes an ftam that really performs well.  You can
get concurrency, commitment and recovery (CCR) with FTAM for all the usual
updating and locking type of problems.  Also, owing to its size, you probably
would need two protocols on a diskless workstation: a small netload protocol
(MAP has one), and then FTAM proper.  For those of you interested in looking
at FTAM, I suggest you get a copy of parts 1 and 2 of the FTAM draft
international standard:

	ISO DIS 8571/1
	File Transfer, Access and Management (FTAM) Part 1: General Description

	ISO DIS 8571/2
	File Transfer, Access and Management (FTAM) Part 2: Virtual Filestore

As mentioned in one of the RFCs (I can't remember which), you can purchase
these from Omnicom, 703/281-1135.  Part 1 will cost you $28, part 2 will cost
you $36.

/mtr

-----------[000108][next][prev][last][first]----------------------------------------------------
Date:      Sat 20 Dec 86 10:58:38-PST
From:      David L. Kashtan <KASHTAN@SRI-IU.ARPA>
To:        braden@VENERA.ISI.EDU
Cc:        tcp-ip@SRI-NIC.ARPA
Subject:   Re:  Need information on NFS
	>More information about the generality and completeness of these implementations
	>would be interesting and useful.  Could I do a remote mount from my SUN to
	>our VMS machine, for example, and access any VMS file?  Can the VMS machine
	>get at any Unix file (subject to permissions)?  How do permissions work?

I am the person who did the VMS NFS implementation, so I think I am
reasonably qualified to comment on NFS as it relates to non-homogeneous
O/S environments:
  The VMS NFS implementation is a server-only NFS implementation.
  It uses the SUN User-Level UNIX NFS implementation and the 4.3BSD-
  based Eunice (in order to provide the necessary UNIX file-system
  semantics).  Without Eunice this would have been a very major
  undertaking.  I would most likely have had to re-implement a pretty
  good sized chunk of the Eunice file handling system in order to get
  NFS to work on VMS.  So, in reality, the way to get an NFS up on VMS
  is to get VMS to pretend that it is UNIX.  This is hardly something one
  would be happy about in a standard for non-homogeneous O/S environments.
  Global UIDs are dealt with by having an "nfs_passwd" file on the VMS
  machine which contains the standard UNIX passwd data for the Global
  UNIX world.  Incoming UIDs are mapped to usernames using this information
  and then the usernames are used to get the Eunice (i.e. VMS local) UIDs.
  The reverse is used to generate outgoing UIDs.

  You can indeed do a remote mount of a VMS filesystem from your SUN
  workstation and access any VMS file.  The performance is about the same
  as a UNIX machine running the User-Level NFS server (it works but it is
  not adequate as a serious file server).  It is quite unrealistic to expect
  a client VMS NFS (the required file system semantics are just not there!).
  About the best you could do is a client VMS NFS for UNIX programs running
  under Eunice (which expect UNIX file system semantics).

  It is my feeling that the Lisp Machine NFILE (and its predecessor QFILE)
  remote file access protocols went much further in dealing with file access
  for MANY different types of operating systems and I am very disappointed
  that nobody even looked at them as examples when thinking about NFS.
David
-------
-----------[000109][next][prev][last][first]----------------------------------------------------
Date:      Sat, 20-Dec-86 11:25:21 EST
From:      bzs@BU-CS.BU.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   NFS comments


>In the ISO world, you might consider doing FTAM instead.  I think it meets
>all of your objections, with the notable exception that it's going to be
>a while before someone writes an ftam that really performs well...
>For those of you interested in looking
>at FTAM, I suggest you get a copy of parts 1 and 2 of the FTAM draft
>international standard:

Are there any working FTAM implementations to look at, performant or
not?

	-Barry Shein, Boston University

-----------[000110][next][prev][last][first]----------------------------------------------------
Date:      Sat, 20-Dec-86 11:38:28 EST
From:      bzs@BU-CS.BU.EDU (Barry Shein)
To:        mod.protocols.tcp-ip
Subject:   Need information on NFS


>I really question whether RPC can be considered OS independent.  It is way
>too big for microcomputers.  An acquaintance involved with NFS (who has
>never worked for Sun) gave me an estimate recently that a client side alone
>would take 100K of machine code on a Mac.

Two years ago most higher-level languages other than BASIC were too
big for micros. I think this problem will vanish shortly by itself.
100K? That doesn't sound very big. A $1K Atari ST has 1MB of memory.

	-Barry Shein, Boston University

-----------[000111][next][prev][last][first]----------------------------------------------------
Date:      Sat, 20-Dec-86 21:35:11 EST
From:      mrose@NRTC-GREMLIN.ARPA.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re: NFS comments


    An excellent question.  As with most "international standards" you
    have to qualify which point in the life of the standard you're
    talking about:  

	WD - working draft
	DP - draft proposal
	DIS - draft international standard
	IS - international standard

    There are, to my knowledge, no implementations of the FTAM DIS, as
    it was only recently released (August, 1986).  I expect this to
    change in about six months.

    There are however some implementations of the FTAM DP, I believe
    that DEC has one, that Concord Data Systems has one, and probably
    about five other MAP/TOP vendors (a few even claim to have it
    running on the PC).  I'll ask my local MAP/TOP guru here which
    implementations are available and I'll get back to you.  

    Organizations like the NBS and COS (Corporation for Open Systems) in
    the US and SPAG and ECMA in Europe have done a fine job of
    specifying the "agreed subset" of FTAM which should be implemented.
    This makes the harmonization problem (getting different vendors'
    implementations to work together) much easier.  However, the FTAM
    DIS and the FTAM DP are NOT, repeat NOT, compatible (they even use
    different underlying services), so it's not clear how much of a DP
    implementation can be used when building a DIS implementation.

    To comment a bit on some related mail: if I remember the FTAM spec
    correctly, you can do things like forced-append writes and
    record-level access.  I don't think you'll see the first generation
    of FTAM DIS implementations do this, but these features are built
    into the FTAM.

/mtr

-----------[000112][next][prev][last][first]----------------------------------------------------
Date:      Sat, 20-Dec-86 21:43:28 EST
From:      hedrick@TOPAZ.RUTGERS.EDU (Charles Hedrick)
To:        mod.protocols.tcp-ip
Subject:   Re: NFS comments

Do you know how random access is done in FTAM?  The big problem seems
to be specifying locations in the file.  Unix does this by byte
number.  That can't work for ISAM files.  But if you do it by
record number, you are going to have to count records from the
beginning of the file to make that work on Unix.  So at first glance
it would seem that system-independent random access is impossible
unless you force people to conform to one file model.  The folks at
Johns Hopkins hospital have a very large multi-vendor distributed
database system.  They decided to forget about network file systems
and did it directly on top of RPC.  It seems to have worked very well
for them.  The idea is that it isn't all that useful to do
cross-system random access anyway.  Let the database be managed by a
local process, and have people on other machines make somewhat
high-level requests via RPC.  They made RPC work on an incredible
variety of machines, including ones that only understood BASIC, and
only talked on serial lines.
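The byte-number vs. record-number mismatch is easy to make concrete (a
Python sketch, purely illustrative):

```python
# Illustrative only: on a flat Unix text file with variable-length
# lines, there is no arithmetic mapping from record number to byte
# offset, so finding record n means scanning from the beginning.
def seek_record(data: bytes, n: int) -> int:
    """Return the byte offset at which record n (0-based line) starts."""
    offset = 0
    for _ in range(n):
        offset = data.index(b"\n", offset) + 1   # O(file size), not O(1)
    return offset

text = b"alpha\nbeta\ngamma\n"
assert seek_record(text, 2) == 11   # "gamma" follows 11 bytes of lines
```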

If you restrict your network file system to sequential I/O, and if you
are willing to specify whether you want a file treated as text or
binary, then it is possible to do things across a variety of systems.
The Xerox D-machines implement a transparent network file system using
normal Internet FTP under these constraints.  NFS didn't do this
because there is no obvious way to tell in Unix whether a file is
binary or text.  There would seem to be basic design issues here, and
I am sceptical about claims that FTAM somehow gets around them.  If
you think of the network file system as something external, i.e. if
you don't store all your system files on it, but use it only for
things that the user knows are special, then of course all these
problems go away.  You can demand some clue as to whether the file is
binary or text, and you can impose restrictions on what operations are
allowed (e.g. no random access or locations specified only by a "magic
cookie").  But NFS was designed to allow you to completely replace
your disk drive with it, in which case such restrictions are not
acceptable.  I'm open to the possibility that one needs two different
kinds of network file system, one which is completely transparent in
terms of the host system's semantics, and the other which makes
whatever compromises are needed to support every possible operating
system.  NFS is a compromise between these two, and like all
compromises runs the danger of satisfying no one.

Can anybody tell me how FTAM handles these issues?  I don't need the
whole standard, just a brief description of its file model and the
kinds of operations allowed.

-----------[000113][next][prev][last][first]----------------------------------------------------
Date:      Sun, 21-Dec-86 04:34:39 EST
From:      tim@lll-crg.ARPA@hoptoad.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re: Need information on NFS

Unfortunately, the current crop of 512K and 640K micros are not going to
magically evaporate when the technology improves.  IBM offers no upgrades,
and Apple's upgrades are priced too high for large nets to adopt them on a
general basis.  We still have VAXen even though you can buy a better machine for
under $30K these days, after all.

Also remember that the 100K estimate was for a client RPC only.  I don't
know how much an NFS client itself would add, nor an RPC server and NFS
server.  A complete NFS implementation would surely strain even a one
megabyte Mac+ or (hypothetical) MS/DOS version 5 machine.  For instance, the
Mac Programmer's Workshop (a Greenhills C compiler with a scaled-down but
still powerful UNIX subset) requires over 800K.  This means you couldn't
have NFS installed during program development.
-- 
Tim Maroney, Electronic Village Idiot
{ihnp4,sun,well,ptsfa,lll-crg,frog}!hoptoad!tim (uucp)
hoptoad!tim@lll-crg (arpa)

-----------[000114][next][prev][last][first]----------------------------------------------------
Date:      Sun, 21 Dec 86 06:07:12 EST
From:      schoff@csv.rpi.edu (Martin Schoffstall)
To:        hoptoad!tim@lll-crg.arpa
Cc:        tcp-ip@sri-nic.arpa
Subject:   Re:  Need information on NFS
	
>I really question whether RPC can be considered OS independent.  It is way
>too big for microcomputers.  An acquaintance involved with NFS (who has
>never worked for Sun) gave me an estimate recently that a client side alone
>would take 100K of machine code on a Mac.
	
Sorry: the Department of Computer Science at RPI's initial version of
Sun RPC (which included UDP/IP and a device driver) all fit in less
than 64K.  Besides, it seems pretty hard to buy a machine with less
than 512K these days (except maybe a CoCo).

Martin Schoffstall

-----------[000115][next][prev][last][first]----------------------------------------------------
Date:      Sun, 21-Dec-86 14:11:57 EST
From:      weltyc%cieunix@CSV.RPI.EDU (Christopher A. Welty)
To:        mod.protocols.tcp-ip
Subject:   Need information on NFS


David:
	You say you did the VMS NFS (for TWG I assume).  Does this
work with 4.3 UNIX NFS?  (I think Mt. Xinu makes it, but if there are
others for 4.3 that it works for that's fine).  When I spoke to sales
at TWG they said they had no idea if it worked with anything but SUNs.

				-Chris
				 weltyc@csv.rpi.edu

-----------[000116][next][prev][last][first]----------------------------------------------------
Date:      Sun, 21-Dec-86 18:42:01 EST
From:      mrose@NRTC-GREMLIN.ARPA (Marshall Rose)
To:        mod.protocols.tcp-ip
Subject:   Re: NFS comments


    Well, let me try to answer that.  I've only read the spec twice and
    don't have it here in front of me, but here goes...

    FTAM is based on the notion of a "virtual filestore" (sound
    familiar, huh?)  The filestore consists of zero or more files, each
    with an unambiguous name.  Each file has a set of attributes (of
    which filename is one) and some contents.  The attributes have the
    usual permission stuff along with a description of the kind of
    information the file contains (e.g., iso646 ascii, octets, bits),
    and a description of the access discipline (my term) for the file.
    The contents are a tree.  Each node in the tree contains

	- node attributes:
		node name
		length of arc back to parent
		a flag saying whether data is attached to this node
	- attached data (if any)
	- a list of children

    Now, there are a couple of ways that you can implement a UNIX-style
    regular file.  The simplest is to have just the root node with the
    entire file contents as the attached data (as an octet string) and
    no children.  In this case, the access discipline is rather
    inconsequential, since you can only get at one data element at a
    time and there is only one to choose from.  

    Alternately, for a file like /etc/passwd, you might have a root node
    with no data, and a child for each line in the file.  The access
    discipline would allow you to specify any child element you want
    when you wanted to read or write.
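    A literal rendering of that tree model (in Python; my naming, not
    taken from the DIS text):

```python
# My naming, not taken from the DIS text: a node of the FTAM virtual
# filestore as described above -- a name, optional attached data, and
# a list of children.
class FtamNode:
    def __init__(self, name, data=None, children=None):
        self.name = name
        self.data = data                  # attached data, if any
        self.children = children or []

# Simplest UNIX-style regular file: root node holds the whole contents.
flat = FtamNode("root", data=b"entire file contents as an octet string")

# /etc/passwd-style: data-less root, one addressable child per line.
passwd = FtamNode("root", children=[
    FtamNode("line-1", data=b"root:*:0:0:/:/bin/sh"),
    FtamNode("line-2", data=b"daemon:*:1:1:/:/bin/sh"),
])
```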

    There are in the spec, several document types and access disciplines
    listed with pre-defined meanings.  Others can be chosen via
    "bi-lateral" agreement.  In the NBS OSI Implementor's Workshop
    Agreements, they have defined a new document type called "directory"
    in which the nodes are, you guessed it, file names.  Assuming you
    had an FTAM server which supported that document type, an FTAM
    client could do the DIR and NLST commands that we've all become so
    attached to.  

    So to answer your question:  FTAM imposes on everyone the same
    fairly general file model.  Each FTAM server consists of a protocol
    engine for FTAM and a localFS-virtualFS engine.  For UNIX-like
    systems, going between the two is rather restricted unless you want
    to put a lot of smarts in your code (at which true UNIX-ites would
    gasp; I'm sure people are reeling at my /etc/passwd example!).  In
    this case, the question of "is it ascii or is it octet-aligned or is
    it bit-aligned" is something the localFS-virtualFS engine for UNIX
    would have to answer.  Now of course, if you had something like
    DEC's RMS in your filesystem, FTAM makes more sense as there is a
    closer match between the local and virtual filestores.  

    It is important in all of this however, to remember what OSI is:  a
    method for tying together dissimilar systems.  This is done by
    choosing a sufficiently general model which is (hopefully) a
    superset of all existing systems, and then letting people code
    local2virtual engines at the network boundary.  

    With respect to RPC, there are such notions in OSI.  My favorite is
    called ROS (Remote Operations Service) which is a fairly simple
    invoke-result, invoke-error protocol with provisions to support
    "once-only" execution of operations.  FTAM is not meant as a
    competitor to ROS (and quite frankly, had *I* designed FTAM, I would
    have put FTAM on top of ROS), but is trying to solve a different
    problem, which perhaps has overlap for certain applications.  

/mtr

-----------[000117][next][prev][last][first]----------------------------------------------------
Date:      Sun, 21-Dec-86 20:57:12 EST
From:      tim@lll-crg.ARPA@hoptoad.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re:  Need information on NFS

So if a complete RPC and NFS can be fit into 64K, why is PC-NFS client only?
-- 
Tim Maroney, Electronic Village Idiot
{ihnp4,sun,well,ptsfa,lll-crg,frog}!hoptoad!tim (uucp)
hoptoad!tim@lll-crg (arpa)

-----------[000118][next][prev][last][first]----------------------------------------------------
Date:      Mon, 22-Dec-86 04:43:01 EST
From:      grr@seismo.CSS.GOV@cbmvax.UUCP (George Robbins)
To:        mod.protocols.tcp-ip
Subject:   Re:  Need information on NFS

In article <8612200757.AA28013@hoptoad.uucp> hoptoad!tim (Tim Maroney) writes:
>I really question whether RPC can be considered OS independent.  It is way
>too big for microcomputers.  An acquaintance involved with NFS (who has
>never worked for Sun) gave me an estimate recently that a client side alone
>-- 
>Tim Maroney, Electronic Village Idiot

Times change - there is already Alpha test version of NFS for the Amiga by
an outfit called Ameristar Technologies.  You can assume that in a year or
so, when the 1MB chips reach price parity, you're going to see a bunch of
4-16MB 'micro computers'...

-----------[000119][next][prev][last][first]----------------------------------------------------
Date:      22 Dec 1986 07:28-CST
From:      SNELSON@STL-HOST1.ARPA
To:        mike@BRL.ARPA
Cc:        Tcp-Ip@SRI-NIC.ARPA, pogran@CCQ.BBN.COM malis@CCS.BBN.COM, hedrick@TOPAZ.RUTGERS.EDU
Subject:   Re:  Arpanet outage
MIKE:
DCTN IS THE DEFENSE COMMERCIAL TELECOMMUNICATIONS NETWORK. ITS FORERUNNER
WAS THE COMMERCIAL SATELLITE COMMUNICATIONS NETWORK. IT WAS BASED UPON
AN INITIAL 11 NUMBER 5ESS SWITCHES TO BE CONSTRUCTED AT VARIOUS MILITARY
INSTALLATIONS IN THE US. LETTERKENNY ARMY DEPOT, REDSTONE ARSENAL, SCOTT AFB,
ETC., WERE SOME OF THE SITES. AUTOVON, FTS, DDN LINKS AND SO ON WERE/ARE TO
USE IT AS THEIR BACKBONE. DDN LINKS ARE TO USE LANDLINE BECAUSE OF THE
PROPAGATION DELAY PROBLEM. YOU SHOULD HAVE NOTICED A MARKED IMPROVEMENT ON
SOME OF YOUR AUTOVON CALLS STARTING SOMETIME IN APRIL, WHEN THEY WERE ROUTED
VIA SATELLITE. THERE IS SOME KIND OF DELAY COMPENSATOR AND I CAN'T TELL IF
I AM ON A BIRD OR NOT. HAVEN'T TRIED A TERMINAL TO A TAC TO SEE IF IT
MAKES ANY DIFFERENCE OR NOT. IF I DO I WILL LET YOU KNOW.

STEVE

-----------[000120][next][prev][last][first]----------------------------------------------------
Date:      Mon, 22-Dec-86 08:02:59 EST
From:      schoff@CSV.RPI.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re:  Need information on NFS

I didn't say RPC/NFS; I said UDP/IP/RPC/device-driver fits in 64K.

I think the reason for PC-NFS being client only is a matter of
marketing, and getting a product to market in a certain amount of
time.

marty

-----------[000121][next][prev][last][first]----------------------------------------------------
Date:      Mon, 22-Dec-86 10:04:13 EST
From:      dorl@seismo.CSS.GOV@uwmacc.UUCP
To:        mod.protocols.tcp-ip
Subject:   Submission for mod-protocols-tcp-ip

Path: uwmacc!dorl
From: dorl@uwmacc.UUCP (Michael Dorl)
Newsgroups: mod.protocols.tcp-ip,comp.dcom.lans
Subject: Ethernet loading tools
Message-ID: <756@uwmacc.UUCP>
Date: 22 Dec 86 15:04:12 GMT
Organization: UWisconsin-Madison Academic Comp Center
Lines: 7

I am looking for some tools to load an Ethernet.  I am interested in
software for use with BSD Unix or Wollongong WIN/VX that puts a heavy
load on an Ethernet.  I want to be able to put a known load on the network
and hopefully also gather some statistics such as collisions, late collisions,
byte counts, packet counts (good and bad), etc.  I'd be satisfied to find
client programs that generate traffic with specified characteristics to use
with the ECHO and DISCARD servers.

-----------[000122][next][prev][last][first]----------------------------------------------------
Date:      Mon, 22-Dec-86 10:08:58 EST
From:      dorl@seismo.CSS.GOV@uwmacc.UUCP
To:        mod.protocols.tcp-ip
Subject:   Submission for mod-protocols-tcp-ip

Path: uwmacc!dorl
From: dorl@uwmacc.UUCP (Michael Dorl)
Newsgroups: mod.protocols.tcp-ip,comp.bugs.4bsd
Subject: Telnet - Local flow control
Message-ID: <757@uwmacc.UUCP>
Date: 22 Dec 86 15:08:57 GMT
Organization: UWisconsin-Madison Academic Comp Center
Lines: 2
Keywords: telnet, flow control

I am looking for patches to BSD 4.3 telnet that allow local flow control.
Ideally, telnet should allow the user to specify this as an option.

-----------[000123][next][prev][last][first]----------------------------------------------------
Date:      Mon, 22-Dec-86 10:53:28 EST
From:      aecheniq@VAX.BBN.COM (Andres Echenique)
To:        mod.protocols.tcp-ip
Subject:   TCP/IP implementations for XENIX/VENIX on a PC-AT



I am interested in learning about people's experiences with using either the
Wollongong TCP/IP software or the Network Research Corp's Fusion product for
TCP/IP on an IBM PC-AT.  Experiences with other networking software for the
AT/XENIX environment are also welcome.  In particular, comments on these
products supporting ports of UNIX network applications to the AT/XENIX
environment would be appreciated.

Thanks.

--Andres.

-----------[000125][next][prev][last][first]----------------------------------------------------
Date:      Mon, 22-Dec-86 12:20:27 EST
From:      braden@ISI.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re: NFS comments

Marshall,

How can I learn what the "agreed subset" of FTAM is??

Bob Braden

-----------[000126][next][prev][last][first]----------------------------------------------------
Date:      Mon, 22-Dec-86 12:39:23 EST
From:      ROMKEY@XX.LCS.MIT.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re:  Need information on NFS

> So if a complete RPC and NFS can be fit into 64K, why is PC-NFS client only?

Maybe SUN isn't interested in selling IBM PC's as file servers...

I suspect it would take over 64K (but under 64K code + 64K data) to do
both server and client NFS for the PC.
				- john

-------

-----------[000127][next][prev][last][first]----------------------------------------------------
Date:      Mon, 22-Dec-86 13:25:39 EST
From:      geof@decwrl.DEC.COM@apolling.UUCP (Geof Cooper)
To:        mod.protocols.tcp-ip
Subject:   Re: Re:  Need information on NFS


 >      So if a complete RPC and NFS can be fit into 64K, why is PC-NFS client only?
 >      -- 
 >      Tim Maroney, Electronic Village Idiot
 >      {ihnp4,sun,well,ptsfa,lll-crg,frog}!hoptoad!tim (uucp)
 >      hoptoad!tim@lll-crg (arpa)
 >      

-----------[000128][next][prev][last][first]----------------------------------------------------
Date:      Mon, 22-Dec-86 14:16:12 EST
From:      lars@ACC-SB-UNIX.ARPA (Lars Poulsen)
To:        mod.protocols.tcp-ip
Subject:   "Reverse subnetting"

One of our customers is trying to set up a TCP-IP network on
a private X.25 backbone. Their network includes both Ultrix
and SUN hosts. Apparently, the SUN implementation treats the
X.25 link as a collection of point-to-point links, and they
have talked our customer into configuring a class C network
number for each virtual circuit. Now they want to merge all
of these class C nets into one virtual class B network by
a sort of reverse subnetting. Has anybody ever done such a
thing (reverse "sub"netting) ?
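The merge they want amounts to finding one mask that covers all the
class C numbers at once.  A toy sketch of that computation (hypothetical
Python; the addresses are made up):

```python
# Illustrative sketch of "reverse subnetting" (supernetting): find the
# longest common prefix covering a set of class C network numbers, to
# see whether they fit under one class-B-sized mask.

def to_int(dotted):
    a, b, c, d = (int(x) for x in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

def common_prefix_len(nets):
    ints = [to_int(n) for n in nets]
    bits = 0
    while bits < 32:
        shift = 31 - bits
        # stop at the first bit position where the networks disagree
        if len({i >> shift for i in ints}) > 1:
            break
        bits += 1
    return bits

# Networks sharing a 16-bit prefix can be treated as one
# class-B-sized net; a longer prefix means a smaller aggregate.
```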
				/ Lars Poulsen
				  Advanced Computer Communications

-----------[000129][next][prev][last][first]----------------------------------------------------
Date:      Mon, 22-Dec-86 14:28:07 EST
From:      geof@decwrl.DEC.COM@imagen.UUCP (Geof Cooper)
To:        mod.protocols.tcp-ip
Subject:   Remote file systems

I would venture to say that the problem of <<transmitting>> file semantics is
solvable, and even reasonably well understood.  It is not dissimilar
to the problem of transmitting abstract data types, which is well described
in Maurice Herlihy's Master's thesis ("Transforming Abstract Values
in Messages", S.M. thesis, MIT, 1980 -- there was a paper about it, too,
but I don't happen to have a reference to it).  The fundamental
idea is that you can solve the N^2 problem of translating file
(or terminal, or abstract) data types between N different machines either
by brute force or by standardizing on one "transmissible" data type
for purposes of transmission (e.g., ASCII, various FTP transfer modes,
"binary" file formats (such as Interpress, DDL, Impress), bigendian
number semantics, IEEE floating point format, TAR tapes, punch cards).
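The arithmetic behind the N^2 claim is worth making explicit
(illustrative Python, my own framing):

```python
# With n machine-local formats, direct pairwise translation needs a
# converter for every ordered pair of formats, while standardizing on
# one "transmissible" type needs only an encoder and a decoder per
# format.  Counting only -- no actual converters here.

def pairwise_converters(n):
    return n * (n - 1)

def via_common_type(n):
    return 2 * n
```

The common type breaks even at three systems and wins for any more:
ten systems need 90 pairwise converters but only 20 through a shared
transmissible type.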

I'd like to tag the "real problem" as the "interface problem," at least
for the purposes of this discussion.

The "interface problem" is that the set of capabilities of the transmissible
data type may not be the same as the capabilities of a particular system.
For example, EBCDIC and ASCII don't necessarily overlap in all the codes
they define.  A more pertinent example is that Unix OPEN calls don't give
a way to specify that a file is textual, so applications don't generate
any information about what they are <trying> to do when they modify the
file system.  So it doesn't matter if you have a textual file "type"
in NFS, since UNIX doesn't give you a way to know that you're supposed
to be using it.

I've seen three generic attempts to solve this problem:

    [1] Modify all systems to use the transmissible type (ascii,
        IEEE floating point, ISO protocols, Interscript, virtually
        all standards).

    [2] Modify all systems to have functionality appropriate to the
        transmissible type and translate on the fly in each system
        (IBM machines sending to ASCII printers, graphics applications
        that change their capabilities to fit the printer (or page
        description language) [cf Macintosh original ROM's versus
        second version ROM's that gave characters fractional widths to
        cope with laserwriter]). 

    [3] Define a broad transmissible type, but not every system has to
        implement the whole thing.  Systems can intercommunicate where
        there is an overlap of supported options. (Telnet (esp SUPDUP),
        FTP, prob ISO-FTAM).

The advantage of [1] is that it works the best, but the problem is that
it disrupts the systems, and tends to inhibit technical progress (since
adding a new feature requires distributed consensus and implementing
it on all machines).  [2] still requires that applications change, but
it can be workable when the system in question already implements part
of the transmissible type.  For example, I believe that the ATT guys
have found it possible to add "mandatory file locking" to UNIX for some
files.

Approach [3] is pretty common, and can achieve good but limited results
(e.g., you can use FTP between any two machines for textual files, assuming
they implemented FTP correctly).  Unfortunately, it is really the brute force
solution to the N^2 problem in disguise.  For example, how many machines
actually implement ALL the telnet options?  (How many implementors, or even
system architects, could list them all without looking?)

Usually, of course, a mixture of the three is involved.  For example,
a UNIX machine can easily know to receive a "textual type" file
correctly using [2], even if it doesn't know how to generate one.

All this is not to put a damper on the interesting discussion that is
going on about NFS.  Rather, it is my intent to try and raise the
level of that discussion to more general issues.

    - Are there other approaches to solving the interface problem?
      (I thought about it for a whole 10 minutes, so please shoot
      bullets at my arguments?)

    - Can people who are familiar with NFILE, NFS, FTAM, etc..,
      characterize them in terms of the "interface problem", above,
      so we can compare them abstractly?

    - Can we come up with a particularly good mix of the 3 approaches
      to solve the problem well for file systems? (did ISO?)  Or is
      blind standardization the only way (it would be disappointing if
      it were) -- just tell everyone to use UNIX?

Any ideas?

- Geof

-----------[000130][next][prev][last][first]----------------------------------------------------
Date:      Mon, 22-Dec-86 15:46:12 EST
From:      braden@ISI.EDU (Bob Braden)
To:        mod.protocols.tcp-ip
Subject:   Re: NFS comments

Marshall,

  It seems that anything less than universal agreement on a subset of FTAM
  will lead to massive incompatibility among ISO-based implementations.
  Is that wrong?
  
  Bob Braden
  

-----------[000131][next][prev][last][first]----------------------------------------------------
Date:      Mon, 22-Dec-86 16:36:01 EST
From:      nowicki@SUN.COM (Bill Nowicki)
To:        mod.protocols.tcp-ip
Subject:   More NFS discussion

Development of real commercial network protocols is full of
compromises, so it is not too surprising that some people think NFS is
too Unix-like, while others think it is not Unix-like enough.
Commercial MS-DOS and VMS products were demonstrated in February 1986,
as well as many different Unix dialects, from PCs to a Cray-2.

Let me try to correct some misconceptions:

	..., NFS has no security or authentication.

Authentication is done at the RPC level in a very open-ended
manner.  The default in the first implementation was to trust UIDs,
since that is all that Unix provides.  A scheme based on public-key
encryption has been discussed in papers (Goldberg and Taylor, Usenix
conference 1985).

	... it is assumed that there is a GLOBAL /etc/passwd file

No, your implementation is free to have a simple table that maps these
UIDs into whatever identifier that you use on your system.  We found it
easier to administer by keeping a unique number over as large a domain
as possible.  At some point you may have to translate between numbers
that make sense in your domain, but having N x M translation tables is
not practical to maintain.
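That per-implementation table might look like this (a hypothetical
Python sketch; the UIDs and local names are invented, not Sun's actual
scheme):

```python
# Each system keeps one map from network-wide UIDs to whatever local
# identifier it uses, so N systems need N tables rather than N x M
# pairwise translation tables.  Entries here are purely illustrative.

net_uid_to_local = {
    1042: "JSMITH",     # network UID -> local (say, VMS-style) username
    1043: "OPERATOR",
}

def translate(net_uid, default="GUEST"):
    # Unknown network UIDs fall back to an unprivileged local identity.
    return net_uid_to_local.get(net_uid, default)
```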

	NFS uses very large UDP packets ...

No, details of transport such as packet size are determined by both
the client and server implementations.  Slow machines like PCs use
small transfer sizes, while faster machines such as Sun-3s take
advantage of larger buffer sizes when available.

	So if a complete RPC and NFS can be fit into 64K, why is PC-NFS
	client only?

Although I am not a PC user (luckily I have a Sun-3/75) my
understanding is that MS-DOS (and the Mac, for that matter) can only
run one program at a time.  Therefore if you ran a server, then you
could not run any other programs on that PC. We find in actual practice
that people only put on floppies the files that they DO NOT want other
people to get at - shared files can go on an NFS file system, and you
use MS-DOS "COPY" commands to copy from one PC to another.

Remember that Sun is an active member of the Corporation for Open
Systems, and has ISO and FTAM (second DP version) products.  NFS only
makes many of the file-sharing problems visible - we need to continue
research in this area.  On the other hand, people with inter-operability
questions should come to Uniforum in January 1987 and see it work
for themselves.

	-- Bill Nowicki
	   Sun Microsystems

Above are personal opinions only, not official Sun positions.

-----------[000132][next][prev][last][first]----------------------------------------------------
Date:      Mon, 22-Dec-86 18:01:03 EST
From:      lepreau@UTAH-CS.ARPA (Jay Lepreau)
To:        mod.protocols.tcp-ip
Subject:   Re: Telnet - local flow control

Your line numbers will vary.

*** /tmp/,RCSt1001053	Mon Dec 22 15:52:25 1986
--- telnet.c	Fri Aug  1 00:57:49 1986
***************
*** 122,123 ****
--- 122,125 ----
  int	dontlecho = 0;		/* do we suppress local echoing right now? */
+ int	donelclflow = 0;	/* the user has set "localflow" */
+ int	localflow = 0;		/* do xon/xoff flow control locally */
  
***************
*** 639,640 ****
--- 641,651 ----
  			tc = &notc;
+ 		if (!donelclflow)
+ 			localflow = 0;
+ 		if (localflow) {
+ 			tc->t_startc = ntc.t_startc;
+ 			tc->t_stopc = ntc.t_stopc;
+ 		} else {
+ 			tc->t_startc = -1;
+ 			tc->t_stopc = -1;
+ 		}
  		ltc = &noltc;
***************
*** 1534,1535 ****
--- 1545,1553 ----
  
+ lclflow()
+ {
+ 	
+     donelclflow = 1;
+     return 1;
+ }
+ 
  togdebug()
***************
*** 1579,1581 ****
      { "crmod",
! 	"toggle mapping of received carriage returns",
  	    0,
--- 1597,1599 ----
      { "crmod",
! 	"	toggle mapping of received carriage returns",
  	    0,
***************
*** 1590,1591 ****
--- 1608,1615 ----
  			"recognize certain control characters" },
+     { "localflow",
+ 	"toggle local xon/xoff flow control",
+ 	    lclflow,
+ 		1,
+ 		    &localflow,
+ 			"process ^S/^Q locally" },
      { " ", "", 0, 1 },		/* empty line */

-----------[000133][next][prev][last][first]----------------------------------------------------
Date:      Mon, 22-Dec-86 18:06:38 EST
From:      jas@MONK.PROTEON.COM (John A. Shriver)
To:        mod.protocols.tcp-ip
Subject:   re: Need information on NFS

Why no server NFS in PC-NFS?

Because the MS-DOS file processor is not re-entrant, and is single-
threaded. Any attempt to share the file processor between two
processes ranges from hairy to dreadful. It can be done, but you have
to monitor some undocumented "file system busy" or "bios busy" bit. Of
course, some PC software vendors have done this, but they do use a bit
more memory (ex: Vianet), and replace the file processor to do it.

Maybe Sun will do it someday, but it will be hard work, and a memory
pinch. Maybe just wait for "MS-DOS 5.0."

-----------[000134][next][prev][last][first]----------------------------------------------------
Date:      Tue, 23-Dec-86 09:31:26 EST
From:      hwb@MCR.UMICH.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   TFTP boot PROM loader.

Does anyone around here have a TFTP loader for DEC PDP11 tape images which
could be put into a PROM and which I could get? MACRO-11 assembler flavour
would be best.

	-- Hans-Werner

-----------[000135][next][prev][last][first]----------------------------------------------------
Date:      Tue, 23-Dec-86 10:58:42 EST
From:      brescia@CCV.BBN.COM (Mike Brescia)
To:        mod.protocols.tcp-ip
Subject:   Re: TFTP boot PROM loader.

Hans-werner,

I assume you want to get a pdp11 bytestring file (paper tape format) from some
server host to some pdp11 client.  SRI (the SURAN/packet radio project) has
the original of the loader we use on the 11 gateways, as well as loader-server
programs which run on some hosts.  No, it does not use TFTP, but it gets the
file from here to there, and runs it.

Have you checked for the MIT TFTP Boot server, which was posted on some source
mailing list in the past year?

    Lots of pointers, no solutions.
    Mike

-----------[000137][next][prev][last][first]----------------------------------------------------
Date:      Tue, 23-Dec-86 12:21:27 EST
From:      mrose@NRTC-GREMLIN.ARPA (Marshall Rose)
To:        mod.protocols.tcp-ip
Subject:   Re: NFS comments


    Well, the obvious answer is to ask "who's doing the agreeing".  I
    know of four such organizations, though there are probably more.

    In Europe, organizations like SPAG have an FTAM profile.  In the
    US, the organization to check with is, of course, the NBS.  John
    Heafner at the NBS spearheads all these kinds of activities, and
    he's the guy you want to ask.  John has an ARPAnet mailbox at the
    NBS, though I don't recall what it is.  In any event, you want the
    notes from the "NBS OSI Implementors' Agreements Workshop".

/mtr

-----------[000138][next][prev][last][first]----------------------------------------------------
Date:      Tue, 23-Dec-86 15:01:06 EST
From:      rhorn@seismo.CSS.GOV@infinet.UUCP (Rob Horn)
To:        mod.protocols.tcp-ip
Subject:   Re: Need information on NFS

I think that the reason PC-NFS is client only has much more to do with
the extreme difficulty in setting up any kind of server under MSDOS
than it has to do with the NFS protocol.  MSDOS is just not suitable
for multi-tasking.  It only understands one process plus N interrupt
service routines.  Any server must act as an interrupt service
routine, and be subject to the associated restrictions, if it is to
coexist with other applications.  This also explains why other vendors
of ftp and telnet provide client-only versions for MSDOS.

-----------[000139][next][prev][last][first]----------------------------------------------------
Date:      Tue, 23-Dec-86 15:03:38 EST
From:      bzs@BU-CS.BU.EDU (Barry Shein)
To:        mod.protocols.tcp-ip
Subject:   NFS


I think there is a misconception brewing here about UNIX file
semantics. It is true that the low level UNIX system calls (eg. OPEN,
READ, WRITE, LSEEK) impose no structure on a file except as a stream
of bytes. This is not peculiar to UNIX, most any O/S that I know of
has some way to just get the bytes off the disk although systems which
prefer structured/typed files tend to resist that and lead the user
towards an access method. Of course, a processor with a sophisticated
IOP (eg. a data base back-end) might be an exception to this rule but
I believe such situations are beyond the scope of this discussion.

Any access method could be layered on top of the UNIX low level calls,
and many have been. Surely I could write DEC's RMS or IBM's access
methods in terms of these simple calls.

As a specific example, consider the UNIX DBM calls, which store
arbitrary data as hashed key/value pairs. This presents the same
problem as any more structured system (eg. how would I fetch the next
key/value pair out of such a file from a remote, non-UNIX system? No
different really than fetching the next ISAM record etc.)
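To make the layering point concrete, here is a toy key/value store
built on nothing but seeks, reads, and writes over a flat byte stream
(hypothetical Python; the record format is invented, not real DBM):

```python
# An access method layered on a plain byte stream: records are stored
# as (16-bit length, key, 16-bit length, value).  Any O/S that can
# hand you raw bytes could support this; "bytes is bytes".

import io
import struct

def put(stream, key, value):
    stream.seek(0, io.SEEK_END)
    for part in (key, value):
        stream.write(struct.pack(">H", len(part)))  # length prefix
        stream.write(part)

def get(stream, key):
    stream.seek(0)
    while True:
        hdr = stream.read(2)
        if not hdr:
            return None                 # key not present
        klen = struct.unpack(">H", hdr)[0]
        k = stream.read(klen)
        vlen = struct.unpack(">H", stream.read(2))[0]
        v = stream.read(vlen)
        if k == key:
            return v

buf = io.BytesIO()
put(buf, b"user", b"bzs")
put(buf, b"host", b"bu-cs")
```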

This is somewhat in response to Geof's note (which was a very good
direction for thought.) I am only saying that the problem is entirely
symmetrical, there is no magic property of access methods whether
built into an O/S or supplied as applications libraries, bytes is
bytes.  The only possible difference is that a system that provides
many access methods might be able to make a list quickly of access
methods which users are probably using (give or take how the users
employed the various options such as record-size, blocking,
bucket-size etc etc.)

I only bring this up so that we don't wring our hands over what I
believe to be a common misconception. Any O/S could (I presume)
present their files as a stream of bytes, the problems would then
be symmetrical.

There are some differences, such as guaranteed atomicity of updates
and types of failure (eg. how extents are handled) but I don't believe
this level of detail is yet where this discussion has found itself
and, I suspect, would be solvable within any scheme that solves the
other, more salient problems.

However, unlike Geof, I am more pessimistic IN THE GENERAL CASE.

If two systems have a !very! similar access method, such as an ISAM
implementation, then writing an interface between the two should be
relatively straightforward (although it is still fraught with danger;
e.g., IBM's V-record format uses 16 bits to express lengths, while another
system may support a V-record format without using 16-bit lengths; how
compatible could you make those two access methods?)  In the case where
the access method doesn't exist at all, I can't see how it could be
utilized (oh, I suppose a V-record could be returned to a
text-oriented application as "string<CR><LF>", but that sort of thing
is limited as a solution.)

I won't even mention the Fortran programmer who would like to access
a file full of 128-bit binary floating point values via this NFS (no,
XDR doesn't work unless someone knows it's time to employ it, it still
may not work, does your machine have 128-bit floats?)

I don't think it's insoluble, but I do suspect we will have to be
prescriptive (rather than descriptive) to provide a standard. Given
a standardized menu of network access methods could *you* do your
work?

	-Barry Shein, Boston University

-----------[000140][next][prev][last][first]----------------------------------------------------
Date:      Tue, 23-Dec-86 16:46:16 EST
From:      mrose@NRTC-GREMLIN.ARPA (Marshall Rose)
To:        mod.protocols.tcp-ip
Subject:   Re: NFS comments

Yes, that's right.  These organizations which produce "profiles" actually
talk a lot among themselves to try to maximize harmony.  In the US,
co-operation between MAP/TOP, NBS, and COS has been quite good.  This is really
a chicken-and-egg type thing.  Once you get the critical mass, you're set;
until then you're hanging.  I believe that given the way the NBS has been
guiding things, we've reached the mass and should have many different
implementations in harmony...

/mtr

-----------[000141][next][prev][last][first]----------------------------------------------------
Date:      Wed, 24-Dec-86 03:40:54 EST
From:      sysman%cs.glasgow.ac.uk@CS.UCL.AC.UK.UUCP
To:        mod.protocols.tcp-ip
Subject:   Network number for Glasgow University


Dear Sir,

Would you please advise me on how to go about obtaining an Internet
network number for this University.  A single class B number or several
class C numbers would do.

Yours sincerely
Zdravko Podolski
Computing Dept
University of Glasgow
GLASGOW  G12 8QQ
Scotland

zp@cs.glasgow.ac.uk   or maybe  zp%cs.glasgow.ac.uk@ucl-cs.arpa

-----------[000142][next][prev][last][first]----------------------------------------------------
Date:      Wed, 24 Dec 86 09:25:56 -0500
From:      Craig Partridge <craig@loki.bbn.com>
To:        tcp-ip%sri-nic.arpa@SH.CS.NET
Subject:   Analyzing Acknowledgement Strategies

    Has anyone done any work analyzing acknowledgement strategies on
a theoretical level?

    It seems to me there is an inherent conflict in choosing acknowledgement
strategies.  To reduce unnecessary retransmissions of data packets, one
wants to make sure data packets get acknowledged.  But the best (only?)
way to make sure acknowledgements get through is to send more of them.

    It is easy to identify bad strategies.  A system which sends twenty
acknowledgements for every packet is a bad strategy because the number
of unneeded acknowledgements sent will clog a network more than the
reduction in unnecessary retransmissions (because an ack was lost)
could ever reduce network traffic.  Similarly, a system which acknowledges
too little abuses the network because too many data packets get
unnecessarily retransmitted -- up to a point acknowledging more often
will reduce the *total* number of packets sent.  (At least that's what
testing with RDP shows).

    The problem is identifying where these tradeoffs balance out --
where the range of optimal solutions lies (if anywhere) in this space.
Has anyone ever looked at this issue?
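As a starting point, here is a toy expected-cost model (my own
simplifying assumptions, not from RDP or any spec): a data packet is
lost with probability p, each of the k acks sent for it is lost
independently, and the sender retransmits when the data or every ack
is lost:

```python
# Expected packets per delivered datum as a function of acks-per-packet.

def expected_packets(p, k):
    # An attempt fails (forcing a retransmission) if the data is lost,
    # or it arrives but all k acks are lost.
    fail = p + (1 - p) * p ** k
    # One attempt costs the data packet plus, when the data arrives,
    # the k acks the receiver sends.
    attempt_cost = 1 + (1 - p) * k
    # Geometric number of attempts until one succeeds.
    return attempt_cost / (1 - fail)

def best_k(p, kmax=20):
    # The ack count minimizing total packets under this model.
    return min(range(1, kmax + 1), key=lambda k: expected_packets(p, k))
```

Under these assumptions best_k(0.1) is 1 but best_k(0.5) is 2: the
lossier the path, the more duplicate acks pay for themselves, so the
optimum moves with network conditions rather than sitting at one value.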

Craig
-----------[000143][next][prev][last][first]----------------------------------------------------
Date:      Wed 24 Dec 86 11:49:43-PST
From:      David L. Kashtan <KASHTAN@SRI-IU.ARPA>
To:        sned@SCRC-PEGASUS.ARPA
Cc:        tcp-ip@SRI-NIC.ARPA
Subject:   Re:  Need information on NFS
		... if one were to replace the part of NFS that does the
	File protocol with NFILE, you wouldn't get any better behaviour with
	regard to the need to implement UN*X pathname syntax on top of the
	foreign filesystem.  Instead of NFS passing you a UN*X-style pathname
	string, you'd have NFILE passing you a UN*X-style pathname string, and
	you'd be no better off.  As I see it, the problem is really the lack
	of support for anything but UN*X filesystem syntax in UN*X.
			.
			.
			.

What I was really trying to say here was that NFS imposed a much too
restrictive model of a file system (one that very clearly has UNIX in
mind).  QFILE/NFILE is considerably less restrictive in its file system
model and seems to me to be a much better starting point for a remote
file system design.  Although the UNIX insistence upon using a "wired-in"
filename syntax is a bit of a bother, it is really quite easy to deal with.
If you were nice enough to replace NFS with NFILE we could easily come to
an agreement on a standard network file system filename representation
to/from which UNIX could transform its filenames in making the remote access.

The problem of filename representation is not just a UNIX problem.  If one
COULD make UNIX act more like a Lisp-Machine in its filename handling
(i.e., think of filenames as merely strings) -- and the thought of trying to
do this makes my head hurt too -- that only fixes one of MANY operating
systems.

David
-------
-----------[000145][next][prev][last][first]----------------------------------------------------
Date:      Wed 24 Dec 86 14:17:09-EST
From:      Dennis G. Perry <PERRY@VAX.DARPA.MIL>
To:        tcp-ip@SRI-NIC.ARPA
Cc:        perry@VAX.DARPA.MIL
Subject:   [MAILER-DAEMON (Mail Delivery Subsystem): Returned mail: User unknown]


The following may be interesting to those who want to keep up with the
ongoing saga of Arpanet performance.

dennis


_________________________________________________________________________

		Arpanet Network Summary

Month				August	September	October

Line Outage (%)			1.47	.64		1.85

NODE:
	All causes (%)		.90	1.30		.68

	Hardware/Software
		Percent		.12	.51		.34

		MTBF(hours)	167	141		205

		MTTR(hours:min)	1:30	1:50		1:23

TAC:
	All causes(%)		.64	1.44		.99

	Hardware/Software
		Percent		.41	.00		.02

		MTBF(hours)	532	241		196

		MTTR(Hrs:min)	3:23	3:29		1:57

Number of nodes			45	46		46

Average Host Traffic
	(packets per day)	

	Internode		15,678,721	15,115,669	15,479,625

	Intranode		2,347,588	4,157,634	2,575,894

______________________________________________________________________________
-------
-----------[000146][next][prev][last][first]----------------------------------------------------
Date:      Wed, 24-Dec-86 14:26:00 EST
From:      sned@PEGASUS.SCRC.Symbolics.COM (Steven L. Sneddon)
To:        mod.protocols.tcp-ip
Subject:   Re:  Need information on NFS

[Everything in here is an opinion (mine to be precise).  There may also
be some facts here (I hope so, otherwise I'd better look for a
different line of work).  Is this better, mtr?]

    Date: Sat 20 Dec 86 10:58:38-PST
    From: David L. Kashtan <KASHTAN@SRI-IU.ARPA>

    I am the person who did the VMS NFS implementation, so I think I am
    reasonably qualified to comment on NFS as it relates to
    non-homogeneous O/S environments:

	The VMS NFS implementation is a server-only NFS implementation.  It
	uses the SUN User-Level UNIX NFS implementation and the 4.3BSD- based
	Eunice (in order to provide the necessary UNIX file-system semantics).
	Without Eunice this would have been a very major undertaking.  I would
	most likely have had to re-implement a pretty good sized chunk of the
	Eunice file handling system in order to get NFS to work on VMS.  So, in
	reality, the way to get an NFS up on VMS is to get VMS to pretend that
	it is UNIX.  This is hardly something one would be happy about in a
	standard for non-homogeneous O/S environments.
Agreed.
	  [...]

      It is my feeling that the Lisp Machine NFILE (and its predecessor QFILE)
      remote file access protocols went much further in dealing with file access
      for MANY different types of operating systems and I am very disappointed
      that nobody even looked at them as examples when thinking about NFS.
    David
    -------

I have a problem with this sentence, even though I agree that QFILE and
NFILE can support richer underlying filesystem models than NFS.  My
problem is that if one were to replace the part of NFS that does the
File protocol with NFILE, you wouldn't get any better behaviour with
regard to the need to implement UN*X pathname syntax on top of the
foreign filesystem.  Instead of NFS passing you a UN*X-style pathname
string, you'd have NFILE passing you a UN*X-style pathname string, and
you'd be no better off.  As I see it, the problem is really the lack
of support for anything but UN*X filesystem syntax in UN*X.  

Where the Lisp Machine systems differ is in their ability to accept a
variety of pathname syntaxes, and to convert between them when
necessary (such as when copying directory hierarchies), all the while
sending a "string-for-host" in the syntax of the remote filesystem,
rather than the syntax of the local filesystem.

By the way, it's interesting that Lisp Machines, which were designed
from the beginning to be used as workstations on a network, adopted the
pathname syntax of host:string-for-host for 'open'.  UN*X, which was
designed as a self-contained system, has to indirectly chop a local
pathname, in UN*X pathname syntax, into host and string-for-host via
Special Files and Mount Tables.  That the only thing you could pass to
a UN*X 'open' is a UN*X pathname seems to me to be at the root of the
problem.  When I think about what it would take to change this, my head
starts to hurt [I know my share about UN*X, too].

-----------[000147][next][prev][last][first]----------------------------------------------------
Date:      Wed, 24-Dec-86 15:25:13 EST
From:      PADLIPSKY@A.ISI.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   "FTAM" Implications

Depending on just what he meant by it, there are some potentially
intriguing implications lurking in Marshall Rose's statement
the other day that the "FTAM" Draft International Standard doesn't
even use "the same underlying services" as its (presumed)
predecessor Draft Proposal did.  This should be clarified, since if
it turns out to mean either that the principles of Layering
have suddenly been altered (after all, if they were both at/for
the same L, one wouldn't think the underlying services _could_
differ, by definition--unless the trick is that they decided to
go to/from "connectionless"), or that the DP was somehow aimed
at/for the "wrong" L originally, it really ought to cost ISO a
fair amount of credibility.  Maybe Marshall was just speaking more
casually than I'd assumed he was, though.  Whatever the explanation,
I think we should all get to hear it.

On a probably less significant plane, I also wonder, not having
noticed an expansion of "FTAM" anywhere in the message, whether the
"AM" means "Access Method."  (The "FT" is presumably clear from
context.)  If so, is this literally or merely figuratively in the
IBM "OS" (and successors) sense of the term?  If literally, is
the problem with the DIS and the DP perhaps that Access Methods
don't really correspond cleanly to Layers and it was a change of
the arbitrary designation from (I'd imagine, but not bet) 7 to 6
that altered the "underlying services"?
-------

-----------[000149][next][prev][last][first]----------------------------------------------------
Date:      Wed, 24-Dec-86 17:33:00 EST
From:      mark@cbosgd.mis.oh.att.com.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re: More NFS discussion

In all this discussion of ways that NFS doesn't quite meet the
"UNIX semantics" (this means "UNIX System V semantics", since
UNIX is a trademark of AT&T and AT&T considers only its own
releases to be UNIX) there is one more gotcha lurking.

ANSI X3J11 C has a required function called tmpfile().  This
function takes no arguments, and is defined in the 10/86 draft
section 4.9.4.3 thus:

  Synopsis
	#include <stdio.h>
	FILE *tmpfile(void);
  Description
	The tmpfile function creates a temporary binary file that
	will automatically be removed when it is closed or at program
	termination.  The file is opened for update.
  Returns
	The tmpfile function returns a pointer to the stream of the
	file that is created.  If the file cannot be created, the tmpfile
	function returns a null pointer.

The traditional implementation of this function is to make up a name
in /tmp, create the file, keep it open, and unlink it.  This works fine
on System V and 4BSD.  It probably fails on VMS/Eunice.  As far as I'm
aware, it also fails when /tmp is NFS mounted on a remote system.

Given the stateless nature of NFS, is there a way to upgrade it to
support tmpfile?  If not, it's going to be awfully hard to conform
to ANSI C in an NFS environment.

	Mark Horton

-----------[000150][next][prev][last][first]----------------------------------------------------
Date:      Wed, 24-Dec-86 19:44:06 EST
From:      leiner@ICARUS.RIACS.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re: Analyzing Acknowledgement Strategies

Craig,

Of course.  This is basically the same as ARQ on a noisy channel, and
there has been considerable analysis of such things.  Also, comparisons
with forward error correction.

The thing that makes it difficult in the case of packet networks is the
uncertainty on both the error characteristics (which can be handled via
adaptive error detection/correction techniques) and delay.  The latter
is what makes the packet network ARQ problem different than the link
error problem.  On a link, the delay is pretty much constant, and so you
know how long to wait for a time-out.  In a packet network of the sort
we typically deal with, that delay is statistical and must be dealt with
as such (estimated, tracked, use maximum, etc.)

Barry

----------

-----------[000151][next][prev][last][first]----------------------------------------------------
Date:      Wed, 24-Dec-86 20:47:41 EST
From:      Postel@ISI.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   Network Number Assignments


Hi. 

The way to get a network number assigned for use with the IP protocol
(RFC-791, MIL-STD-1777) is to contact Joyce Reynolds at 

JKReynolds@ISI.EDU,

 or

1-213-822-1511,

 or

USC-Information Sciences Institute
Suite 1001
4676 Admiralty Way
Marina del Rey, CA  90292-6695
USA

--jon.

-----------[000152][next][prev][last][first]----------------------------------------------------
Date:      Thu, 25-Dec-86 00:01:18 EST
From:      deller@seismo.CSS.GOV@vrdxhq.UUCP
To:        mod.protocols.tcp-ip
Subject:   Submission for mod-protocols-tcp-ip

Path: vrdxhq!deller
From: deller@vrdxhq.UUCP (Steven Deller)
Newsgroups: mod.protocols.tcp-ip
Subject: Re: Need information on NFS
Summary: UNIX supports other filesystem names better than any other OS
Message-ID: <2688@vrdxhq.UUCP>
Date: 25 Dec 86 05:01:17 GMT
References: <861224142645.6.SNED@MEADOWLARK.SCRC.Symbolics.COM>
Organization: Verdix Corporation, Chantilly, VA
Lines: 54

In article <861224142645.6.SNED@MEADOWLARK.SCRC.Symbolics.COM>, sned@PEGASUS.SCRC.Symbolics.COM (Steven L. Sneddon) writes:
> . . .
> you'd be no better off.  As I see it, the problem is really the lack
> of support for anything but UN*X filesystem syntax in UN*X.  

The UN*X filesystem syntax, at least BSD, with 256 arbitrary characters
per file "level", and up to 4096 arbitrary characters total, appears to
be a superset of any names needed by other OS's.
We have, ourselves, provided a simple VMS pathname to UNIX pathname
converter for keeping a VMS hierarchy on UNIX.  It parses:
  ddnn:[dir1.dir2.dir3]file.ext;ver
into
  ddnn/dir1/dir2/dir3/file.ext;ver

There is special handling if ";ver" is missing (use one greater than the
highest existing, or use ;1).  The only problem we have is when going from
an arbitrary and "unlimited" naming scheme as BSD UNIX provides, to limited,
heavily structured naming schemes in systems such as VMS.  The solutions
(for example, as found in Eunice or on Apollo machines) are not "pretty"
because those file systems have so many restrictions on the characters allowed,
and the meaning of those characters.

I will admit to not having used SYSV, or other AT&T incantations of UNIX,
so please do not confuse "generic UN*X" with the UNIX I know.

The problem would be removed only if all systems used the same naming scheme.  Having
used about 100 systems in my 25 years of programming, I would strongly vote 
for BSD UNIX -- that is, no "operational" naming limitations for a normal user.

> . . . 
> By the way, it's interesting that Lisp Machines, which were designed
> from the beginning to be used as workstations on a network, adopted the
> pathname syntax of host:string-for-host for 'open'.  UN*X, which was
> designed as a self-contained system, has to indirectly chop a local
> pathname, in UN*X pathname syntax, into host and string-for-host via
> Special Files and Mount Tables.  That the only thing you could pass to
> a UN*X 'open' is a UN*X pathname seems to me to be at the root of the
> problem.  When I think about what it would take to change this, my head
> starts to hurt [I know my share about UN*X, too].

Sorry, but putting host names into file names at the application level is a 
step backward.  I want to get to a file "/xxx/yyy/zzz" regardless of where it 
is located today.  And I want the system administrator, or myself, to be able 
to relocate the file to where it makes the most sense, without breaking my 
application.  Examined closely, the ONLY naming scheme that makes sense is a 
strictly independent pure hierarchy, with lots of name freedom.  Now you can 
object to "/" as the separator of the hierarchical names, and you can object to
the method for mapping logical names to physical addresses, but you really 
shouldn't object to an implementation of the only sensible naming approach.

Steven Deller
-- 
<end_message> ::= <disclaimer> | <joke> | <witty_saying> | <cute_graphic>
{verdix,seismo,umcp-cs}!vrdxhq!deller

-----------[000154][next][prev][last][first]----------------------------------------------------
Date:      Thu, 25-Dec-86 17:26:30 EST
From:      kre@seismo.CSS.GOV@munnari.UUCP (Robert Elz)
To:        mod.protocols.tcp-ip
Subject:   Re:  Need information on NFS

In <861224142645.6.SNED@MEADOWLARK.SCRC.Symbolics.COM>
sned@PEGASUS.SCRC.Symbolics.COM wrote a lot of nonsense,
followed by one seemingly correct statement...

> [I know my share about UN*X, too].

I'd say that's right - divide all knowledge about Unix by however
many billions of people there are on this planet, and you learned
the bit about how to spell it without violating the trade mark.

> As I see it, the problem is really the lack
> of support for anything but UN*X filesystem syntax in UN*X.  

Since Unix filename syntax is a sequence of chars terminated
by a null (some systems have a maximum length, generally not
less than about 1024 bytes), it's hard to see how this is much
of a problem.

> By the way, it's interesting that Lisp Machines, which were designed
> from the beginning to be used as workstations on a network, adopted the
> pathname syntax of host:string-for-host for 'open'.  UN*X, which was
> designed as a self-contained system, has to indirectly chop a local
> pathname, in UN*X pathname syntax, into host and string-for-host via
> Special Files and Mount Tables.

What a load of rubbish.  There have been many RFS's for Unix at various
times.  Many early ones adopted some form of "host:string" syntax
for remote file names; they ALL died (or have been forgotten), because
that's a REVOLTING method for naming remote files.  That means that
someone has to know that the file is remote to build that filename,
and what's worse has to know which host the file lives on.

Unix implementations don't use mount tables, etc, because that's the only
way it can be done, or even because it's the easiest way it can be done.
Just the opposite: it's MUCH easier on unix to implement a "host:string"
syntax - an average unix kernel programmer could do one of those (clients)
(given existing network code) in an easy afternoon.   The mount table
mechanism is used because it gives the right semantics - host names aren't
built as a part of the syntax of a filename, they're derived by a level of
indirection that makes it easy to alter the configuration (you can move
local files to a remote host without having to change any uses of the
filenames at all).

I'm not going to comment on Lisp machines, as I've never used one.
From all reports they have a lot of nice attributes, but if Symbolics
standard of employees is confined to "we're right, nothing else comes
close" parrots then I don't have much hope for their continued success.

Robert Elz			kre%munnari.oz@seismo.css.gov

-----------[000155][next][prev][last][first]----------------------------------------------------
Date:      Thu, 25-Dec-86 19:52:00 EST
From:      Margulies@SAPSUCKER.SCRC.SYMBOLICS.COM.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re:  Need information on NFS


    Date: Thu, 25 Dec 86 17:26:30 EST
    From: munnari!kre@seismo.CSS.GOV (Robert Elz)

    In <861224142645.6.SNED@MEADOWLARK.SCRC.Symbolics.COM>
    sned@PEGASUS.SCRC.Symbolics.COM wrote a lot of nonsense,
    followed by one seemingly correct statement...

I'll leave it to steve to comment on the ad hominem nature of this
little flame, save to point out that it is offensive in the extreme.
Too bad we don't have politeness police.  Slander and libel do exist,
and electronic mail is no license for them.  

I'll concentrate on technical aspects.  However, I will warn Mr. Elz of
this: Steve worked on UN*X for quite a while in his career, and is
eminently qualified to comment.

    > As I see it, the problem is really the lack
    > of support for anything but UN*X filesystem syntax in UN*X.  

    Since Unix filename syntax is a sequence of chars terminated
    by a null (some systems have a maximum length, generally not
    less than about 1024 bytes), its hard to see how this is much
    of a problem.

Consider TOPS-20 directory structure, VM/CMS mini-disks, file systems
that permit "/" characters in their filenames, or especially file
systems (like VMS and VM/CMS) that have structured (non-byte stream)
files.  None of them map very quietly into a hierarchical set of
directories separated by "/" characters, and there are more and harder
where they came from.  This is my-file-system-centrism.  Why should a
workstation impose a single model of a file system on all of the
machines it talks to over the network?  If I am a workstation user, and
a user of a VMS system sends me mail with a pathname in it, why is it
good that I have to know how to translate it into UN*X-ese?  The only
good that I know is that it allows existing UN*X applications, born and
bred in a homogeneous environment, to access files on foreign systems.

I'm not opposed to this.  Steve isn't opposed to this.  Symbolics isn't
opposed to this.  We just note that it imposes some limitations, like
the ones reported by Kashtan.

    > By the way, it's interesting that Lisp Machines, which were designed
    > from the beginning to be used as workstations on a network, adopted the
    > pathname syntax of host:string-for-host for 'open'.  UN*X, which was
    > designed as a self-contained system, has to indirectly chop a local
    > pathname, in UN*X pathname syntax, into host and string-for-host via
    > Special Files and Mount Tables.

    What a load of rubbish.  There have been mny RFS's for Unix at various
    times.  Many early ones adopted some form of "host:string" syntax
    for remote file names, they ALL died (or have been forgotten), because
    that's a REVOLTING method for naming remote files.  That means that
    someone has to know that the files is remote to build that filename,
    and what's worse has to know which host the file lives on.

    Unix implementations don't use mount tables, etc, because that's the only
    way it can be done, or even because its the easiest way it can be done.
    Just the opposite, its MUCH easier on unix to implement a "host:string"
    syntax - an average unix kernel programmer could do one of those (clients)
    (given existing network code) in an easy afternoon.   The mount table
    mechanism is used because it gives the right semantics - host names aren't
    built as a part of the syntax of a filename, they're derived by a level of
    indirection that makes it easy to alter the configuration (you can move
    local files to a remote host without having to change any uses of the
    filenames at all).

This paragraph neatly details my point above: mapping everything to UN*X
syntax is a lot easier on UN*X applications than changing them all to
handle a pathname representation designed to facilitate operations in a
heterogeneous environment.  As it happens, the Symbolics environment
represents pathnames and file system operations in a way that is
optimized to heterogeneous environments.  That was a design goal of
ours, it wasn't of UN*X.

    I'm not going to comment on Lisp machines, as I've never used one.
    From all reports they have a lot of nice attributes, but if Symbolics
    standard of employees is confined to "we're right, nothing else comes
    close" parrots then I don't have much hope for their continued success.

Mr. Elz, please note the history of this conversation.  Someone from
DEC sent mail criticizing NFS for, in a manner of speaking,
UN*X-centrism.  Steve \DEFENDED/ NFS, pointing out that the problem was
merely the small funnel of the pathname syntax (and the lack of
semantics for structured files), and not a fundamental flaw in the
protocol as compared to NFILE.  I hardly call that corporate chauvinism.
If you are going to run about tossing tomatoes like that one, best to be
sure you read your from lines.

  Benson I. Margulies

-----------[000156][next][prev][last][first]----------------------------------------------------
Date:      Thu, 25-Dec-86 23:19:23 EST
From:      bzs@BU-CS.BU.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   NFS


I am surprised that a lot of this discussion has centered around
pathname'ing. It always seemed to me to be one of the easier things
to either fake or punt (fake: use UNIX syntax on a UNIX workstation
as NFS does, punt: use a quoted syntax such as the PUP/Leaf's convention
of {HOST_OR_DEVICE}<any_string_the_other_os_can_interpret>.)

Of course, there is always the possibility of coming up with a
standard, universal catalogue syntax, similar in spirit I guess to the
Library of Congress' universal conventions for finding something.
Then we could all either use that syntax or at least support it.

I always thought it was file formats (access methods) that were the
problem (or have we decided that this is too hopeless to even think
about?)

Maybe we need to make a list of issues, here's mine:

        1. File naming.
        2. Path naming.
        3. File formats and access methods (eg. ISAM, stream...)
        4. File access semantics (eg. atomicity of updates,
           error handling, authorization, etc etc etc.)
       (5. Performance?)

        -Barry Shein, Boston University

-----------[000157][next][prev][last][first]----------------------------------------------------
Date:      Sat, 27-Dec-86 02:24:21 EST
From:      mrose@NRTC-GREMLIN.ARPA.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re: "FTAM" Implications


    I always speak casually, but your inference was correct:  the DP
    FTAM was written at a time when the Presentation Layer was not
    solid enough to use.  It consisted of some encoding mechanisms and
    an abstract syntax methodology, but did not contain the "usual"
    network-style primitives (e.g., OPEN, CLOSE, TRANSFER).  So, the
    "sanctioned" interpretation was:

	- in FTAM you had the presentation encoding mechanisms
	- presentation was NULL
	- session did all the work

    The fact that the DIS uses presentation is not a fundamental change
    in thinking--it merely reflects the fact that the presentation
    specification can now be used.  For those of you familiar with the
    1984 CCITT recommendations on Message Handling Systems, the
    situation is identical (X.409 is used to encode/decode, X.215 is
    used to move bits).

    FTAM is File Transfer, Access, and Management.

/mtr

-----------[000158][next][prev][last][first]----------------------------------------------------
Date:      Sat, 27-Dec-86 03:10:59 EST
From:      gnu@lll-crg.ARPA@hoptoad.UUCP
To:        mod.protocols.tcp-ip
Subject:   NFS could support general case pathnames pretty easily...

There has been a migration from (mostly useful) criticism of Sun's NFS
to discussion and criticism of network file systems in general.  This is
OK, we should just not get confused about what we are refuting.

>     > As I see it, the problem is really the lack
>     > of support for anything but UN*X filesystem syntax in UN*X.  
> 
>     Since Unix filename syntax is a sequence of chars terminated
>     by a null (some systems have a maximum length, generally not
>     less than about 1024 bytes), its hard to see how this is much
>     of a problem.
> 
> Consider TOPS-20 directory structure, VM/CMS mini-disks, file system
> that permit "/" characters in their filenames, or especially file
> systems (like VMS and VM/CMS) that have structured (non-byte stream)
> files.  None of them map very quietly into a hierarchical set of
> directories separated by "/" characters, and there are more and harder
> where they came from.

Requiring Unix pathname syntax can be considered a fixable bug in NFS,
and (in an operating system with pretty flexible file names like Berkeley
Unix) need not be an issue in network file system access.

While I was at Sun, I suggested to the NFS group that the basic file
name lookup operations should be passed the entire file name and should
return "whatever part they had not handled".  This would allow the
system which actually implements the disk file structure to parse the
names involved.  For example, on Unix you could do:

	% cd /vm.cms
	% more 'humm zumm a'
	% cd /dg.aos
	% mail <:udd:toad:music:spheres tcp-ip@sri-nic.arpa
	% cd /rsx11m
	% grep meaning '[34,5678]mans.src;4'
or even
	% grep honestman '/vm.cms/whole world a' '/rsx11m/[1,2]buckle.shu'
and presumably if you wanted to spend a lot of time in e.g. a CMS file system,
you could write a "cms shell" that would parse your commands the way the CMS
interpreter does, rather than the way that's convenient for Unix.

However, this idea did not reach the NFS group until they were too far
along to consider it for the first release.  I still think it would be
a good idea, and it could be done by adding another remote procedure to
call with these semantics.  If an NFS client implementation tried this
call and it failed, it could remember that fact, never try it again in
that filesystem, and revert to the old way (parse up to the next "/"
character and call the old "look up name" procedure), providing
backward compatibility.

There would still be a few loose ends, e.g. a leading slash (or some
other local convention on non-Unix systems) would still indicate global
rather than current-directory file naming, making it hard to get at
remote files in your current directory that begin with "/".  Also,
programs written for a particular operating system would make
assumptions about how to reach a parent directory, or what characters
are valid in file names, or how to turn "buggy.c" into "buggy.o" or
"hairy.txt" into "hairy.tqt" which would often prevent the use of
arbitrary remote files by those commands.  But no network file system
will solve that problem by itself.

>                                               mapping everything to UN*X
> syntax is a lot easier on UN*X applications than changing them all to
> handle a pathname representation designed to facilitate operations in a
> heterogeneous environment.  As it happens, the Symbolics environment
> represents pathnames and file system operations in a way that is
> optimized to heterogeneous environments.

I think Berkeley did a good enough job, especially given what they
started with and what System V still labors under.

If the filename parsing bugs in Sun's NFS are fixed, what will start
showing up are these application program bugs.  While Lisp is a nice
language, I don't think we should rewrite the world in it to solve our
pathname representation problems.  Better to define some standard hooks
and encourage their availability in a wide variety of environments,
e.g. dir_part(filename) would give the directory name;
file_part(filename) would give the filename within the directory;
parent_of_dir(dirname); file_in_dir(file, dir) would concatenate the
name of the dir and file; dir_in_dir(dir1, dir2) would concatenate two
directory names; root_dir() return the root of the naming scheme, if
any; etc.  I suspect that a set of 10 or 15 of these at most would be
enough to provide a 99% solution to writing applications that don't
embed file-naming-specific information, given an operating system that
supports hierarchy.

Some of these would be tricky, e.g. MSDOS does not implement a true
hierarchy, but has a separate root on each device (e.g. "a:foo" gets foo
in the current directory on disk a, while "a:/foo" gets foo in the root
directory on a, and "/foo" gets foo in the root on the current
disk).  My impression is that VMS is similar, though I've never worked
with it.

Of course, in a system that implemented NFS, these routines would have
to RPC to the remote system at run time to determine its conventions,
so they might not be cheap, but they would be correct.

-----------[000159][next][prev][last][first]----------------------------------------------------
Date:      Sat, 27-Dec-86 10:03:00 EST
From:      Margulies@SAPSUCKER.SCRC.SYMBOLICS.COM.UUCP
To:        mod.protocols.tcp-ip
Subject:   NFS could support general case pathnames pretty easily...


    Date: Sat, 27 Dec 86 00:10:59 PST
    From: hoptoad!gnu@lll-crg.ARPA (John Gilmore)

    There has been a migration from (mostly useful) criticism of Sun's NFS
    to discussion and criticism of network file systems in general.  This is
    OK, we should just not get confused about what we are refuting.

First, I was refuting the claim that Symbolics is taking a partisan
position.

Second, I was refuting the claim that just passing strings around in an
operating system is the best scheme for handling heterogeneous
pathnames.

Third, I was pointing out that there are two very different purposes for
a network file system under discussion: allowing other computers to
serve as an extension to "your" file system, and allowing access to
other computers' file systems.  

In the first case, which appears to be represented by the existing NFS
implementation, it's okay that different hosts disagree on the names of
files, and that file access is constrained by the model of the local
host.   (an aside: I still have trouble seeing why anyone would want to
try to get work done in an environment with 200 workstations, few of
them with significant local disk storage, and many of them disagreeing
on the names of files on the shared resources. If they are all UN*X
boxes, then careful administration can enforce agreement, but still,
accidents will happen.)

In the second case, it just won't do to try to give all files on any
host the same appearance.  The principle should be to allow uniform
access to common capabilities, but also transparent access to particular
ones.

As a standard for allowing any host to supply some file system for
UN*X'es on the network, NFS is fine.  As a standard for allowing
heterogeneous computers to access each other's files as \peers/, the NFS
protocol may still prove fine. The NFS UN*X interface seems to be a
problem, and that's what this discussion is turning toward.

    >     > As I see it, the problem is really the lack
    >     > of support for anything but UN*X filesystem syntax in UN*X.  
    > 
    >     Since Unix filename syntax is a sequence of chars terminated
    >     by a null (some systems have a maximum length, generally not
    >     less than about 1024 bytes), its hard to see how this is much
    >     of a problem.
    > 
    > Consider TOPS-20 directory structure, VM/CMS mini-disks, file system
    > that permit "/" characters in their filenames, or especially file
    > systems (like VMS and VM/CMS) that have structured (non-byte stream)
    > files.  None of them map very quietly into a hierarchical set of
    > directories separated by "/" characters, and there are more and harder
    > where they came from.

    Requiring Unix pathname syntax can be considered a fixable bug in NFS,
    and (in an operating system with pretty flexible file names like Berkeley
    Unix) need not be an issue in network file system access.

My understanding is that the \protocol/ has no pathname syntax
requirements, only the \interface/ through the UN*X file system, which
is not the same thing.  The NFS product may include both, but from the
point of view of us non-UN*X would-be protocol implementors, they are
very different.  That is what Steve Sneddon started out trying to say
when a brick was thrown through his console.

    >                                               mapping everything to UN*X
    > syntax is a lot easier on UN*X applications than changing them all to
    > handle a pathname representation designed to facilitate operations in a
    > heterogeneous environment.  As it happens, the Symbolics environment
    > represents pathnames and file system operations in a way that is
    > optimized to heterogeneous environments.

    I think Berkeley did a good enough job, especially given what they
    started with and what System V still labors under.

In your next paragraph you begin to hit some of the problems with
pathnames represented as a string as opposed to a data structure. The
issue is not C versus Lisp (versus Forth?), it is data structure.

Here is a vaguely language-independent shot at describing Symbolics'
object oriented pathname representation.

You can extract the following fields from any pathname:

    host         (another complex object)
    device       a string or null
    directory	 a list of strings
    name	 a string
    type	 a string or null
    version	 a number, or a keyword for newest, oldest, or the like,
		 or null

In theory, a pathname could have other, host-specific fields.

Any field can have a wild-card.

You can ask for a new pathname based on an old pathname by supplying any
or all of the fields.  

You can do wild-card matching of any pathname against any other.

Note that the particular per-host string format and delimiters are
irrelevant to programs that want to process pathnames.  Only the
pathname system has to be able to parse strings.  The standard parser is
used by all user interfaces to pathnames.

You can ask for a string -- the format of the string depends on the type
of host. 

When you open a file, the host conditions much of what happens.

We have found this basic scheme (with a lot of complicated details in
the implementation) sufficient to cover all the hosts we have ever
talked to, and that's a lot.  I see no reason other than compatibility
why such a thing couldn't exist on UN*X.  Mind you, I am NOT saying 'you
UN*X hackers should get on the ball and implement this.'  My entire goal
is to point out that we (probably amongst others) have confronted this
issue and have a working implementation, and you might want to check it
out before trying another.  

    If the filename parsing bugs in Sun's NFS are fixed, what will start
    showing up are these application program bugs.  While Lisp is a nice
    language, I don't think we should rewrite the world in it to solve our
    pathname representation problems.

As above, neither do I.  Object oriented programming sure is convenient
for this problem, but certainly not necessary.

    Better to define some standard hooks
    and encourage their availability in a wide variety of environments,
    e.g. dir_part(filename) would give the directory name;
    file_part(filename) would give the filename within the directory;
    parent_of_dir(dirname); file_in_dir(file, dir) would concatenate the
    name of the dir and file; dir_in_dir(dir1, dir2) would concatenate two
    directory names; root_dir() return the root of the naming scheme, if
    any; etc.  I suspect that a set of 10 or 15 of these at most would be
    enough to provide a 99% solution to writing applications that don't
    embed file-naming-specific information, given an operating system that
    supports hierarchy.

    Some of these would be tricky, e.g. MSDOS does not implement a true
    hierarchy, but has a separate root on each device (e.g. "a:foo" gets foo
    in the current directory on disk a, while "a:/foo" gets foo in the root
    directory on a, and "/foo" gets foo in the root on the current
    disk).  My impression is that VMS is similar, though I've never worked
    with it.

    Of course, in a system that implemented NFS, these routines would have
    to RPC to the remote system at run time to determine its conventions,
    so they might not be cheap, but they would be correct.

We solve that problem by expecting the network name server to reveal the
type of host.

One more point:  we also found it convenient to have the equivalent of
the mount table scheme for some purposes.  There is a thing called a
"logical host." A logical host can be mapped to pieces of the hierarchy
of any number of real hosts.  It has a uniform syntax, and thus hides
some functionality of the underlying file systems.  It is useful when
you want to be able to assume the presence of some files at any site.
We use it to locate system files.

-----------[000160][next][prev][last][first]----------------------------------------------------
Date:      Sat, 27-Dec-86 14:17:45 EST
From:      hemphill@nrl-aic.UUCP.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re: NFS vs NFILE

Having followed this discussion for a while, I find it strange that no one
has made it explicit that the generic pathname implementation on Symbolics
machines is not really tied to NFILE.  I have been using my Symbolics machine
to access files on a variety of machines (BSD-4.2, IRIS System V, TOPS-20 etc.)
none of which have NFILE servers.  I can quite happily use DIRED, edit my
programs/text, compile my systems, and read/write my data files in exactly
the same manner used for files on the Symbolics file server or my local
Symbolics machine.  In normal use the pathnames look like paths on the target
machines, but that can even be hidden by using logical hosts.  In short, this
discussion has gotten sidetracked into the issue of pathname representations
rather than the underlying issues of network filesystems -- data
representation, file attributes, access methods, security etc.

	G++ <hemphill@nrl-aic.arpa>

-----------[000161][next][prev][last][first]----------------------------------------------------
Date:      Sun, 28-Dec-86 07:19:40 EST
From:      jqj@GVAX.CS.CORNELL.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re: More NFS discussion

In article <4889@cornell.UUCP> nowicki@Sun.COM (Bill Nowicki) writes:
>Let me try to correct some misconceptions:
>	..., NFS has no security or authentication.
>Authentication is done at the RPC level in a very open-ended
>manner.  The default in the first implementation was to trust UIDs,
>since that is all that Unix provides.  A scheme based on public-key
>encryption has been discussed in papers (Goldberg and Taylor, Usenix
>conference 1985).

Although Bill is technically correct, it still seems fair to say that
"NFS has no security or authentication" and that this is a VERY serious
weakness of the SUN NFS standard.  SUN RPC is open ended in this regard,
but the only form of authentication standardized is "UNIX-style" i.e.
none.  Since SMI has not officially endorsed the Goldberg&Taylor scheme,
the situation is far worse than a simple lack of implementation.  The
fact remains that one can easily break security on any SUN NFS cluster 
if he/she has access to any diskless client.

More abstractly, it is arguable that authentication at the RPC level is
inappropriate for application-level security.  It requires that the
application (NFS) have a much closer coupling than I would like to the
transport mechanism (RPC).  As an example of the problems that this
confusion of layering causes, consider how you'd handle a file system type
that required secondary authentication, e.g. a Tops-20 like system that
had both user id's and file "accounts", with perhaps a password associated
with the account -- seems to me your NFS-level authentication scheme must
of necessity be specific to the particular type of remote file system if
you want file security equal to what you have in a local file system.

For comparison, consider Authentication in the Xerox Filing (distributed
file system) protocol, which is much more robust than anything I've
seen considered as an NFS extension (but which still has major flaws...).
See "Authentication Protocol", XSIS 098404, and "Filing Protocol",
XNSS 108605.

-----------[000162][next][prev][last][first]----------------------------------------------------
Date:      Mon, 29-Dec-86 03:24:26 EST
From:      news@seismo.CSS.GOV@sun.UUCP
To:        mod.protocols.tcp-ip
Subject:   Submission for mod-protocols-tcp-ip

Path: sun!gorodish!guy
From: guy%gorodish@Sun.COM (Guy Harris)
Newsgroups: mod.protocols.tcp-ip
Subject: Re: More NFS discussion
Summary: OK, what *about* "tmpfile"?
Message-ID: <10850@sun.uucp>
Date: 29 Dec 86 08:24:25 GMT
References: <8612242233.AA06278@cbosgd.MIS.OH.ATT.COM>
Sender: news@sun.uucp
Lines: 20

> In all this discussion of ways that NFS doesn't quite meet the
> "UNIX semantics" (this means "UNIX System V semantics", since
> UNIX is a trademark of AT&T and AT&T considers only its own
> releases to be UNIX) there is one more gotcha lurking.
> 
> The traditional implementation of (tmpfile) is to make up a name
> in /tmp, create the file, keep it open, and unlink it.  This works fine
> on System V and 4BSD.  It probably fails on VMS/Eunice.  As far as I'm
> aware, it also fails when /tmp is NFS mounted on a remote system.

Well, you aren't aware of how the Sun NFS implementation works; it works
just fine when "/tmp" is NFS mounted on a remote system.  (I just tried it,
with "tmpfile" - "tmpnam", actually - modified to use an NFS-mounted file
system.)  The local kernel knows that the file is open, and instead of
unlinking it, it renames it to a temporary name and deletes the temporary
file when the last program on the client that created it finishes with it.

Yes, this will fail if a process on a *different* machine unlinks it.
However, it's difficult for another such process to unlink it under UNIX,
since it only has a name for a very short time.

-----------[000163][next][prev][last][first]----------------------------------------------------
Date:      Mon, 29-Dec-86 03:32:06 EST
From:      news@seismo.CSS.GOV@sun.UUCP
To:        mod.protocols.tcp-ip
Subject:   Submission for mod-protocols-tcp-ip

Path: sun!gorodish!guy
From: guy%gorodish@Sun.COM (Guy Harris)
Newsgroups: mod.protocols.tcp-ip
Subject: Re: Need information on NFS
Summary: Pathname syntax/semantics and file formats are two separate issues.
Message-ID: <10851@sun.uucp>
Date: 29 Dec 86 08:32:06 GMT
References: <8612250637.AA05517@seismo.CSS.GOV> <861225195214.5.MARGULIES@REDWING.SCRC.Symbolics.COM>
Sender: news@sun.uucp
Lines: 15

> Consider TOPS-20 directory structure, VM/CMS mini-disks, file system
> that permit "/" characters in their filenames, or especially file
> systems (like VMS and VM/CMS) that have structured (non-byte stream)
> files.  None of them map very quietly into a hierarchical set of
> directories separated by "/" characters, and there are more and harder
> where they came from.

The problem with VMS pathnames has nothing whatsoever to do with the fact
that some operating systems keep file attributes like file format around and
that a certain level of those OSes (not the kernel, in the case of VMS, but
RMS) imposes a certain interpretation of the bytes in the file, based on
these file attributes, on its clients.  Those are separate issues; you can
build a system with a VMS-style pathname syntax but with no interpretation
of the file's contents below user-mode code, or a system with UNIX-style
pathnames but with VMS-style file attributes handled by something like RMS.

-----------[000164][next][prev][last][first]----------------------------------------------------
Date:      Mon, 29-Dec-86 08:44:00 EST
From:      Margulies@SAPSUCKER.SCRC.SYMBOLICS.COM.UUCP
To:        mod.protocols.tcp-ip
Subject:   Submission for mod-protocols-tcp-ip


    Date: 29 Dec 86 08:32:06 GMT
    From: sun!news@seismo.CSS.GOV

    Path: sun!gorodish!guy
    From: guy%gorodish@Sun.COM (Guy Harris)
    Newsgroups: mod.protocols.tcp-ip
    Subject: Re: Need information on NFS
    Summary: Pathname syntax/semantics and file formats are two separate issues.
    Message-ID: <10851@sun.uucp>
    Date: 29 Dec 86 08:32:06 GMT
    References: <8612250637.AA05517@seismo.CSS.GOV> <861225195214.5.MARGULIES@REDWING.SCRC.Symbolics.COM>
    Sender: news@sun.uucp
    Lines: 15

    > Consider TOPS-20 directory structure, VM/CMS mini-disks, file system
    > that permit "/" characters in their filenames, or especially file
    > systems (like VMS and VM/CMS) that have structured (non-byte stream)
    > files.  None of them map very quietly into a hierarchical set of
    > directories separated by "/" characters, and there are more and harder
    > where they came from.

    The problem with VMS pathnames has nothing whatsoever to do with the fact
    that some operating systems keep file attributes like file format around and
    that a certain level of those OSes (not the kernel, in the case of VMS, but
    RMS) imposes a certain interpretation of the bytes in the file, based on
    these file attributes, on its clients.  Those are separate issues; you can
    build a system with a VMS-style pathname syntax but with no interpretation
    of the file's contents below user-mode code, or a system with UNIX-style
    pathnames but with VMS-style file attributes handled by something like RMS.

Indeed. Perhaps if I changed the last sentence to:

  "None of them map very quietly into a hierarchical set of directories
separated by "/" characters where all of the files are expected to be
simple vectors of 8bit bytes"

you would like it better.  They are separate problems.  They are both
hard.

-----------[000165][next][prev][last][first]----------------------------------------------------
Date:      Mon, 29-Dec-86 09:12:00 EST
From:      root@suny-sb.CSNET.UUCP
To:        mod.protocols.tcp-ip
Subject:   NFS, small machines, and ...

The size issue of NFS on "small" machines heard recently in this group 
does not wash - our implementation of NFS on the Commodore Amiga PC fits 
nicely in about 33K code (10K NFS, 7.6K RPC/XDR, 15K TCP/UDP/IP, .6K User
authentication) + network buffers (~32K max).  AmigaNFS runs
quite nicely on a 256K Amiga.

As to the issue of why NFS server is not done on PCs, it seems to be
more an issue of host filesystem performance & required functionality
than anything else.  Using the Amiga as an example, the problems with
implementing NFS server are:

	1.  AmigaDOS filesystem performance is only about 32K bytes/sec,
	    and cannot really adequately service more than one (any?)
	    user.

	2.  AmigaDOS does not support anything like file generation
	    numbers, needed for server crash recovery.  Remember that NFS
	    boils file/dir names down into a short hand description
	    called a filehandle.  To keep the server completely stateless, 
	    the process of transforming file/dir name -> filehandle must be 
	    reversible.

	3.  AmigaDOS uses object locking to refer to directories & files.
	    Since NFS is designed to be stateless (and idempotent), we
	    have no open/close calls to delimit the lifetime of a lock.
	    A partial fix can be had by saving complete path names and
	    running all service requests atomically (lock/examine/unlock)
	    but this implies server state.

I believe that comments (1) and (2) apply in principle to most small
systems available today.  

					Rick Spanbauer

-----------[000168][next][prev][last][first]----------------------------------------------------
Date:      Mon, 29-Dec-86 13:18:08 EST
From:      GKN@SDSC-SDS.ARPA.UUCP
To:        mod.protocols.tcp-ip
Subject:   4.3 BSD TELNET vs. <CR>

[apologies if this has been brought up before, or if this is the wrong list
 to direct this query to...]

It seems that when the TELNET client program for 4.3 BSD Unix is in character
at a time mode it substitutes a <LF> for <CR>.  While this doesn't seem to be
in clear violation of RFC 854, it does seem a little strange to me.  RFC 854
simply says that a <NUL> should be inserted in the data stream after every <CR>
character which would not be followed by a <LF>.

There are operating systems out there where the <CR> character is important,
VMS and CTSS being two examples I have locally.  The <LF> character has a
legitimate, non-end-of-line use on each system.

Reading the manual page for 4.3 BSD TELNET doesn't seem to indicate that there
is any way to suppress this behavior except to operate in line-at-a-time mode.

Is there someone out there who can tell me how to cause 4.3 BSD TELNET to
transmit a <CR><LF> (or <CR><NUL> a la RFC 854) when I type a <CR>?

gkn
--------------------------------------
Arpa:	GKN@SDSC.ARPA
Bitnet:	GKN@SDSC
Span:	SDSC::GKN (5.600)
USPS:	Gerard K. Newman
	San Diego Supercomputer Center
	P.O. Box 85608
	San Diego, CA 92138
AT&T:	619.534.5076
-------

-----------[000169][next][prev][last][first]----------------------------------------------------
Date:      Mon, 29-Dec-86 16:16:31 EST
From:      CRB@NIHCUDEC.BITNET.UUCP
To:        mod.protocols.tcp-ip
Subject:   Those big, big books

The books specifying the TCP/IP and related protocols have been described as
having titles like RFC 791, etc.  Are these titles exact, or how should these
books be titled to make sense to our purchasing people?  And the publisher:
should SRI be referred to as Stanford Research Institute, and should the
name of the DDN be spelled out?  Sorry to sound stupid, but better to get
too many questions answered than too few.  Thanks for the help.

Charles Bacon   (301)496-4823, or, Bldg. 12B, National Insts. of Health,
                                   Bethesda, Md. 20892
        or even CRB@NIHCUDEC.bitnet

-----------[000172][next][prev][last][first]----------------------------------------------------
Date:      Tue, 30-Dec-86 10:42:00 EST
From:      PADLIPSKY@A.ISI.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re: "FTAM" Implications

In response to your message sent  Fri, 26 Dec 86 21:00:25 -0800

I'd feel myself to have been remiss if I didn't observe that
the explanation of why the FTAM DIS is inconsistent with the
FTAM DP exposes at least a fundamental flaw in ISO's committee
structure and arguably one (or more) in the "Reference Model"
itself, but on reflection I'd feel I was wasting everybody's
time if I bothered to spell it out in any detail--it ought to be
nearly obvious anyway.  Suffice it to say that it's probably
impossible to do Top-down and Bottom-up simultaneously, especially
if two (or more) teams are involved, each thinking itself to be
in charge.  (I will take another I Told You So on my old "It's
Layer [sic] 5-7" line, though, and it might not be too pushy to
insist on one for the Slogan that begins "The more Layers, the
more committees.")
   rueful cheers, map
-------

-----------[000174][next][prev][last][first]----------------------------------------------------
Date:      Tue, 30-Dec-86 11:45:45 EST
From:      ron@BRL.ARPA.UUCP
To:        mod.protocols.tcp-ip
Subject:   Re:  4.3 BSD TELNET vs. <CR>

The problem occurs only in local echo mode and is a bug in the code
rather than a design feature....


Description:
	TELNET sends \n in local echo mode rather than \r\n when
	\r is typed.
Repeat-By:
	Get into mode 3, character at a time with local echo in
	telnet and then type \r.
Fix:
	The code that decides what the user really meant when the
	input character is \n has a bug.  It checks to see whether
	the user was in CRMOD, which would imply that he really
	pressed return rather than linefeed.  The test, however, is
	defective in that mode 2 has CRMOD set as well.

*** telnet.c	Tue Dec 23 21:50:35 1986
--- /tmp/foo	Tue Dec 30 11:29:50 1986
***************
*** 983,989 ****
  				 * on our local machine, then probably
  				 * a newline (unix) is CRLF (TELNET).
  				 */
! 				if (globalmode >= 3) {
  					NETADD('\r');
  				}
  				NETADD('\n');
--- 983,989 ----
  				 * on our local machine, then probably
  				 * a newline (unix) is CRLF (TELNET).
  				 */
! 				if (globalmode >= 2) {
  					NETADD('\r');
  				}
  				NETADD('\n');
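For reference, the effect of the one-character fix can be restated as a standalone function.  The function name and interface here are illustrative, not from telnet.c; only the mode comparison mirrors the patch:

```c
#include <stddef.h>

/* Sketch of the patched 4.3BSD telnet logic: when the terminal is in
 * a CRMOD mode (globalmode >= 2 after the fix, not >= 3 as before), a
 * typed newline is transmitted as the TELNET CRLF sequence.  Function
 * name and interface are illustrative only.  Returns the number of
 * bytes to send on the network connection. */
size_t net_encode_newline(int globalmode, char *out)
{
    size_t n = 0;
    if (globalmode >= 2)        /* corrected test; the bug used >= 3 */
        out[n++] = '\r';
    out[n++] = '\n';
    return n;
}
```

Before the fix, a connection with globalmode 2 sent a bare \n for a typed return even though CRMOD was in effect.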

-----------[000176][next][prev][last][first]----------------------------------------------------
Date:      Tue, 30-Dec-86 15:28:00 EST
From:      WELTYC%cievms@CSV.RPI.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   NFS Discussion, File Info from VMS


	In this discussion, mention was made of what information a file
system needs to keep about each file.  Well, I'm sure most of us know
what UNIX keeps about its files, but here's what VMS keeps:

----
Directory DUA0:[WELTYC.BS]

SFMSS.TXT;16                  File ID:  (7356,8,0)         
Size:            5/6          Owner:    [STAFF,WELTYC]
Created:  18-JUL-1986 10:43   Revised:  18-JUL-1986 10:43 (1)
Expires:   <None specified>   Backup:   10-NOV-1986 12:09
File organization:  Sequential
File attributes:    Allocation: 6, Extend: 0, Global buffer count: 0, No version limit
Record format:      Variable length, maximum 79 bytes
Record attributes:  Carriage return carriage control
Journaling enabled: None
File protection:    System:RWED, Owner:RWED, Group:RE, World:
Access Cntrl List:  None

Total of 1 file, 5/6 blocks.
-----

	I don't know much about the file structure of other non-UNIX systems, 
but I find this much information a bother, especially in trying to bring in 
files (on magtape, say) from other systems.  I would not advocate using this 
much information in some "standard" filesystem structure, but in a filesystem
whose goal is to be heterogeneous, this information needs to be provided for 
as well.  
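	One way a heterogeneous filesystem protocol might provide for
such extra attributes without burdening UNIX-style clients is a tagged
list of optional fields alongside the mandatory ones; the following is
only a sketch, and every name in it is invented here rather than taken
from NFS or VMS:

```c
#include <string.h>

/* Hypothetical sketch: a few mandatory attributes plus a tagged list
 * of optional, system-specific ones (record format, expiration date,
 * ACLs, ...).  Clients simply ignore tags they don't understand.
 * None of these names come from the actual NFS protocol. */
enum attr_tag { ATTR_RECFMT = 1, ATTR_EXPIRES, ATTR_ACL };

struct opt_attr {
    enum attr_tag tag;
    const char   *value;        /* opaque string, interpreted by tag */
};

struct file_attrs {
    unsigned long   size;       /* mandatory everywhere */
    unsigned long   mtime;
    struct opt_attr opt[8];     /* optional, system-specific */
    int             nopt;
};

/* Return the optional value for a tag, or NULL if the server's
 * filesystem didn't supply it. */
const char *attr_lookup(const struct file_attrs *fa, enum attr_tag tag)
{
    int i;
    for (i = 0; i < fa->nopt; i++)
        if (fa->opt[i].tag == tag)
            return fa->opt[i].value;
    return NULL;
}
```

A UNIX client would look only at the mandatory fields; a VMS client
could ask for ATTR_RECFMT and fall back to a default when it's absent.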
	The only field here I find really useful is the Access Control List 
(ACL).  One of the many features of this is to allow you to specify specific
users or groups that can access a file or device.  This is an idea foreign to
UNIX (and NFS, too, I think).  A couple of people have expressed dissatisfaction
with the protection schemes (or lack thereof) in NFS; how does the VMS NFS handle 
this stuff?
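	The core of an ACL check itself is simple enough to sketch.
Assume each entry names a principal and grants a set of rights, and
the first matching entry wins (a VMS-like convention; the names below
are hypothetical, not from any real implementation):

```c
#include <string.h>

/* Hypothetical ACL sketch: first matching entry wins.  Rights are a
 * bitmask (read/write/execute/delete -- RWED, in VMS terms). */
#define R_READ  01
#define R_WRITE 02

struct acl_entry {
    const char *principal;      /* user or group name */
    unsigned    rights;
};

/* Return the rights granted to `who`, or 0 if no entry matches. */
unsigned acl_rights(const struct acl_entry *acl, int n, const char *who)
{
    int i;
    for (i = 0; i < n; i++)
        if (strcmp(acl[i].principal, who) == 0)
            return acl[i].rights;
    return 0;
}
```

The interesting design questions are elsewhere: how group membership
is expanded, and how entries interact with the owner/group/world bits.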

					-Chris
					 weltyc@csv.rpi.edu

-----------[000177][next][prev][last][first]----------------------------------------------------
Date:      Tue, 30-Dec-86 21:28:40 EST
From:      @EDDIE.MIT.EDU:JBS@DEEP-THOUGHT.MIT.EDU
To:        mod.protocols.tcp-ip
Subject:   Becoming an Internet site

How does one go about getting a net connection?  I'm talking here
about a company I am familiar with which does data processing and
software development for the US Gov't and would like to transfer data
to and from gov't computers, and to communicate with gov't personnel
via electronic mail.

Please reply directly to jbs@eddie.mit.edu.

Jeff Siegal
-------

-----------[000178][next][prev][last][first]----------------------------------------------------
Date:      Tue 30 Dec 86 18:59:08-GMT
From:      Gerard K. Newman <GKN@SDSC.ARPA>
To:        ron@BRL.ARPA
Cc:        GKN@SDS.SDSC.EDU, TCP-IP@SRI-NIC.ARPA
Subject:   Re:  Re:  4.3 BSD TELNET vs. <CR>
	From:	 Ron Natalie <ron@BRL.ARPA>
	Subject: Re:  4.3 BSD TELNET vs. <CR>
	Date:	 Tue, 30 Dec 86 11:45:45 EST

	The problem occurs only in local echo mode and is a bug in the code
	rather than a design feature....

Thanks for the description of the problem and the fix, but from my observation
the problem also occurs when the connection is in non-local echo mode.

gkn
--------------------------------------
Arpa:	GKN@SDSC.ARPA
Bitnet:	GKN@SDSC
Span:	SDSC::GKN (5.600)
USPS:	Gerard K. Newman
	San Diego Supercomputer Center
	P.O. Box 85608
	San Diego, CA 92138
AT&T:	619.534.5076
-------
-----------[000179][next][prev][last][first]----------------------------------------------------
Date:      Wed, 31-Dec-86 02:38:36 EST
From:      brad@seismo.CSS.GOV@sun.UUCP
To:        mod.protocols.tcp-ip
Subject:   Submission for mod-protocols-tcp-ip

Path: sun!brad
From: brad@sun.uucp (Brad Taylor)
Newsgroups: mod.protocols.tcp-ip
Subject: Re: More NFS discussion
Summary: Clearing the air about authentication and NFS
Message-ID: <10911@sun.uucp>
Date: 31 Dec 86 07:38:36 GMT
References: <4889@cornell.UUCP> <8612281219.AA16219@gvax.cs.cornell.edu>
Organization: Sun Microsystems, Inc.
Lines: 79

There are many misconceptions about authentication. Let me try to
clear them up.

> In article <4889@cornell.UUCP> nowicki@Sun.COM (Bill Nowicki) writes:
> >Let me try to correct some misconceptions:
> >	..., NFS has no security or authentication.
> >Authentication is done at the RPC level in a very open-ended
> >manner.  The default in the first implementation was to trust UIDs,
> >since that is all that Unix provides.  A scheme based on public-key
> >encryption has been discussed in papers (Goldberg and Taylor, Usenix
> >conference 1985).
 
> Although Bill is technically correct, it still seems fair to say that
> "NFS has no security or authentication" and that this is a VERY serious
> weakness of the SUN NFS standard.  

The statement "NFS has no security or authentication" is simply not true.
There *is* a limited form of security with UNIX authentication. If you
trust all of the nodes connected to the network, then the entire system
is secure. We know this assumption is unreasonable in many cases, and
that is why we proposed the alternative system. In the alternative system,
you do not have to trust your network. It also fixes the other problem
with UNIX authentication, namely that it is too UNIX-oriented (good
name for the authentication system, eh?). In the alternative scheme, users
are referenced not by a uid but by a string of characters; a
database is used by the server to convert the string into the particular
form of authentication required by the OS. For UNIX, it's uids, but
other operating systems are free to map it into something else.
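The server side of that mapping might look roughly like this.  The
static table stands in for the database, and every name here is
illustrative rather than taken from our actual proposal:

```c
#include <string.h>

/* Illustrative only: convert an OS-independent principal string into
 * the authentication form the local OS wants (a uid, for UNIX).  The
 * static table stands in for the server's database. */
struct principal_map {
    const char *name;           /* network principal string */
    int         uid;            /* UNIX-specific mapping */
};

static const struct principal_map ptab[] = {
    { "brad@sun",    1042 },
    { "nowicki@sun", 1117 },
};

/* Return the local uid for a principal, or -1 if unknown. */
int principal_to_uid(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof ptab / sizeof ptab[0]; i++)
        if (strcmp(ptab[i].name, name) == 0)
            return ptab[i].uid;
    return -1;                  /* reject unknown principals */
}
```

A non-UNIX server would keep the same principal strings but map them
into whatever its own OS uses in place of uids.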

>	SUN RPC is open ended in this regard,
> but the only form of authentication standardized is "UNIX-style" i.e.
> none.  Since SMI has not officially endorsed the Goldberg&Taylor scheme,
> the situation is far worse than a simple lack of implementation.  The
> fact remains that one can easily break security on any SUN NFS cluster 
> if he/she has access to any diskless client.

Wrong. The situation is no more than a lack of implementation. And again
one must have more than mere access to a client machine to break security
with the UNIX form of authentication: you must have root access to do it. 
(With the proviso being that, on Unix at least, init is changed to prompt 
for a password during single user boots: a simple hack).

> More abstractly, it is arguable that authentication at the RPC level is
> inappropriate for application-level security.  It requires that the
> application (NFS) have a much closer coupling than I would like to the
> transport mechanism (RPC).  

First of all, RPC is not a transport protocol. Secondly, it makes a lot
of sense to put authentication at the RPC level. Think of the analogy
to transport protocols, where you can think of the source address as
a primitive form of authentication (and used as such by many network
applications).  Also, implementation-wise, since all of the RPC
applications have the RPC code in common, it is a natural place to put
the other things they have in common, such as authentication.  I don't
feel this is a very important issue though. What's far more important than
layering is how secure and general purpose the authentication system is.
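The layering argument can be made concrete: if authentication lives in
the RPC layer, every service gets it from one dispatch routine instead
of reimplementing it.  A sketch, with invented names throughout (this
is not Sun RPC's actual interface):

```c
/* Invented names throughout -- a sketch of why RPC-level
 * authentication factors out common work.  The dispatcher verifies
 * the caller's credentials once; individual service procedures never
 * see unauthenticated calls. */
typedef int (*rpc_proc)(const char *caller, const char *args);
typedef int (*verifier)(const char *caller);

int rpc_dispatch(verifier verify, rpc_proc proc,
                 const char *caller, const char *args)
{
    if (!verify(caller))
        return -1;              /* common authentication failure path */
    return proc(caller, args);  /* service sees only vetted callers */
}

/* Tiny example verifier and procedure, for demonstration only. */
int allow_all(const char *caller) { (void)caller; return 1; }
int deny_all(const char *caller)  { (void)caller; return 0; }
int null_proc(const char *caller, const char *args)
{
    (void)caller; (void)args;
    return 0;
}
```

Swapping in a stronger verifier (public-key, say) changes nothing in
the service procedures, which is the point of doing it at this layer.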

> As an example of the problems that this
> confusion of layering causes, consider how you'd handle a file system type
> that required secondary authentication, e.g. a Tops-20 like system that
> had both user id's and file "accounts", with perhaps a password associated
> with the account -- seems to me your NFS-level authentication scheme must
> of necessity be specific to the particular type of remote file system if
> you want file security equal to what you have in a local file system.

Nothing in the RPC protocol prevents you from implementing your
TOPS-20 authentication scheme, since RPC can support several authentication
schemes.  We aim for more general purpose authentication systems than this, 
though.

> For comparison, consider Authentication in the Xerox Filing (distributed
> file system) protocol, which is much more robust than anything I've
> seen considered as an NFS extension (but which still has major flaws...).
> See "Authentication Protocol", XSIS 098404, and "Filing Protocol",
> XNSS 108605.

Well, if you like the XEROX protocol so much, then you should like our 
proposal too, because it's pretty similar.  (By the way, the date on the 
paper describing the alternative system is 1986, not 1985 as stated above.)

-----------[000180][next][prev][last][first]----------------------------------------------------
Date:      Wed, 31 Dec 86 13:13:01 PST
From:      Andy Freeman <andy@shasta.stanford.edu>
To:        mod.protocols.tcp-ip
Subject:   Re: Those big, big books

In article <8612310227.AA03918@ucbvax.Berkeley.EDU> you write:
>							 And the publisher:
>should SRI be referred to as Stanford Research Institute, and should the
>name of the DDN be spelled out?
>
>Charles Bacon   (301)496-4823, or, Bldg. 12B, National Insts. of Health,
>                                   Bethesda, Md. 20892
>        or even CRB@NIHCUDEC.bitnet

SRI is not an acronym.  It is short for SRI International.  SRI
was spun off from Stanford more than 15 years ago, and one of the
conditions of the spin-off is that SRI can't use Stanford in its
name.

-andy
-- 
Andy Freeman
UUCP:  ...!decwrl!shasta!andy forwards to
ARPA:  andy@sushi.stanford.edu
(415) 329-1718/723-3088 home/cubicle
-----------[000181][next][prev][last][first]----------------------------------------------------
Date:      Wed, 31-Dec-86 14:24:59 EST
From:      PALLAS@Sushi.Stanford.EDU.UUCP
To:        mod.protocols.tcp-ip
Subject:   No more NFS flames

Perhaps we can all agree on a few things, and move on to something
more productive.  Please note the distinction between concept and
product, which seems to have been the cause of some hot blood.

Sun NFS the concept, supported by Sun RPC the concept, has an
open-ended authentication scheme which allows for any number of
authentication systems, which may be as secure as current technology
allows.

Sun NFS the product, supported by Sun RPC the product, provides only
one authentication system, which is laughably insecure (but not much
less secure than Berkeley's rlogin scheme).  Some improvement in the
security could be achieved by making client machines more secure, but
Sun doesn't currently deliver systems with that improvement.

Sun NFS the product is becoming a {\it de facto} standard, with Sun's
encouragement.  That standard (the product) does not have a
non-trivial authentication system.

Sun NFS the concept supports only the sequence-of-bytes file model.

Sun RPC is not a transport protocol, it is a presentation/session
protocol, if you insist on trying to use the ISO model.  The model
described by Watson in {\it Distributed Systems---Architecture and
Implementation} is better suited to this discussion, in my opinion.
RPC falls into the "service support sublayer" in that model.

Now, the REAL issues are these:

1.  Security is important when connecting autonomous systems.
2.  Heterogeneous systems tend to present different interfaces to
    similar services.

These are not new ideas; they are, in fact, quite ancient.

We should be devoting our efforts to exploring solutions to the
problems that are posed by these issues in the context of computer
systems and networks.  Debating the merits of a particular product is
not productive.

joe

P.S.  If there's a factual error in the above, I'd appreciate being
informed of it.
-------

-----------[000183][next][prev][last][first]----------------------------------------------------
Date:      31 Dec 1986 16:59-EST
From:      CERF@A.ISI.EDU
To:        jbs@EDDIE.MIT.EDU
Cc:        tcp-ip%sri-nic@EDDIE.MIT.EDU, Feinler@SRI-NIC.ARPA
Subject:   Re: Becoming an Internet site

Jeff,

If the company seeks ARPANET access, it needs a government sponsor to
transfer $ to DCA for the access line and ARPANET use charges. The
sponsor has to supply documentation to say what the usage will be.

I believe the DDN-NIC (at SRI) can help you with specific documentation
required or pointer to an individual at DCA to whom such a request is
to be addressed. Try sending to Feinler@SRI-NIC.ARPA for further
information.

Vint Cerf
	
    Received: FROM SRI-NIC.ARPA BY USC-ISI.ARPA WITH TCP ; 30 Dec 86 23:15:38 EST
              from EDDIE.MIT.EDU by SRI-NIC.ARPA with TCP; Tue 30 Dec 86 18:44:46-PST
              from deep-thought.mit.edu by EDDIE.MIT.EDU (5.31/4.7) id AA15384; Tue, 30 Dec 86 21:16:31 EST
    Date: Tue 30 Dec 86 21:28:40-EST
    From: Jeff Siegal <@EDDIE.MIT.EDU:JBS@DEEP-THOUGHT.MIT.EDU>
    To: tcp-ip%sri-nic@EDDIE.MIT.EDU
    Subject: Becoming an Internet site
    Return-Path: <tcp-ip-RELAY@SRI-NIC.ARPA>
    Message-ID: <12267055545.8.JBS@DEEP-THOUGHT.MIT.EDU>
    
    How does one go about getting a net connection?  I'm talking here
    about a company I am familiar with which does data processing and
    software development for the US Gov't and would like to transfer data
    to and from gov't computers, and to communicate with gov't personnel
    via electronic mail.
    
    Please reply directly to jbs@eddie.mit.edu.
    
    Jeff Siegal
    -------
    
              --------------------
		
-----------[000184][next][prev][last][first]----------------------------------------------------
Date:      Thu, 1-Jan-87 00:40:14 EST
From:      @EDDIE.MIT.EDU:ROMKEY@XX.LCS.MIT.EDU (John Romkey)
To:        mod.protocols.tcp-ip
Subject:   Re: Becoming an Internet site

Suppose that a company wants an *Internet* connection, not an actual
arpanet connection (perhaps a nearby University or somesuch is willing
to allow them to connect)? All that's necessary then is just permission
from ARPA, not any $, right?
				- john
-------

-----------[000185][next][prev][last][first]----------------------------------------------------
Date:      Thu, 1-Jan-87 02:36:25 EST
From:      mills@huey.udel.edu.UUCP
To:        mod.protocols.tcp-ip
Subject:   Ask not for whom the chimes tinkle

Folks,

Every year it's the same - I forget UT midnight comes five hours before the
ball drops in Times Square. For an hour and sixteen minutes after the hoot
and holler in Trafalgar Square at least four radiofuzz timetellers still
squawked yesteryear. DCN1, UMD1, FORD1 and NCAR springs have now been
rewound to 1987 and all you guys can forget those whopping disk-usage
refunds. Thanks to Hans-Werner Braun, who reminded me of my annual first
duty of the new year and annual first resolution to figure out how to
avoid paw to keyboard in the absence throughout the world, as far as I know,
of a highly reliable electronic way to find out what year it is.

Dave
-------

END OF DOCUMENT