ARCHIVE: TCP-IP Distribution List - Archives (1983)
DOCUMENT: TCP-IP Distribution List for November 1983 (87 messages, 50082 bytes)
SOURCE: http://securitydigest.org/exec/display?f=tcp-ip/archive/1983/11.txt&t=text/plain
NOTICE: securitydigest.org recognises the rights of all third-party works.
START OF DOCUMENT
-----------[000000]----------------------------------------------------
Date: Tuesday, 1 Nov 1983 15:05-PST
From: greep@SU-DSN
To: Charles Hedrick <HEDRICK@RUTGERS>
Cc: TCP-IP@SRI-NIC, Info-VAX@SRI-CSL
Subject: Re: more Overtime for the Protocol Police
I don't see why a format error in the date field should break anyone's reply command, since the reply command only has to copy the contents of the field, not try to understand it (except maybe for removing leading blanks).
-----------[000001]----------------------------------------------------
Date: Tuesday, 1 Nov 1983 16:14-PST
From: greep@SU-DSN
To: Charles Hedrick <HEDRICK@RUTGERS>
Cc: tcp-ip@SRI-NIC, info-vax@SRI-CSL
Subject: Re: more Overtime for the Protocol Police
Oh. Yes I notice your "in-reply-to" has the time in EST whereas my original was in PST. Actually I think that's more a bug than a feature, since it makes it that much harder for someone to correlate the times (unless his system also does that transformation), but I don't think it's important enough to spend much time discussing.
-----------[000002]----------------------------------------------------
Date: 1 Nov 1983 1813 PST
From: Eric P. Scott <EPS@JPL-VAX>
To: TCP-IP@SRI-NIC
Subject: "Same network" isn't necessary good enough for "jumbograms"
Some of us piggyback logical hosts on our 1822 ports... -=EPS=- ------
-----------[000003]----------------------------------------------------
Date: 1 Nov 83 16:36:34 EST
From: Charles Hedrick <HEDRICK@RUTGERS.ARPA>
To: EPS@JPL-VAX.ARPA
Cc: TCP-IP@SRI-NIC.ARPA, don.provan@CMU-CS-A.ARPA, Info-VAX@SRI-CSL.ARPA
Subject: Re: more Overtime for the Protocol Police
Of course if we really want to get technical, almost everybody sends invalid dates and times. The new RFC's do not use a hyphen between the time and the zone. Most sites use the hyphens in both DATE fields and the MAIL-RECEIVED timestamp. This could break REPLY commands, if software wanted to say
   re: Hedrick's message of 1-Jan-84 0:00 UT
Of course we accept both the old AT and the old hyphen, but as long as we are cleaning things up, we might as well clean up everything.
-------
-----------[000004]----------------------------------------------------
Date: 1 Nov 83 18:44:50 EST
From: Charles Hedrick <HEDRICK@RUTGERS.ARPA>
To: greep@SU-DSN.ARPA
Cc: tcp-ip@SRI-NIC.ARPA, info-vax@SRI-CSL.ARPA
Subject: Re: more Overtime for the Protocol Police
I believe MM actually turns the date into internal date/time format and then puts it back out again. Note the in-reply-to on this message. The date is used for various purposes, such as asking to see all messages since a certain date/time. For some of these purposes, we do actually have to understand the date. So it is reasonable to have the routine that parses headers turn the date/time into internal format.
-------
-----------[000005]----------------------------------------------------
Date: 2 Nov 1983 2109-PST
From: Craig Milo Rogers <ROGERS@USC-ISIB>
To: EPS@JPL-VAX, TCP-IP@SRI-NIC
Subject: Re: "Same network" isn't necessary good enough for "jumbograms"
Logical hosts per se aren't pertinent. However, diversity of IP implementations is pertinent. If your host is connected only to a local network, and all the hosts on that net (except the gateway) are running the same IP software, then jumbograms are probably safe. (I'm assuming that a host is prepared to receive the largest jumbogram that it is prepared to send.) However, once you mix IP implementations on a net, the odds of some host not supporting jumbograms (or supporting sub-maximal jumbograms for the net) increase. On a net as diverse as the ARPANET, jumbograms aren't safe.

So, what we really need are specific guidelines/standards for implementing IP/TCP on top of various networks. It would be nice if we could say, "As of Aug 1984 all ARPANET hosts will support IP packets up to the size limit imposed by the IMP subnet." At the moment, we can't.

Craig Milo Rogers
-------
-----------[000006]----------------------------------------------------
Date: 3 Nov 1983 05:54-PST
From: HAGAN@USC-ISID
To: TCP-IP@SRI-NIC
Cc: Foster@USC-ISID, Gorman@USC-ISID, Hagan@USC-ISID
Subject: TCP-IP Mailing Lists
Bryan Gorman has joined SRI, as one of the new software types
on-site at the Fort Bragg Testbed. Could you please arrange to
put him on all of the pertinent mailing lists relative to TCP-IP
development and the notes to or about the Protocol Police.
<GORMAN>@ISID
Regards,
Doug
-----------[000007]----------------------------------------------------
Date: 3 Nov 1983 0923-PST
From: Craig Milo Rogers <ROGERS@USC-ISIB>
To: David C. Plummer <DCP@MIT-MC>, ROGERS@USC-ISIB
Cc: TCP-IP@SRI-NIC, EPS@JPL-VAX
Subject: Re: "Same network" isn't necessary good enough for "jumbograms"
Reply #1: Modularity

IP deals with per-packet processing: routing, fragmenting, reassembly, etc. It has a notion of time (Time-to-live, reassembly time, ICMP "pinging" in some cases), but basically assumes that the packets it processes are unrelated to each other.

TCP deals with sequences of packets in time. It retransmits data when unacknowledged, manages windows and window updates, calculates smoothed round trip times, and in some implementations probes the remote site to be assured that it is up.

Among the most important properties of a local network (from a practical standpoint) are its delay, capacity, and throughput. The Internet is composed of networks with a wide range of parameters: 1200 bps land lines to 50 Mbit rings to geosynch satellite links with ~1/4 sec delays.

A TCP implementation (usually) deals with this variation by selecting an initial set of retransmission parameters for a connection, and adjusting the parameters based on end-to-end round-trip-time. However, because of the wide dynamic range of the network properties mentioned above, the selection of the initial TCP parameters can be critical. The appropriate initial parameters for an Ethernet are different from those for a 1200 bps dial-up line. The effect of selecting the wrong initial parameters may be relatively benign, such as having unnecessary pauses during the start of a connection. On the other hand, the effect may be disastrous, such as flooding a slow-speed line with unnecessary retransmissions.

So, it is very important that the TCP module get the proper set of initial parameters. In the general case, a path composed of arbitrary networks, it may not be possible to derive the "optimal" set, and a "conservative" set should be used. However, when it is possible to deduce (by comparing source and destination addresses) that the probable path is a single network with well-known properties, then the "optimal" TCP parameters based on those properties may be selected.

IP doesn't help here. There is no IP "estimated delay" or "estimated capacity" option. The IP specification doesn't mention maintaining a table of local network properties for use by higher-level protocols as one of the duties of an IP implementation. Perhaps it should. Certainly it can, in individual implementations. In fact, a good implementation might provide a common database of estimated path properties which may be accessed by TCP, TFTP, and any other potentially interested party.

So, that's why I uttered "TCP" and "networks" in the same breath. Modularity is in the eye of the beholder.

Reply #2: On Mountains and Valleys

Imagine a collection of people who live in rough terrain. Widely scattered, they are lonely for each other's company (or perhaps they just want to trade). Some people tried climbing the mountains to get to their neighbors, while others used the longer but smoother routes through the valleys below. However, the knowledge of how to get from one point to another wasn't widely known, and not all routes were safe. Many people were killed by avalanches, while others wandered into swamps in the valleys.

Eventually a group of people got together to set up a system of roads. It seemed to make sense to put the roads in the valleys, because some people just didn't have the extra energy it takes to climb mountains, and it's far easier to detour around all the swamps than it is to build avalanche shelters in all the passes (to assure year-round passage). So, the roads were built in the valleys, and signs were posted at the intersections, so you could go from any place to any other place based on its address.

However, it takes a long time to wander about in those twisty valleys. Perhaps you have an urgent need to visit your neighbor, and there's only one mountain in between, and you are full of energy and have a good sense of direction. The mountains are pretty safe in the summertime, so why not take the shortcut?

Let's not make it illegal to hike in the mountains. Instead, let's publish guidelines that people can use to estimate whether they are ready for the trip.

Craig Milo Rogers
-------
-----------[000008]----------------------------------------------------
Date: 3 Nov 1983 11:22:24 PST
From: POSTEL@USC-ISIF
To: TCP-IP@SRI-NIC
Subject: Specs for "How to do IP on Net of Type X"
Hi:
I would very much appreciate receiving draft RFCs on "How to do/implement/use IP on networks of type X", where X is Ethernet, ARPANET, or anything. There is only one such memo (you could use it as a model), RFC 877 on doing IP on Public Data Nets.
--jon.
-------
-----------[000009]----------------------------------------------------
Date: Thursday, 3 November 1983 09:45 est
From: DClark@MIT-MULTICS.ARPA
To: JFisher.Help@USGS1-MULTICS.ARPA
Cc: TCP-IP@SRI-NIC.ARPA
Subject: Re: TCP/IP & PR1ME
Folks,
For info on protocols for Prime, you might call Dave Jacobs at
Prime. His number is 617-879-2960 x4113.
Dave
-----------[000010]----------------------------------------------------
Date: Thu, 3 Nov 83 13:13:18 PST
From: Rich Wales <v.wales@UCLA-LOCUS>
To: TCP-IP@SRI-NIC
Subject: Query about "logical" ARPANET hosts
I would like more information about the third ("logical host") byte in
ARPANET addresses. Specifically, a few people seem to be using this
byte either to give multiple identities to a single host (as with
Rutgers' RU-GREEN and RU-BLUE pseudo-hosts) or to allow additional hosts
to appear to be on the ARPANET via a transparent gateway (as with the
hosts sharing a port with SRI-C3P0 and -- apparently -- ISI-PNG11).
On the surface, interfacing a class-C local network to the Internet by
using the third byte of the ARPANET address as a local-network-host
specifier -- together with some special "smarts" in the host that is
actually, physically, attached to the IMP -- seems to me to be a much
more reliable approach than publicizing your local net's set of
addresses, advertising a gateway host, and hoping that people will eventually
update their routing tables accordingly.
However, I have been unable to find any RFC which says anything about
legitimate uses of the "logical host" byte. Can I legally do anything
with this byte? If so, whose permission do I need?
Also, is the third byte going to be more-or-less permanently available
for "logical host" purposes? In particular, when either the ARPANET or
the MILNET grows to more than 256 IMP's, is the third byte going to be
used as an extension of the IMP number? (I would hope not --
especially since a couple of bits could seemingly be taken from the high-order
end of the second or "physical port" byte.)
-- Rich Wales <wales@UCLA-LOCUS>
-----------[000011]----------------------------------------------------
Date: 3 Nov 1983 13:24-PST
From: William "Chops" Westfield <BillW @ SRI-KL>
To: tcp-ip@NIC
Subject: I dont understand...
Isnt IP supposed to take care of fragmenting packets so that TCP doesnt have to worry about such things ? BillW
-----------[000012]----------------------------------------------------
Date: Thu, 3 Nov 83 10:43 EST
From: Charles Hornig <Hornig%SCRC-QUABBIN@MIT-MC.ARPA>
To: tcp-ip@SRI-NIC.ARPA, eps@JPL-VAX.ARPA
Subject: jumbograms
Date: 2 Nov 1983 2109-PST
From: Craig Milo Rogers <ROGERS@USC-ISIB>
Logical hosts per se aren't pertinent. However, diversity of
IP implementations is pertinent. If your host is connected only to
a local network, and all the hosts on that net (except the gateway) are
running the same IP software, then jumbograms are probably safe. (I'm
assuming that a host is prepared to receive the largest jumbogram
that it is prepared to send.) However, once you mix IP implementations
on a net, the odds of some host not supporting jumbograms (or supporting
sub-maximal jumbograms for the net) increase. On a net as diverse as
the ARPANET, jumbograms aren't safe.
So, what we really need are specific guidelines/standards for
implementing IP/TCP on top of various networks. It would be nice if
we could say, "As of Aug 1984 all ARPANET hosts will support IP
packets up to the size limit imposed by the IMP subnet." At the
moment, we can't.
Lets try, though. In particular, I would like it to be adopted as part
of the IP standard on embedding IP in specific transport media what the
maximum IP datagram size on that medium is. If we expect different
implementations to work together, we NEED this. I suggest,
IMP subnet 1008 bytes
10MB Ethernet 1500 bytes
Suggestions for other media are welcome. This will at least give us
something to work towards.
-----------[000013]----------------------------------------------------
Date: 3 November 1983 10:49 EST
From: David C. Plummer <DCP @ MIT-MC>
To: ROGERS @ USC-ISIB
Cc: TCP-IP @ SRI-NIC, EPS @ JPL-VAX
Subject: Re: "Same network" isn't necessary good enough for "jumbograms"
Date: 2 Nov 1983 2109-PST
From: Craig Milo Rogers <ROGERS@USC-ISIB>
So, what we really need are specific guidelines/standards for
implementing IP/TCP on top of various networks. It would be nice if
we could say, "As of Aug 1984 all ARPANET hosts will support IP
packets up to the size limit imposed by the IMP subnet." At the
moment, we can't.
I'm sorry, but I have to disagree. You are suggesting the
ARPANET revert to a special network instead of part of a (little
i) internet, namely, the (big I) Internet. The whole purpose of
Internet was for all hosts to see the world as a uniform
collection of hosts.
There is some validity to what you said, but I don't feel your
reasons were quite right.
By modularity, TCP should know nothing about the local network or
networks that are carrying the packets. There are various
interactions with IP that are intended to avoid fragmentation.
This breaks modularity but exists for practicality. Therefore,
you should not have said TCP in the paragraph I included.
There IS a guideline for packet sizes of IP packets on a given
subnet. "Management" determines the maximum packet size, which
could be affected by hardware limitations (e.g. 512 byte ram
buffers), specification restrictions (e.g. 1500 byte on
Ethernets) or practicality. For the ARPANET "management" currently
says 576 bytes. For an Ethernet per/site "management" could say
1500 (and make sure its gateways do fragmentation to the outside
world) or 576 bytes to avoid the fragmentation problem as much as
possible.
-----------[000014]----------------------------------------------------
Date: 3 Nov 1983 10:57:41 EST (Thursday)
From: Andrew Malis <malis@BBN-UNIX>
To: Charles Hornig <Hornig%SCRC-QUABBIN@MIT-MC.ARPA>
Cc: tcp-ip@SRI-NIC.ARPA, eps@JPL-VAX.ARPA, malis@BBN-UNIX
Subject: Re: jumbograms
Charles,
The limit for the IMP subnet is 1007 bytes, not 1008. The 1822 specification limits the maximum message size, not including the 1822 leader, to 8063 bits. Since IP datagrams must be an integral number of bytes in length, this restricts the maximum datagram length to 1007 bytes.
Andy
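The arithmetic behind the 1007-byte figure can be checked with a short sketch. It assumes only the 8063-bit limit quoted in the message; the function name is made up for illustration.

```python
# Largest whole number of 8-bit bytes that fits in an 1822 message of at
# most 8063 bits (the figure quoted in the message above).
MAX_1822_MESSAGE_BITS = 8063


def max_datagram_bytes(max_bits: int = MAX_1822_MESSAGE_BITS) -> int:
    """Return the largest byte count whose length in bits does not exceed max_bits."""
    return max_bits // 8


print(max_datagram_bytes())  # 1007; a 1008-byte datagram would need 8064 bits
```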
-----------[000015]----------------------------------------------------
Date: Thursday, 3 November 1983 12:45 est
From: JSLove@MIT-MULTICS.ARPA (J. Spencer Love)
To: David C. Plummer <DCP@MIT-MC.ARPA>
Cc: ROGERS@USC-ISIB.ARPA, TCP-IP@SRI-NIC.ARPA, EPS@JPL-VAX.ARPA
Subject: Re: "Same network" isn't necessary good enough for, "jumbograms"
Date: 3 November 1983 10:49 est
From: David C. Plummer <DCP at MIT-MC>
Subject: Re: "Same network" isn't necessary good enough for
"jumbograms"
Date: 2 Nov 1983 2109-PST
From: Craig Milo Rogers <ROGERS@USC-ISIB>
So, what we really need are specific guidelines/standards for
implementing IP/TCP on top of various networks.
There IS a guideline for packet sizes of IP packets on a given
subnet. "Management" determines the maximum packet size, which
could be affected by hardware limitations (e.g. 512 byte ram
buffers), specification restrictions (e.g. 1500 byte on
Ethernets) or practicality. For the ARPANET "management" currently
says 576 bytes. For an Ethernet per/site "management" could say
1500 (and make sure its gateways do fragmentation to the outside
world) or 576 bytes to avoid the fragmentation problem as much as
possible.
"Management" currently says that the default packet size is 576 bytes,
for the whole Internet, in the absence of any other information. That
doesn't mean that you can't send bigger datagrams. The hard limit for
the arpanet is about 1005 bytes. The hard limit for some other networks
is less than 576.
There is no problem with ethernets which can process 1500 byte packets.
Let each host send the TCP packet size option when setting up the
connection, specifying 1460 (data) byte packets. If either side does
not send such an option, then let it be assumed that they specified 536.
Each host should then format its packets taking the min of two numbers:
the packet size it offers, and the packet size the other side offers.
The packet size it offers is presumably based on the hardware limit as
well as any limits having to do with buffer allocation (which are
better handled using window size).
Thus, if MIT-MULTICS sends a TCP SYN to CMU-CS-A packet offering a max
packet size of 1005, and CMU-CS-A sends back a packet from behind its
gateway specifying a max packet size of 300, then 300 bytes is the max
packet size in both directions. This solves the problem for all
same-net connections between hosts that implement the TCP packet size
option, and makes a start on the other cases.
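The rule described above (each side takes the min of its own offer and the other side's offer, assuming 536 when no option arrives) might be sketched as follows. The function and constant names are illustrative assumptions, not taken from any TCP implementation.

```python
# Value assumed when the other side sends no packet size option, per the
# message above (576-byte default datagram minus 40 bytes of headers).
DEFAULT_OFFER = 536


def effective_packet_size(our_offer, peer_offer=None):
    """Return the size to use: the min of our offer and the peer's offer.

    peer_offer is None when the peer sent no option; it is then treated
    as if the peer had specified DEFAULT_OFFER.
    """
    if peer_offer is None:
        peer_offer = DEFAULT_OFFER
    return min(our_offer, peer_offer)


# MIT-MULTICS offers 1005, CMU-CS-A answers 300: both directions use 300.
print(effective_packet_size(1005, 300))   # 300
# Two Ethernet hosts that both offer 1460 data bytes.
print(effective_packet_size(1460, 1460))  # 1460
# The peer stays silent, so the default applies.
print(effective_packet_size(1460))        # 536
```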
If two networks permitting jumbograms are connected by a third network
which doesn't, then this simplistic approach will fail. In this case,
the IP layer must be consulted to find out if the max packet size is
1005, 576, or some other number. If the IP layer doesn't know, then it
should tell the TCP layer that 576 is the limit (in this sense, the
preceding example was contrived, since IP probably wouldn't know that
CMU-CS-A was only two hops away).
As TCP/IP is currently defined, networks that can't accept 576 byte
datagrams are in violation of the standard (albeit in a minor way) and
thus should implement the TCP max packet size option to keep the rest of
the world from exercising their (possibly nonexistent) fragment
reassembly algorithms.
-----------[000016]----------------------------------------------------
Date: 3 Nov 1983 15:49:09 EST (Thursday)
From: Jack Haverty <haverty@BBN-UNIX>
To: JSLove@MIT-MULTICS.ARPA (J. Spencer Love)
Cc: David C. Plummer <DCP@MIT-MC.ARPA>, ROGERS@USC-ISIB.ARPA, TCP-IP@SRI-NIC.ARPA, EPS@JPL-VAX.ARPA
Subject: Re: "Same network" isn't necessary good enough for, "jumbograms"
I think:

1/ a gateway or host which doesn't implement fragmentation or reassembly respectively is simply not conforming to the specification. Those functions are not optional.

2/ Given #1, all a host need do is:
   a/ not try to send anything to its attached network bigger than that network's max packet size;
   b/ not try to send anything bigger than 576 (or whatever that number is exactly), without confirming that the receiver will accept bigger things based on a TCP negotiation
   c/ not try to send anything bigger than some number < 576 which has been obtained via the TCP option from the other end.

3/ There is no need to couple the send and receive parameters.

4/ Gateways must be able to accept packets from their attached networks which are the maximum packet size for that network technology (although for some 'technologies' like raw wires that doesn't work). Nets with smaller max sizes than 576 are perfectly legal.

Jack
-----------[000017]----------------------------------------------------
Date: 3 November 1983 18:38 EST
From: David C. Plummer <DCP @ MIT-MC>
To: JSLove @ MIT-MULTICS, BILLW @ SRI-KL, Haverty @ BBN-UNIX, ROGERS @ USC-ISIB
Cc: tcp-ip @ SRI-NIC
Subject: What is modular, anyway?
Sorry, the primary recipients got this twice.
------------------------------
Date: 3 Nov 1983 13:24-PST
From: William "Chops" Westfield <BillW @ SRI-KL>
Isnt IP supposed to take care of fragmenting packets so that TCP
doesnt have to worry about such things ?
Yes. See more below.
------------------------------
Date: 3 Nov 1983 0923-PST
From: Craig Milo Rogers <ROGERS@USC-ISIB>
Reply #1: Modularity
I agree with nearly everything you said here. Your previous
messages could have easily led to gross modularity violations if
the reader was not careful. (I'm paranoid that there are several
non-careful readers.) What you said, I think, is that for a
protocol implementation (e.g. TCP) to be practical over a
transport layer (e.g. IP), communication may be desirable (maybe
even necessary) between the two layers. Communication is FAR
different than assumptions. Assumptions are not modular,
communication is. We may actually be in 99% agreement on what
"modularity" is when applied to the IP/TCP world.
Reply #2: On Mountains and Valleys
Amusing. I think I know what you were saying, but it doesn't
propose any solutions.
------------------------------
Date: 3 Nov 1983 15:49:09 EST (Thursday)
From: Jack Haverty <haverty@BBN-UNIX>
1/ a gateway or host which doesn't implement fragmentation or reassembly
respectively is simply not conforming to the specification. Those functions
are not optional.
Correct!!
2/ Given #1, all a host need do is:
a/ not try to send anything to its attached network bigger than
that network's max packet size;
No!
b/ not try to send anything bigger than 576 (or whatever that number
is exactly), without confirming that the receiver will accept bigger
things based on a TCP negotiation
No!!
c/ not try to send anything bigger than some number < 576 which has
been obtained via the TCP option from the other end.
No!!! You are confusing TCP with IP which is what Craig and I are
having a little discussion about. Your point 1 deals with IP,
which has nothing to do with the issues of point 2, which deals
with TCP. It is perfectly valid for my implementation of TCP (on
both sides) to send a max segment size of 5000 bytes. It is
further the responsibility of my IP implementation to notice if a
packet this large cannot fit over the local network. If it
cannot fit, it must perform fragmentation. The foreign host
assembles the IP fragments, sends the packet up to TCP which then
handles the 5000 byte TCP segment. This is completely valid,
probably inefficient, probably impractical, and REQUIRES NO
KNOWLEDGE of local network packet sizes or medium types (it could
be 1200 baud). It does not require the information, but
communication with the transport layer (IP) would greatly improve
efficiency.
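To make the 5000-byte example concrete, here is a rough sketch of how an IP layer could split one large datagram into fragments for a smaller network. It assumes a 20-byte IP header with no options and treats the 5000 bytes as the IP payload. The 576-byte network limit is just an example value, and the function name is hypothetical.

```python
IP_HEADER_LEN = 20  # bytes; assumes no IP options are present


def fragment(payload_len, max_packet, header_len=IP_HEADER_LEN):
    """Split an IP payload into fragments that fit in max_packet bytes.

    Returns a list of (offset_in_8_byte_units, data_len, more_fragments)
    tuples. Every fragment except the last carries a multiple of 8 data
    bytes, because the IP fragment offset field counts 8-byte units.
    """
    max_data = (max_packet - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("max_packet too small to carry any data")
    fragments = []
    offset = 0
    while offset < payload_len:
        data_len = min(max_data, payload_len - offset)
        more = offset + data_len < payload_len
        fragments.append((offset // 8, data_len, more))
        offset += data_len
    return fragments


frags = fragment(5000, 576)
print(len(frags))   # 10 fragments
print(frags[0])     # (0, 552, True)
print(frags[-1])    # (621, 32, False)
```

The sending TCP never sees any of this; the receiving IP reassembles the fragments and hands the whole segment up.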
4/ Gateways must be able to accept packets from their attached networks
which are the maximum packet size for that network technology (although
for some 'technologies' like raw wires that doesn't work). Nets with
smaller max sizes than 576 are perfectly legal.
Again, NO! Maximum packet size on a particular (sub)network is
determined by management. This actually has nothing to do with
transport layers (e.g. IP)! The implementation of interfaces
should be sufficiently flexible to be able to tell each transport
layer (e.g. IP) that asks it what the maximum segment size is (a
"site variable" determined by management) for the particular
(sub)net to which the interface is attached. IP would ask the
interfaces this in order to determine when it must fragment over
the various interfaces. It must also be prepared to accept each
of these sizes over each of the interfaces. Therefore, it can
take the MIN for packets being transmitted, but it must be
prepared to receive the differing size maximums.
Note a few things. Ethernets at different sites could have
different IP max packet sizes. Different Ethernets at the same
site could have different IP max packet sizes!! This could
happen if the site manager puts small address space machines
which have little buffering capacity (e.g. PDP-11s) on one
Ethernet and large address space machines (just about everything
else) on the other Ethernet. It is important that the IP max
packet size for a network be a property of the INTERFACE, not the
medium type. This is nearly imperative for different
implementations to be able to use large packets on a medium that
can handle them. Management sets the numbers. When a protocol
is installed on a machine, management tells the installer what
the local site parameters are. The installer sets the
appropriate variables and constants in the interfaces, a process
which is dependent upon the implementation of the interfaces and
operating system.
------------------------------
Date: Thursday, 3 November 1983 12:45 est
From: JSLove@MIT-MULTICS.ARPA (J. Spencer Love)
There is no problem with ethernets which can process 1500 byte packets.
Let each host send the TCP packet size option when setting up the
connection, specifying 1460 (data) byte packets. If either side does
not send such an option, then let it be assumed that they specified 536.
Each host should then format its packets taking the min of two numbers:
the packet size it offers, and the packet size the other side offers.
The packet size it offers is presumably based on the hardware limit as
well as any limits having to do with buffer allocation (which are
better handled using window size).
Right idea, wrong modularity. I think it should go something
like this: (Note that IP should really be "the transport layer"
and TCP should be "the protocol layer", but I leave it as IP and
TCP for the purpose of example)
IP asks the interfaces what the max IP packet size is for
the interfaces.
IP slowly gathers routing information.
TCP tells IP that it wants to talk to host FOO
IP /guesses/ the route to host FOO and caches this.
TCP asks IP how big a packet/segment it should deliver to IP
when talking to FOO before efficiency (e.g.
fragmentation) would degrade service.
IP returns a value to TCP based on the guessed route. Unless
the route is highly dynamic, the guess will usually be
right and accidental fragmentation will not occur. If
you really want to get hairy (and depending on the
implementation) TCP always asks IP for the packet to use,
and IP only lets TCP use as many bytes as the currently
determined route should use.
TCP mins this with the number of bytes it is most comfortable
receiving, and uses this as the max segment size. If it
wishes, it can further min this with the max segment size
of the foreign implementation, since it may know
something about the route that you don't.
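The sequence above could be sketched roughly as below. Everything here is illustrative: the class and function names are invented, the 20-byte header sizes assume no IP or TCP options, and the fallback of 576 bytes when no route is known is borrowed from the earlier messages in this thread, not from the steps above.

```python
IP_HEADER = 20             # bytes, assuming no IP options
TCP_HEADER = 20            # bytes, assuming no TCP options
CONSERVATIVE_PACKET = 576  # assumed fallback when IP has no route to guess from


class IPLayer:
    def __init__(self, interface_max_sizes):
        # IP asks the interfaces what their max IP packet sizes are.
        self.interface_max = dict(interface_max_sizes)
        self.routes = {}       # host -> interface name, gathered over time
        self.route_cache = {}  # host -> guessed interface name (or None)

    def learn_route(self, host, interface):
        """Record routing information and drop any stale guess."""
        self.routes[host] = interface
        self.route_cache.pop(host, None)

    def guess_route(self, host):
        """Guess the route to host and cache the guess."""
        if host not in self.route_cache:
            self.route_cache[host] = self.routes.get(host)
        return self.route_cache[host]

    def suggested_packet_size(self, host):
        """Largest IP packet that should avoid fragmentation on the guessed route."""
        interface = self.guess_route(host)
        if interface is None:
            return CONSERVATIVE_PACKET
        return self.interface_max[interface]


def tcp_max_segment(ip, host, comfortable_receive, peer_mss=None):
    """TCP mins IP's answer with its own comfort level and, optionally, the peer's."""
    size = ip.suggested_packet_size(host) - IP_HEADER - TCP_HEADER
    size = min(size, comfortable_receive)
    if peer_mss is not None:
        size = min(size, peer_mss)
    return size


ip = IPLayer({"ether0": 1500, "imp0": 1007})
ip.learn_route("FOO", "ether0")
print(tcp_max_segment(ip, "FOO", comfortable_receive=2048))                # 1460
print(tcp_max_segment(ip, "BAR", comfortable_receive=2048))                # 536
print(tcp_max_segment(ip, "FOO", comfortable_receive=2048, peer_mss=300))  # 300
```

TCP holds no table of network types in this sketch; all it knows about the path comes from asking IP, which is the communication-versus-assumption distinction argued for earlier in the message.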
-----------[000018]----------------------------------------------------
Date: 3 Nov 1983 1919-EST (Thursday)
From: Christopher A Kent <cak@PURDUE.ARPA>
To: Rich Wales <v.wales@UCLA-LOCUS.ARPA>
Cc: TCP-IP@SRI-NIC.ARPA
Subject: Re: Query about "logical" ARPANET hosts
When we first got our TCP up, and it was clear that we needed some extended addressing for our proNET, I thought about this, too; before making such a (for me) momentous decision, I, too, went to others for their opinions. When I suggested it to Rob Gurwitz, I recall that he said something to the effect of "It's a kludge, and should go away. Don't even think it." The "vox Internet" was largely in the same vein. Because of this, I built in-host gateway code for the BBN implementation of TCP/IP for the VAX. It would seem that the uses of the logical host octet are largely undocumented; the only one that I am sure of is the SRI Port Expander black box, which provides some of the "extra smarts" you allude to. I have heard mutterings about these boxes, too -- to the effect of "they cause more troubles than they solve" (admittedly secondhand information). I personally feel that going the gateway route is much more in the spirit of the Internet concept; I also feel it is more flexible. We ran for about six months using my special code in one of our Vaxen as gateway to our class C net; then the use of that machine increased, as did the traffic on our net, so we dedicated an 11/34 to do gatewaying duties, using off-the-shelf code. We also realized that we were going to need more address space than a class C network provided, so we switched to a class B number. Had I been running logical host decoding, I would have been totally at sea; as it was, the effort was quite small. In short, I now agree with Rob's assessment. It might seem like a way to do it, but in the long run, doing it the "right" way pays off. Cheers, chris ----------
-----------[000019]----------------------------------------------------
Date: 3 Nov 1983 19:37:55 EST (Thursday)
From: Mike Brescia <brescia@BBN-UNIX>
To: Rich Wales <v.wales@UCLA-LOCUS>
Cc: TCP-ip@nic, brescia@BBN-UNIX
Subject: Re: Query about "logical" ARPANET hosts
IEN-115, 'Address Mappings' dated August '79 outlines the alignment between local net addresses and the 'rest' part of the IP address (that part beyond the 'net' field). I don't have the 'Transition Notebook' with me, but I think that section is included. Nets which have smaller local net address fields than the 3 or 2 bytes in class A or B net addresses have an inherent logical address capability. There was in the past a statement by a usually reliable source that the ARPANET would not expand beyond 255 imps, so the 'third byte' is used by having the routine which maps the IP address (rest) into the arpanet address set the high order imp byte to zero. This code is in the core gateways, among (many) other implementations. Mike Brescia
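A minimal sketch of the kind of mapping routine described above, under stated assumptions: the ARPANET is class A net 10, the second byte of the IP address is the physical host port, the third is the "logical host" byte, the fourth is the IMP number, and the local address carries an 8-bit host and a 16-bit IMP field. The function name is hypothetical.

```python
ARPANET_NET = 10  # class A network number of the ARPANET


def to_arpanet_address(ip_address):
    """Map a dotted-decimal ARPANET IP address to (host_port, imp_number).

    The third ("logical host") byte plays no part in the mapping, and the
    IMP number is the low byte of a 16-bit field whose high byte is zero.
    """
    octets = [int(part) for part in ip_address.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError("expected four octets in the range 0-255")
    net, host, _logical_host, imp = octets
    if net != ARPANET_NET:
        raise ValueError("not an ARPANET (net 10) address")
    imp_16bit = (0 << 8) | imp  # high-order IMP byte set to zero
    return host, imp_16bit


# Two addresses that differ only in the logical host byte reach the same
# physical port on the same IMP.
print(to_arpanet_address("10.1.0.11"))  # (1, 11)
print(to_arpanet_address("10.1.5.11"))  # (1, 11)
```

Because the third byte is dropped before the packet reaches the IMP, any use made of it has to be implemented by whatever is attached to that port.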
-----------[000020]----------------------------------------------------
Date: 3 November 1983 21:34 EST
From: David C. Plummer <DCP @ MIT-MC>
To: brescia @ BBN-UNIX
Cc: TCP-ip @ SRI-NIC, v.wales @ UCLA-LOCUS
Subject: Re: Query about "logical" ARPANET hosts
Date: 3 Nov 1983 19:37:55 EST (Thursday)
From: Mike Brescia <brescia@BBN-UNIX>
IEN-115, 'Address Mappings' dated August '79 outlines the alignment
between local net addresses and the 'rest' part of the IP address (that
part beyond the 'net' field). I don't have the 'Transition Notebook'
with me, but I think that section is included.
August '79 was before my dealings with networks (any type), but
I'll bet there wasn't even such a thing as class B or C
addresses. Therefore, I wouldn't trust ANYTHING from that era
which mentions host address formats.
-----------[000021]----------------------------------------------------
Date: 3 Nov 83 22:37:02 EST
From: Charles Hedrick <HEDRICK@RUTGERS.ARPA>
To: v.wales@UCLA-LOCUS.ARPA
Cc: TCP-IP@SRI-NIC.ARPA
Subject: Re: Query about "logical" ARPANET hosts
Please note that Rutgers does not use the logical host number for gatewaying. We use it strictly to simplify the relaying of incoming mail. I point this out because we have not asked DCA to authorize us to have a gateway, and I would not want anyone to think we are sneaking around the requirement for approval. However the use suggested by Wales had occurred to me, and it sounded like a good idea.
-------
-----------[000022][next][prev][last][first]---------------------------------------------------- Date: Fri, 4 Nov 83 11:40 EST From: Charles Hornig <Hornig%SCRC-QUABBIN@MIT-MC.ARPA> To: TCP-IP@SRI-NIC.ARPA, Postel@USC-ISIF.ARPA, networks%SCRC-TENEX@MIT-MC.ARPA Subject: IP on Ethernet draft RFC
(Here is a draft RFC for IP in Ethernet. Comments?)
Charles Hornig
Symbolics Cambridge Research Center
November 1983
A Standard for the Transmission of IP Datagrams Over Ethernet Networks
This RFC specifies a standard method of encapsulating Internet Datagram
Protocol (IP)[1] datagrams on the Ethernet[2].
Frame Format
IP packets are transmitted in standard Ethernet frames. The type field
of the Ethernet frame must contain the value X'0800'. The data field
contains the IP header followed immediately by the IP data. If
necessary, the data field may be padded to meet the Ethernet minimum
frame size. This padding is not part of the IP packet and is not
included in the total length stored in the IP header.
The maximum length of an IP packet sent over an Ethernet is 1500 octets.
Implementations are encouraged to support full-length packets. Gateways
MUST be prepared to accept full-length packets and fragment them if
necessary. If a system cannot receive full-length packets, it should
take steps to discourage others from sending them, such as using the TCP
Maximum Segment Size option.
Note: Packets on the Ethernet may be longer than the general Internet
default maximum packet size of 576 octets. Hosts connected to an
Ethernet should keep this in mind when sending packets to hosts not on
the same Ethernet. It may be appropriate to send smaller packets to
avoid unnecessary fragmentation at intermediate gateways.
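The frame format above can be sketched in a few lines of Python. This is only an illustration, not part of the draft: the function names are hypothetical, and the 46-octet minimum data field is an assumption taken from the usual 64-octet minimum Ethernet frame less the 14-octet header and 4-octet CRC. The receiver strips padding by trusting the total length stored in the IP header, as the draft implies.

```python
import struct

ETHERTYPE_IP = 0x0800   # type field value given in the draft
MAX_IP_LEN = 1500       # maximum IP packet length on an Ethernet (from the draft)
MIN_DATA_LEN = 46       # assumption: minimum Ethernet data field size


def encapsulate(dst_mac: bytes, src_mac: bytes, ip_packet: bytes) -> bytes:
    """Build an Ethernet frame (without CRC) carrying one IP packet."""
    if len(dst_mac) != 6 or len(src_mac) != 6:
        raise ValueError("Ethernet addresses are 48 bits (6 octets)")
    if len(ip_packet) > MAX_IP_LEN:
        raise ValueError("IP packet too long for one Ethernet frame")
    header = dst_mac + src_mac + struct.pack("!H", ETHERTYPE_IP)
    # Padding is not part of the IP packet and is not counted in its length.
    padding = b"\x00" * max(0, MIN_DATA_LEN - len(ip_packet))
    return header + ip_packet + padding


def decapsulate(frame: bytes) -> bytes:
    """Recover the IP packet, discarding any Ethernet padding."""
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_IP:
        raise ValueError("not an IP frame")
    data = frame[14:]
    # Total length is the 16-bit field at octets 2-3 of the IP header.
    (total_length,) = struct.unpack("!H", data[2:4])
    return data[:total_length]
```

A bare 20-octet IP header is shorter than the assumed minimum data field, so it is padded on the way out and trimmed again on the way in.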
Address Mappings
Mappings between 32-bit Internet addresses and 48-bit Ethernet addresses
are accomplished through the Address Resolution Protocol[3]. Internet
addresses are assigned arbitrarily on some class B or C network. Each
host's implementation must know its own Internet address and respond to
Ethernet Address Resolution packets appropriately. It should also use
the protocol to translate Internet addresses to Ethernet addresses when
needed.
Trailer Formats
Some versions of Unix 4.2bsd use a different encapsulation method in
order to get better network performance with the VAX virtual memory
architecture. Consenting systems on the same Ethernet may use this
format between themselves. No host is required to implement it, and no
datagrams in this format should be sent to any host unless the sender
has positive knowledge that the recipient will be able to interpret
them.
References
[1] Information Sciences Institute, Internet Protocol. ARPA Network
Information Center RFC 791. September 1981.
[2] Digital Equipment Corporation, Intel Corporation, Xerox Corporation.
The Ethernet. Version 1.0. September 30, 1980.
[3] Plummer, David C., An Ethernet Address Resolution Protocol. ARPA
Network Information Center RFC 826. November 1982.
-----------[000023][next][prev][last][first]---------------------------------------------------- Date: Friday, 4 November 1983 12:42:30 EST From: Mike.Accetta@CMU-CS-IUS To: Charles Hornig <Hornig%SCRC-QUABBIN@MIT-MC.ARPA> Cc: TCP-IP@SRI-NIC.ARPA, Postel@USC-ISIF.ARPA, networks%SCRC-TENEX@MIT-MC.ARPA Subject: Re: IP on Ethernet draft RFC
Charles, Your draft agrees with the standards which have been adopted for use at CMU (i.e. the "straight-forward" encapsulation and use of ARP). I'm curious as to why Class A networks are explicitly excluded from the address mapping, though. - Mike
-----------[000024][next][prev][last][first]---------------------------------------------------- Date: Fri, 4 Nov 83 13:04 EST From: Charles Hornig <Hornig%SCRC-QUABBIN@MIT-MC.ARPA> To: Mike.Accetta@CMU-CS-IUS.ARPA Cc: TCP-IP@SRI-NIC.ARPA, Postel@USC-ISIF.ARPA Subject: Re: IP on Ethernet draft RFC
Date: Friday, 4 November 1983 12:42:30 EST
From: Mike.Accetta@CMU-CS-IUS
Your draft agrees with the standards which have been adopted for use
at CMU (i.e. the "straight-forward" encapsulation and use of ARP). I'm
curious as to why Class A networks are explicitly excluded from the
address mapping, though.
Since there can only be 1024 interfaces on an Ethernet, I didn't see the
need to ever use a class A network number for one. If someone has a
legitimate application, I have no objection.
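To put the 1024-interface figure next to the address classes, here is a small Python sketch. The function and table names are hypothetical; the class boundaries are the leading-bit rules from RFC 791, and the sample addresses are arbitrary.

```python
# Host-field width in bits for each class, per RFC 791.
HOST_BITS = {"A": 24, "B": 16, "C": 8}


def address_class(addr: str) -> str:
    """Classify a dotted-decimal Internet address by its leading bits."""
    first = int(addr.split(".")[0])
    if first < 128:      # leading bit 0
        return "A"
    if first < 192:      # leading bits 10
        return "B"
    if first < 224:      # leading bits 110
        return "C"
    return "other"


def host_field_size(cls: str) -> int:
    """Number of distinct values the host ('rest') field can take."""
    return 2 ** HOST_BITS[cls]


for sample in ("10.1.0.5", "128.2.0.1", "192.5.1.1"):
    cls = address_class(sample)
    print(sample, "class", cls, "host field size", host_field_size(cls))
```

A class B host field already exceeds 1024 values, which is the arithmetic behind the remark that a single Ethernet has no need of a class A number.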
-----------[000025][next][prev][last][first]---------------------------------------------------- Date: Friday, 4 November 1983, 13:04-EST From: "David C. Plummer" <DCP%SCRC-TENEX at MIT-MC.ARPA> To: Hornig%SCRC-QUABBIN at MIT-MC.ARPA, TCP-IP at SRI-NIC, Postel at USC-ISIF, networks%SCRC-TENEX at MIT-MC.ARPA Subject: IP on Ethernet draft RFC
What this boils down to is a choice between the following two options:
(1) Define one global maximum for a medium (1500 IP bytes for the
Ethernet), or
(2) Allow per cable maximums, set by the manager of the medium.
An advantage for each:
(1) Uniform IP implementations on the Ethernet.
(2) Allows a manager to set a smaller limit for the benefit of small
address space machines (e.g., IBM PC, PDP-11s). The IBM PC,
however, will likely limit the IP packet size by limiting the TCP
segment size.
A disadvantage for each:
(1) Requires all gateways to fragment (I know of at least one that
doesn't).
(2) Implementors must make setting the maximum an easy part of site
installation. Code modularity may not be quite right to easily
allow this.
Personally, I think (2) is the *right thing*, but would probably agree
that (1) is easier to implement and more practical.
Note that similar RFCs are going to be needed for ARPANET, proNET, etc.
-----------[000026][next][prev][last][first]---------------------------------------------------- Date: Friday, 4 November 1983 16:58:51 EST From: Mike.Accetta@CMU-CS-IUS To: Charles Hornig <Hornig%SCRC-QUABBIN@MIT-MC.ARPA> Cc: TCP-IP@SRI-NIC.ARPA Subject: Re: IP on Ethernet draft RFC
Charles, One possible reason for not arbitrarily disallowing class A networks is to permit logical Class A IP networks to be constructed from more than one cable (some of which may not even be ethernets). In our case for example, addresses on CMU's Class B IP network actually span around 7 different physical cables including two 3Mb and three 10Mb ethernets among others. - Mike
-----------[000027][next][prev][last][first]---------------------------------------------------- Date: 5 Nov 1983 1532-EST (Saturday) From: Christopher A Kent <cak@PURDUE.ARPA> To: Charles Hornig <Hornig%SCRC-QUABBIN@MIT-MC.ARPA> Cc: TCP-IP@SRI-NIC.ARPA, Postel@USC-ISIF.ARPA, networks%SCRC-TENEX@MIT-MC.ARPA Subject: Re: IP on Ethernet draft RFC
It would be nice if you could document the trailer encapsulation that Berkeley provides. There are going to be a lot of 4.2 systems out there soon, and there might even be manufacturers that want to be compatible with it. Mike Karels <karels@ucbarpa> or Sam Leffler <sam@ucbarpa> should be able to help out with this. Does anyone else feel the need for an encapsulation negotiation protocol, so that decisions about whether or not to use trailers (for example) can be handled well? How is one supposed to get the "positive knowledge" that trailers are supported? I've been wondering if ARP responses couldn't be extended to say "my address is foo, and, by the way, I speak trailers". Comments? Cheers, chris ----------
-----------[000028][next][prev][last][first]---------------------------------------------------- Date: Monday, 7 Nov 1983 09:09-PST From: obrien@rand-unix To: Charles Hornig <Hornig%SCRC-QUABBIN@MIT-MC> Cc: Rob Gurwitz <gurwitz@BBN-VAX>, Christopher A Kent <cak@PURDUE>, TCP-IP@SRI-NIC, Postel@USC-ISIF Subject: Re: IP on Ethernet draft RFC
I had understood that the performance gain was more than moderate; was, in fact, measured in orders of magnitude or something. Clearly there's a tradeoff. Just how much does trailer protocol buy you?
-----------[000029][next][prev][last][first]---------------------------------------------------- Date: Mon 7 Nov 83 09:28:00-PST From: Mathis@SRI-KL.ARPA To: Hornig%SCRC-QUABBIN@MIT-MC.ARPA Cc: TCP-IP@SRI-NIC.ARPA, Postel@USC-ISIF.ARPA, networks%SCRC-TENEX@MIT-MC.ARPA, Mathis@SRI-KL.ARPA Subject: Re: IP on Ethernet draft RFC
The "how to attach to net X" document also needs to say something about how a host "finds" gateways (also needs to talk about "finding" a name server but that is a separate mailing list). In the case of Ethernet, how is this done? For example, the implicit rules for attaching to the ARPANET require hosts to maintain a partial table of gateway addresses since the ARPANET itself will currently give no help to a host in trying to find a gateway. In the PRNET, gateways are always assigned to logical addresses B or C. For Ethernet, ARP could be used to not only translate local addresses into 48 bit Ethernet addresses but also non-local addresses into 48 bit Ethernet addresses that just happen to be gateways. Do gateways currently do this? -------
-----------[000030][next][prev][last][first]---------------------------------------------------- Date: Mon, 7 Nov 83 9:23:27 EST From: Rob Gurwitz <gurwitz@bbn-vax> To: Christopher A Kent <cak@PURDUE.ARPA> Cc: Charles Hornig <Hornig%SCRC-QUABBIN@MIT-MC.ARPA>, TCP-IP@SRI-NIC.ARPA, Postel@USC-ISIF.ARPA, networks%SCRC-TENEX@MIT-MC.ARPA Subject: Re: IP on Ethernet draft RFC
Yes. Trailers are a kludge and a loss and even some from Xerox admit it. I think there's an IFDEF that turns them off.
-----------[000031][next][prev][last][first]---------------------------------------------------- Date: Mon, 7 Nov 83 10:36 EST From: Charles Hornig <Hornig%SCRC-QUABBIN@MIT-MC.ARPA> To: karels@ucbarpa, sam@ucbarpa Cc: Christopher A Kent <cak@PURDUE.ARPA>, TCP-IP@SRI-NIC.ARPA, Postel@USC-ISIF.ARPA Subject: Re: IP on Ethernet draft RFC
From: Christopher A Kent <cak@PURDUE.ARPA>
Date: 5 Nov 1983 1532-EST (Saturday)
<831104114041.6.Hornig@QUABBIN.SCRC.Symbolics>
It would be nice if you could document the trailer encapsulation that
Berkeley provides. There are going to be a lot of 4.2 systems out there
soon, and there might even be manufacturers that want to be compatible
with it. Mike Karels <karels@ucbarpa> or Sam Leffler <sam@ucbarpa>
should be able to help out with this.
Does anyone else feel the need for an encapsulation negotiation
protocol, so that decisions about whether or not to use trailers (for
example) can be handled well? How is one supposed to get the "positive
knowledge" that trailers are supported? I've been wondering if ARP
responses couldn't be extended to say "my address is foo, and, by the
way, I speak trailers". Comments?
Cheers,
chris
----------
Could you help out with this? I need a formal specification of the
trailer format.
-----------[000032][next][prev][last][first]---------------------------------------------------- Date: Mon, 7 Nov 83 10:40 EST From: Charles Hornig <Hornig%SCRC-QUABBIN@MIT-MC.ARPA> To: Rob Gurwitz <gurwitz@BBN-VAX.ARPA>, Christopher A Kent <cak@PURDUE.ARPA> Cc: Charles Hornig <Hornig%SCRC-QUABBIN@MIT-MC.ARPA>, TCP-IP@SRI-NIC.ARPA, Postel@USC-ISIF.ARPA Subject: Re: IP on Ethernet draft RFC
Date: Mon, 7 Nov 83 9:23:27 EST
From: Rob Gurwitz <gurwitz@bbn-vax>
Yes. Trailers are a kludge and a loss and even some from Xerox
admit it. I think there's an IFDEF that turns them off.
I think that it is probably a good idea to document trailers in an
appendix to the document. I also think, though, that their presence
adds a lot of gratuitous complexity to the standard for a moderate
performance gain on only one particular network implementation. How
strongly should we discourage their use?
-----------[000033][next][prev][last][first]---------------------------------------------------- Date: Mon, 7 Nov 83 10:46 EST From: Charles Hornig <Hornig%SCRC-QUABBIN@MIT-MC.ARPA> To: Mike.Accetta@CMU-CS-IUS.ARPA Cc: TCP-IP@SRI-NIC.ARPA Subject: Re: IP on Ethernet draft RFC
Date: Friday, 4 November 1983 16:58:51 EST
From: Mike.Accetta@CMU-CS-IUS
Charles,
One possible reason for not arbitrarily disallowing class A networks
is to permit logical Class A IP networks to be constructed from more
than one cable (some of which may not even be ethernets). In our case
for example, addresses on CMU's Class B IP network actually span around
7 different physical cables including two 3Mb and three 10Mb ethernets
among others.
Agreed. Change made.
-----------[000034][next][prev][last][first]---------------------------------------------------- Date: Mon, 7 Nov 83 11:04:35 EST From: Rob Gurwitz <gurwitz@bbn-vax> To: Charles Hornig <Hornig%SCRC-QUABBIN@MIT-MC.ARPA> Cc: Rob Gurwitz <gurwitz@BBN-VAX.ARPA>, Christopher A Kent <cak@PURDUE.ARPA>, TCP-IP@SRI-NIC.ARPA, Postel@USC-ISIF.ARPA Subject: Re: IP on Ethernet draft RFC
I think they should be strongly discouraged. As you point out, they are of moderate benefit in a very narrow environment and probably cost more in implementation complexity (both inside and between hosts, i.e. inventing an option negotiation) than they are worth. Now I'm sure there are people who would argue the other side vehemently, but if you're talking about trying to set a standard for people to conform to, it's appropriate to take strong stands. On another issue, I take exception to some of the strong modularity issues that have been discussed recently. They seem to imply that performance should be sacrificed for some abstract idea of modularity. I am more of a pragmatist than that. Sure, we should have well defined boundaries and parameterized implementations for flexibility. On the other hand, the good implementer will look for ways of taking advantage of all the information he can glean from various levels. That may mean that TCP can find out what segment sizes it can send on a net without causing IP fragmentation. I see nothing wrong with it as long as it is not made specific to one net, internet, or implementation. After all, IP when making a routing "guess" is taking advantage of information it gets or infers from the local net layer.
-----------[000035][next][prev][last][first]---------------------------------------------------- Date: 7 Nov 1983 11:06:57 EST (Monday) From: Jonathan Dreyer <jdreyer@BBN-UNIX> To: Charles Hornig <Hornig%SCRC-QUABBIN@MIT-MC.ARPA> Cc: Rob Gurwitz <gurwitz@BBN-VAX.ARPA>, Christopher A Kent <cak@PURDUE.ARPA>, TCP-IP@SRI-NIC.ARPA, Postel@USC-ISIF.ARPA Subject: Re: IP on Ethernet draft RFC
Exceptions to a standard almost never help anyone. If some implementations want to violate the standard and keep to themselves, that's their business, but I see no reason to sanction this in the spec.
-----------[000036][next][prev][last][first]---------------------------------------------------- Date: 7 Nov 1983 1511-PST (Monday) From: mo@LBL-CSAM (Mike O'Dell[Group-L]) To: obrien@rand-unix Cc: gurwitz@BBN-VAX, cak@PURDUE, TCP-IP@SRI-NIC, Postel@USC-ISIF, Hornig%SCRC-QUABBIN@MIT-MC Subject: Re: IP on Ethernet draft RFC
The improvement is routinely about 20-30% according to measurements we did. In the future, on systems with tear-away reads and writes, trailers make it possible to get packets from user address space to user address space with the only data copies those done by the DMA hardware. This is a decidedly non-trivial gain. -Mike
-----------[000037][next][prev][last][first]---------------------------------------------------- Date: 7 Nov 1983 15:47-PST From: William "Chops" Westfield <BillW @ SRI-KL> To: tcp-ip@NIC Subject: modularity vs efficiency in fragmentation.
Hmm. I did think of this. However, thinking about it further, it isn't inherently obvious that fragmenting a large piece of data at IP level isn't MORE efficient than fragmenting it at TCP level. After all, an IP packet is a simpler thing, and in general requires less effort on the part of the OS than a TCP packet. This assumes that most packets are not required to be retransmitted. If the packet does end up needing to be retransmitted, then the relative efficiency of fragmentation at TCP level goes up, since less data is likely to need retransmitting. Comments? Bill W
-----------[000038][next][prev][last][first]---------------------------------------------------- Date: 7 Nov 1983 1305-EST (Monday) From: Christopher A Kent <cak@PURDUE.ARPA> To: obrien@rand-unix.ARPA Cc: Rob Gurwitz <gurwitz@BBN-VAX.ARPA>, Christopher A Kent <cak@PURDUE.ARPA>, TCP-IP@SRI-NIC.ARPA, Postel@USC-ISIF.ARPA, Charles Hornig <Hornig%SCRC-QUABBIN@MIT-MC.ARPA> Subject: Re: IP on Ethernet draft RFC
I think Mike has got the right ballpark. I have become convinced that trailers are not just a kludge; designing with them in mind can simplify an implementation, rather than making it more complex. The problem is that in an Internet environment, you can't necessarily take full advantage.
As I understand it, the two advantages of trailer encapsulation are:
- you can do most of your i/o out of page-aligned buffers, which makes the dma much more efficient
- on input, you don't have to save header state until you know you have a fully undamaged input packet. That is, you can collect data buffers until they're all in, and only then do you need to interpret the header, which is precisely when it shows up. If you run out of buffers, or the packet is short, or whatever, you can just stop listening, and not have to worry about undoing whatever you did to keep the header state around.
Since we (the Internet community) have to implement both trailer and non-trailer encapsulations, and the non-trailer is most straightforward, the trailer encapsulation is often viewed as "a bag on the side" of the driver. The trailer encapsulation need not be restricted to Ethernets; the encapsulation also exists, for example, on proNET 10Mb ring nets.
I don't think we should regard this as a deviation from the standard (what standard, <jdreyer@bbn>?); we're trying to define the standard right now, so there is no such thing. It's a different way of looking at things, and shouldn't be tossed out without careful consideration. 4.2bsd is going to be running on a lot of Ethernets in the near future; if we're documenting Ethernet/IP behaviour, we can't just ignore this and hope it goes away.
Cheers, chris ----------
-----------[000039][next][prev][last][first]---------------------------------------------------- Date: 7 Nov 1983 1321-EST (Monday) From: Christopher A Kent <cak@PURDUE.ARPA> To: "David C. Plummer" <DCP%SCRC-TENEX@MIT-MC.ARPA> Cc: Hornig%SCRC-QUABBIN@MIT-MC.ARPA, TCP-IP@SRI-NIC.ARPA, Postel@USC-ISIF.ARPA, networks%SCRC-TENEX@MIT-MC.ARPA Subject: Re: IP on Ethernet draft RFC
There have been some rumblings about building an encapsulation negotiation protocol, that would allow hosts to exchange information about trailers, mtus, and whatever else might come up. I don't know if I like the idea of different mtus on a single cable, but it might be useful if you need to put small machines on the same cable with big ones. In general, I don't think we should try to legislate a maximum, since there will always be someone who wants to tinker. And anything that is a reason to make all gateways be able to fragment is a good thing, in my book. Cheers, chris ----------
-----------[000040][next][prev][last][first]---------------------------------------------------- Date: 7 Nov 1983 1351-EST (Monday) From: Christopher A Kent <cak@PURDUE.ARPA> To: David C. Plummer <DCP@MIT-MC.ARPA> Cc: tcp-ip@SRI-NIC.ARPA, JSLove@MIT-MULTICS.ARPA, BILLW@SRI-KL.ARPA, Haverty@BBN-UNIX.ARPA, ROGERS@USC-ISIB.ARPA Subject: Re: What is modular, anyway?
Modularity is all well and good, as long as it doesn't preclude
communication between layers. I'm currently involved with a TCP
implementation that doesn't communicate with the IP layer about segment
sizes, and therefore tries to negotiate a max segment size of 1024 on
the arpanet. If you go through a gateway, the tiny fragments at the end
often get lost by an Ethernet or proNET receiver that doesn't cycle
fast enough to pick it up, leading to lost packets, large round trip
times, and generally lousy performance.
I would contend that it is NOT valid for two cooperating TCPs to try to
exchange 5000 octet segments without consulting the underlying IP. If
the IP can't reassemble packets of 5000+80 (or so) octets, you can't
communicate this way! At least as I understand it, a TCP segment must
be wholly encapsulated within a single IP datagram.
If it is known that both IPs can handle datagrams of such size, then I
agree that mtu on intervening (sub-)networks is not an issue; the IP
layer must handle all fragmentation and reassembly, and intervening
gateways should fragment fragments if necessary because of "impedance
mismatches".
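The "tiny fragments at the end" effect can be shown with a little arithmetic. The sketch below is only illustrative: the function name is hypothetical, the 1000-octet maximum packet size is a made-up figure and not a measured value for any real network, and header options are ignored. It relies on the RFC 791 rule that every fragment except the last carries a multiple of 8 data octets.

```python
def fragment(datagram_len: int, ip_header_len: int, mtu: int):
    """Return (offset in 8-octet units, data octets, more-fragments flag)
    for each fragment of a datagram sent over a net with the given
    maximum packet size."""
    payload = datagram_len - ip_header_len
    # Data per fragment must be a multiple of 8 octets, except in the last.
    per_fragment = (mtu - ip_header_len) // 8 * 8
    if per_fragment <= 0:
        raise ValueError("maximum packet size too small to carry any data")
    fragments = []
    offset = 0
    while offset < payload:
        size = min(per_fragment, payload - offset)
        more = offset + size < payload
        fragments.append((offset // 8, size, more))
        offset += size
    return fragments


# A 1024-octet TCP segment plus minimal 20-octet TCP and IP headers makes a
# 1064-octet datagram; over a hypothetical 1000-octet net it splits in two.
print(fragment(1064, 20, 1000))   # [(0, 976, True), (122, 68, False)]
```

The second fragment carries only 68 data octets and arrives right behind a nearly full-sized packet, which is the kind of back-to-back arrival a slow receiver can miss.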
------
From: David C. Plummer <DCP@MIT-MC>
Subject: What is modular, anyway?
Date: 3 November 1983 18:38 EST
4/ Gateways must be able to accept packets from their attached networks
which are the maximum packet size for that network technology (although
for some 'technologies' like raw wires that doesn't work). Nets with
smaller max sizes than 576 are perfectly legal.
Again, NO! Maximum packet size on a particular (sub)network is
determined by management. This actually has nothing to do with
transport layers (e.g. IP)! The implementation of interfaces
should be sufficiently flexible to be able to tell each transport
layer (e.g. IP) that asks it what the maximum segment size is (a
"site variable" determined by management) for the particular
(sub)net to which the interface is attached. IP would ask the
interfaces this in order to determine when it must fragment over
the various interfaces. It must also be prepared to accept each
of these sizes over each of the interfaces. Therefore, it can
take the MIN for packets being transmitted, but it must be
prepared to receive the differing size maximums.
-----
I don't get this. I would have said (and still would say) YES! to item
4. A gateway must be able to accept packets up to the maximum packet
size expected on that network. If your gripe is that the administration
may choose to run with a smaller maximum packet than the technology
allows (i.e. proNET lets me send 2046-octet packets, but I never send a
packet larger than 1536), then yes, I agree with you, though this is a
fine point to me. Perhaps that's why you made the distinction; however,
your reason for objection was (to me) as subtle as the possible
misunderstanding in point 4. I agree that it's the managerial maximum
that must be met by gateways.
Cheers,
chris
----------
-----------[000041][next][prev][last][first]---------------------------------------------------- Date: 7 Nov 1983 14:02:28 EST (Monday) From: Jack Haverty <haverty@BBN-UNIX> To: Christopher A Kent <cak@PURDUE.ARPA> Cc: obrien@rand-unix.ARPA, Rob Gurwitz <gurwitz@BBN-VAX.ARPA>, TCP-IP@SRI-NIC.ARPA, Postel@USC-ISIF.ARPA, Charles Hornig <Hornig%SCRC-QUABBIN@MIT-MC.ARPA> Subject: Re: IP on Ethernet draft RFC
Chris hit the nail on the head -- the task is to define the standard. One question is, standard for what? It seems plausible given the recent messages that a standard might have two sections at first, the 'default' way, and the '4.2bsd' way, which differs from the default for performance reasons. I'm not arguing against that, but I wonder if that means there will be a stream of additions as more machines come on the network in large numbers. Perhaps we'll see a 'backwards-byte' way for some machine, in which all data bytes in the packet are swapped to improve efficiency. And that of course will come in two subversions, with and without trailers. Then maybe there will be a '36-bit' format, for use by consenting 36-bit machines to avoid wasting bits and/or cpu time to pack and unpack. Who knows what else might be appropriate to improve efficiency. Defining a standard seems to me to be very difficult, since all standards are compromises between the advantages of customization and the advantages of uniformity. A standard with only one 'wart' for one special case doesn't look too bad, but is that the end? I wonder why no one is arguing that the IP encapsulation for Ethernets between two machines of type-X should use a stripped-down IP header -- after all, there is no gateway so why worry about fragmentation? Etc., etc., etc. What do people think the purpose of a standard is? To make more canned implementations available? To make it possible for all implementations to interact? To create efficient use of network/host resources? To ease software maintenance? Jack
-----------[000042][next][prev][last][first]---------------------------------------------------- Date: Mon, 7 Nov 83 14:27 EST From: Charles Hornig <Hornig%SCRC-QUABBIN@MIT-MC.ARPA> To: Christopher A Kent <cak@PURDUE.ARPA>, obrien@RAND-UNIX.ARPA Cc: Rob Gurwitz <gurwitz@BBN-VAX.ARPA>, TCP-IP@SRI-NIC.ARPA, Postel@USC-ISIF.ARPA, Charles Hornig <Hornig%SCRC-QUABBIN@MIT-MC.ARPA> Subject: Re: IP on Ethernet draft RFC
I don't understand your point about header state. All of the network implementations I have worked with read in the whole packet before processing the header. What does this have to do with trailers? Also, the Internet community does not have to implement trailer protocols. You may want to, in order to get better VAX Unix performance, but it is never necessary. I'm sure that a lot of people without Unix systems would rather not do it. I agree that it should be documented. I will put it in an appendix to the RFC if someone will send me a copy of the specification.
-----------[000043][next][prev][last][first]---------------------------------------------------- Date: Mon, 7 Nov 83 14:56:09 EST From: Rob Gurwitz <gurwitz@bbn-vax> To: Christopher A Kent <cak@PURDUE.ARPA> Cc: obrien@rand-unix.ARPA, Rob Gurwitz <gurwitz@BBN-VAX.ARPA>, TCP-IP@SRI-NIC.ARPA, Postel@USC-ISIF.ARPA, Charles Hornig <Hornig%SCRC-QUABBIN@MIT-MC.ARPA> Subject: Re: IP on Ethernet draft RFC
OK, perhaps moderate performance increase is understating the case. However, if millisecond per packet performance is needed for a diskless workstation application (Berkeley's intended goal), then perhaps it's IP that's inappropriate. After all, the purpose of TCP/IP (one of them) is interoperability. Since I can't interoperate with the funny trailer protocol, why use IP? After all, once I go off through a gateway, I'm not going to get my millisecond performance anymore. It sounds like a diskless workstation application using TCP/IP is just wrong. There's too much protocol there. And speaking of modularity, the reason for the trailer protocol in the first place was the performance of some Ethernet drivers with the Unibus. Must I be subjected to some weird variant of IP just to make up for poor performance of some hardware? I really dispute the fact that "we (the Internet community) have to implement both trailer and non-trailer encapsulations." It seems like we are bending inside out for a specific and very narrow case.
-----------[000044][next][prev][last][first]---------------------------------------------------- Date: 7 Nov 1983 1514-EST (Monday) From: Christopher A Kent <cak@PURDUE.ARPA> To: Charles Hornig <Hornig%SCRC-QUABBIN@MIT-MC.ARPA> Cc: Rob Gurwitz <gurwitz@BBN-VAX.ARPA>, TCP-IP@SRI-NIC.ARPA, Postel@USC-ISIF.ARPA, Charles Hornig <Hornig%SCRC-QUABBIN@MIT-MC.ARPA>, Christopher A Kent <cak@PURDUE.ARPA>, obrien@RAND-UNIX.ARPA Subject: Re: IP on Ethernet draft RFC
I think that the argument about header state is that you have to hang on to a copy of the header. If you're buffer limited, this can be a win; also the overhead of allocating and deallocating the buffer(s) for the header can be avoided. The points about standards and variations are well taken; there should be a base-level encapsulation, which is expected to be implemented by all systems that speak IP on an Ethernet. Jack Haverty, I don't know if there will be a stream of variations on encapsulation in the future. I think that standards are there to make it possible for all implementations to interact, and to document possible variations. Some variations are reasonable; it is not reasonable to cast a poor implementation into stone just because "it's the standard". (Perhaps this is a variation on "rules were made to be broken".) Since in this case we are defining a standard after the fact, we must try to consider all the implementations that exist. (I am reminded of a group of people that built a 3Mb Ether version of Pup without swapping bytes into the pup-defined network natural order; the efficiency hack was totally wrong in this case, which was discovered as soon as they tried to talk to a "real" pup implementation.) I think it's unfortunate that the trailer encapsulation format was implemented without specifying a way for machines wishing to use it to communicate this fact. This is a gross oversight. But ignoring it and hoping it will go away won't work. What I'm trying to ensure is that if there are widely accepted variations, we don't just dismiss them out of hand because they aren't of interest to a particular group (say, the writers of the standards). Rob Gurwitz, I would tend to agree with you that IP is out of place in a diskless workstation environment.
We live in a world of compromises; not everyone has the resources to implement a new protocol for their local network use, and the translation gateways needed to maintain interoperability with the rest of the Internet. Both of these are reasonable goals. Why begrudge the compromise in this case? I seem to have been pressed into the sole defender of the right for this variation to exist. I don't believe that it's "wrong" just because it's different. I think it's a mistake to ignore it. Am I the only one? Should I consider myself outnumbered and drop it? Cheers, chris ----------
-----------[000045][next][prev][last][first]---------------------------------------------------- Date: Mon 7 Nov 83 19:48:39-PST From: Mathis@SRI-KL.ARPA To: Mike.Accetta@CMU-CS-IUS.ARPA Cc: TCP-IP@SRI-NIC.ARPA, Mathis@SRI-KL.ARPA Subject: Re: IP on Ethernet draft RFC
Mike, Finding gateways/name servers isn't necessarily an aspect of a general resource location problem (if you don't mean general in the extreme), but an aspect of the "getting started" procedures for a given network. I make this distinction since gateways and (to some extent) host-name servers are resources at the network protocol level rather than resources at mail or user levels, for example. In any event, a gateway is more than a box that happens to speak EGP/GGP in the same manner that a name server is more than a TCP/UDP service. The need to "find" gateways/name servers is really a reflection on the inadequacies/limitations of our networks. As a host, I would much rather hand the ARPANET (for example) a message containing an IP datagram and let the IMPs figure out which gateway to route it to rather than have to deal with pinging and built-in gateway tables; or at least let the ARPANET get the message to a functional gateway that will return a redirect with the right gateway address. If gateways responded to ARP requests for hosts outside of the local Ethernet, then the 48-bit address of the proper, first-hop gateway would be returned to you. The IP module then wouldn't really care if the destination was local or remote for purposes of routing. Among many other items, the "how to connect" RFCs need to describe how a host, starting from ground-zero, knows about its IP address (which may or may not be the same as its LN address), knows how to route packets to places outside its network (which may or may not be the same procedures used for internal routing), and how to resolve names (of external internet entities) to addresses. It has been suggested that a "getting started" protocol/server needs to be developed. -------
-----------[000046][next][prev][last][first]---------------------------------------------------- Date: 7 Nov 1983 21:36:01 PST From: POSTEL@USC-ISIF To: tcp-ip@SRI-NIC Subject: re: TCP Maximum Segment Size Option
< INC-PROJECT, MAX-SEG-SIZ.NLS.14, >, 7-Nov-83 21:32 JBP ;;;;
This memo discusses the TCP Maximum Segment Size and its relation to the
IP Maximum Datagram Size.
This discussion is necessary because the current specification of this
TCP option is ambiguous.
Much of the difficulty with understanding these sizes and their
relationship has been due to the variable size of the IP and TCP
headers.
There have been some assumptions made about using other than the default
size for datagrams with some unfortunate results.
HOSTS MUST NOT SEND DATAGRAMS LARGER THAN 576 OCTETS UNLESS THEY HAVE
SPECIFIC KNOWLEDGE THAT THE DESTINATION HOST IS PREPARED TO ACCEPT
LARGER DATAGRAMS.
This is a long established rule.
To resolve the ambiguity in the TCP Maximum Segment Size option
definition the following rule is established:
THE TCP MAXIMUM SEGMENT SIZE IS THE IP MAXIMUM DATAGRAM SIZE MINUS
FORTY.
The default IP Maximum Datagram Size is 576.
The default TCP Maximum Segment Size is 536.
1. The IP Maximum Datagram Size
Hosts are not required to reassemble infinitely large IP datagrams.
The maximum size datagram that all hosts are required to accept or
reassemble from fragments is 576 octets. The maximum size reassembly
buffer every host must have is 576 octets. Hosts are allowed to
accept larger datagrams and assemble fragments into larger datagrams;
hosts may have buffers as large as they please.
Hosts must not send datagrams larger than 576 octets unless they have
specific knowledge that the destination host is prepared to accept
larger datagrams.
2. The TCP Maximum Segment Size Option
TCP provides an option that may be used at the time a connection is
established (only) to indicate the maximum size TCP segment that can
be accepted on that connection. This Maximum Segment Size (MSS)
announcement (often mistakenly called a negotiation) is sent from the
data receiver to the data sender and says "I can accept TCP segments
up to size X". The size (X) may be larger or smaller than the
default. The MSS can be used completely independently in each
direction of data flow. The result may be quite different maximum
sizes in the two directions.
The MSS counts only data octets in the segment; it does not count the
TCP header or the IP header.
A footnote: The MSS value counts only data octets, thus it does not
count the TCP SYN and FIN control bits even though SYN and FIN do
consume TCP sequence numbers.
3. The Relationship of TCP Segments and IP Datagrams
TCP segments are transmitted as the data in IP datagrams. The
correspondence between TCP segments and IP datagrams must be one to
one. This is because TCP expects to find exactly one complete TCP
segment in each block of data turned over to it by IP, and IP must
turn over a block of data for each datagram received (or completely
reassembled).
4. Layering and Modularity
TCP is an end to end reliable data stream protocol with error
control, flow control, etc. TCP remembers many things about the
state of a connection.
IP is a one shot datagram protocol. IP has no memory of the
datagrams transmitted. It is not possible for IP to keep any
information about the maximum datagram size a particular destination
might be capable of accepting. IP may keep some information for
routing purposes on a per network basis. There is no current
requirement that IP keep information on a per host basis.
TCP and IP are distinct layers in the protocol architecture, and are
often implemented in distinct program modules.
Some people seem to think that there must be no communication between
protocol layers or program modules. There must be communication
between layers and modules, but it should be carefully specified and
controlled. One problem in understanding the correct view of
communication between protocol layers or program modules in general,
or between TCP and IP in particular is that the documents on
protocols are not very clear about it. This is often because the
documents are about the protocol exchanges between machines, not the
program architecture within a machine, and the desire to allow many
program architectures with different cuts at modularizing the
implementation.
5. The Relationship between IP Datagram and TCP Segment Sizes
The relationship between the value of the maximum IP datagram size
and the maximum TCP segment size is obscure. The problem is that
both the IP header and the TCP header may vary in length. The TCP
Maximum Segment Size option (MSS) is defined to specify the maximum
number of data octets in a TCP segment exclusive of TCP (or IP)
header.
To notify the data sender of the largest TCP segment it is possible
to receive, the calculation of the MSS value to send is:
MSS = MTU - sizeof(TCPHDR) - sizeof(IPHDR)
On receipt of the MSS option the calculation of the size of segment
that can be sent is:
SndMaxSegSiz = MIN((MTU - sizeof(TCPHDR) - sizeof(IPHDR)), MSS)
where MSS is the value in the option, and MTU is the Maximum
Transmission Unit (or the maximum packet size) allowed on the
directly attached network.
This begs the question, though. What value should be used for the
"sizeof(TCPHDR)" and for the "sizeof(IPHDR)"?
There are three reasonable positions to take: the conservative, the
moderate, and the liberal.
The conservative or pessimistic position assumes the worst -- that
both the IP header and the TCP header are maximum size, that is, 60
octets each.
MSS = MTU - 60 - 60 = MTU - 120
If MTU is 576 then MSS = 456
The moderate position assumes that the IP header is maximum size (60
octets) and the TCP header is minimum size (20 octets), because there
are no TCP header option fields currently defined that would normally
be sent at the same time as data segments.
MSS = MTU - 60 - 20 = MTU - 80
If MTU is 576 then MSS = 496
The liberal or optimistic position assumes the best -- that both the
IP header and the TCP header are minimum size, that is, 20 octets
each.
MSS = MTU - 20 - 20 = MTU - 40
If MTU is 576 then MSS = 536
If nothing is said about MSS, the data sender may cram as much as
possible into a 576 octet datagram, and if the datagram has
minimum headers (which is most likely), the result will be 536
data octets in the TCP segment. The rule relating MSS to the
maximum datagram size ought to be consistent with this.
A practical point is raised in favor of the liberal position too.
Since the use of minimum IP and TCP headers is very likely in the
very large percentage of cases, it seems wasteful to limit the TCP
segment data to so much less than could be transmitted at once,
especially since it is less than 512 octets.
For comparison: 536/576 is 93% data, 496/576 is 86% data, 456/576
is 79% data.
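The three positions above can be sketched as a few lines of Python. This is only an illustration of the arithmetic in this section; the function and constant names are made up here and are not part of any specification.

```python
# Sketch of the conservative / moderate / liberal MSS calculations.
# All names are illustrative assumptions; only the numbers come from the memo.

IP_HDR_MIN, IP_HDR_MAX = 20, 60     # octets
TCP_HDR_MIN, TCP_HDR_MAX = 20, 60   # octets


def mss_for(mtu, ip_hdr, tcp_hdr):
    """Data octets left in one datagram after both headers are subtracted."""
    return mtu - ip_hdr - tcp_hdr


def mss_positions(mtu):
    """MSS under each of the three header-size assumptions."""
    return {
        "conservative": mss_for(mtu, IP_HDR_MAX, TCP_HDR_MAX),
        "moderate": mss_for(mtu, IP_HDR_MAX, TCP_HDR_MIN),
        "liberal": mss_for(mtu, IP_HDR_MIN, TCP_HDR_MIN),
    }


def send_max_seg_size(local_mtu, received_mss,
                      ip_hdr=IP_HDR_MIN, tcp_hdr=TCP_HDR_MIN):
    """SndMaxSegSiz = MIN(MTU - sizeof(TCPHDR) - sizeof(IPHDR), MSS)."""
    return min(mss_for(local_mtu, ip_hdr, tcp_hdr), received_mss)


if __name__ == "__main__":
    for name, mss in mss_positions(576).items():
        print(f"{name:12s} MSS = {mss}  ({100 * mss / 576:.0f}% data)")
```

The default header sizes in send_max_seg_size follow the liberal position; pass larger values to model the other two.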
6. Maximum Packet Size
Each network has some maximum packet size, or maximum transmission
unit (MTU). Ultimately there is some limit imposed by the
technology, but often the limit is an engineering choice or even an
administrative choice. Different installations of the same network
product do not have to use the same maximum packet size. Even within
one installation not all hosts must use the same packet size (this way
lies madness, though).
Some IP implementers have assumed that all hosts on the directly
attached network will be the same or at least run the same
implementation. This is a dangerous assumption. It has often
developed that after a small homogeneous set of hosts has become
operational additional hosts of different types are introduced into
the environment. And it has often developed that it is desired to
use a copy of the implementation in a different inhomogeneous
environment.
Designers of gateways should be prepared for the fact that successful
gateways will be copied and used in other situations and
installations. Gateways must be prepared to accept datagrams as
large as can be sent in the maximum packets of the directly attached
networks. And gateway implementations should be easily configured
for installation in different circumstances.
A footnote: The MTUs of some popular networks (note that the actual
limit in some installations may be set lower by administrative
policy):
ARPANET, MILNET = 1007
Ethernet (10Mb) = 1500
Proteon PRONET = 2046
7. Source Fragmentation
A source host would not normally create datagram fragments. Under
normal circumstances datagram fragments only arise when a gateway
must send a datagram into a network with a smaller maximum packet
size than the datagram. In this case the gateway must fragment the
datagram (unless it is marked "don't fragment" in which case it is
discarded, with the option of sending an ICMP message to the source
reporting the problem).
It might be desirable for the source host to send datagram fragments
if the maximum segment size (default or negotiated) allowed by the
data receiver were larger than the maximum packet size allowed by the
directly attached network. However, such datagram fragments must not
combine to a size larger than allowed by the destination host.
For example, if the receiving TCP announced that it would accept
segments up to 5000 octets (in cooperation with the receiving IP)
then the sending TCP could give such a large segment to the
sending IP provided the sending IP would send it in datagram
fragments that fit in the packets of the directly attached
network.
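As a rough illustration of the example above, the following Python sketch splits one large TCP segment into datagram fragments that fit the directly attached network. It assumes a 20-octet IP header with no options and relies on the IP rule that every fragment except the last carries a multiple of 8 data octets; the function name and the tuple layout are invented for this sketch.

```python
# Sketch of source fragmentation: how a sending IP might cut one large
# TCP segment into fragments for the local network.
# Assumptions: minimum (20-octet) IP header, no IP options.

def fragment(payload_len, mtu, ip_hdr_len=20):
    """Return a list of (offset, length, more_fragments) tuples.

    payload_len is the size of the TCP segment (TCP header plus data)
    handed to IP. Offsets and lengths are in octets.
    """
    # Largest data size per fragment, rounded down to a multiple of 8
    # because IP fragment offsets are expressed in 8-octet units.
    max_data = (mtu - ip_hdr_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    offset = 0
    while offset < payload_len:
        length = min(max_data, payload_len - offset)
        more = offset + length < payload_len
        fragments.append((offset, length, more))
        offset += length
    return fragments


if __name__ == "__main__":
    # 5000 data octets plus a 20-octet TCP header, sent over a 1500-octet MTU.
    for frag in fragment(5000 + 20, 1500):
        print(frag)
```

The reassembled size (here 5020 octets of IP data) is what must not exceed the limit announced by the destination host.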
8. Gateway Fragmentation
Gateways must be prepared to do fragmentation. It is not an optional
feature for a gateway.
Gateways have no information about the size of datagrams destination
hosts are prepared to accept. It would be inappropriate for gateways
to attempt to keep such information. Gateways only know the 576
rule.
Gateways must be prepared to accept the largest datagrams that are
allowed on each of the directly attached networks, even if it is
larger than 576 octets.
Gateways must be prepared to fragment datagrams to fit into the
packets of the next network, even if it is smaller than 576 octets.
If a source host thought to take advantage of the local network's
ability to carry larger datagrams but doesn't have the slightest idea
if the destination host can accept larger than default datagrams and
expects the gateway to fragment the datagram into default size
fragments, then the source host is misguided. If indeed, the
destination host can't accept larger than default datagrams, it
probably can't reassemble them either. If the gateway either passes
on the large datagram whole or fragments into default size fragments
the destination will not accept it. Thus, this mode of behavior by
source hosts must be outlawed.
A larger than default datagram can only arrive at a gateway because
the source host knows that the destination host can handle such large
datagrams (probably because the destination host announced it to the
source host in a TCP MSS option). Thus, the gateway should pass on
this large datagram in one piece or in the largest fragments that fit
into the next network.
An interesting conclusion is that even though the gateways know the
576 rule, it is irrelevant to them.
9. Inter-Layer Communication
The Network Driver (ND) or interface should know the Maximum
Transmission Unit (MTU) of the directly attached network.
The IP should ask the Network Driver for the Maximum Transmission
Unit.
The TCP should ask the IP for the Maximum Datagram Data Size (MDDS).
This is the MTU minus the IP header length (MDDS = MTU - IPHdrLen).
When opening a connection TCP can send an MSS option with the value
equal to MDDS - TCPHdrLen.
TCP should determine the Maximum Segment Data Size (MSDS) from either
the default or the received value of the MSS option.
TCP should determine if source fragmentation is possible (by asking
the IP) and desirable.
If so TCP may hand to IP segments (including the TCP header) up to
MSDS + TCPHdrLen.
If not TCP may hand to IP segments (including the TCP header) up
to the lesser of (MSDS + TCPHdrLen) and MDDS.
IP checks the length of data passed to it by TCP. If the length is
less than or equal to MDDS, IP attaches the IP header and hands it to
the ND. Otherwise the IP must do source fragmentation.
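The chain of questions described in this section (TCP asks IP, IP asks the Network Driver) might be modelled as below. This is a minimal sketch, not a description of any real implementation: the class names, method names, and the can_source_fragment flag are all assumptions made for illustration, and both headers are taken to be 20 octets.

```python
# Sketch of the inter-layer queries: ND knows the MTU, IP derives MDDS,
# TCP derives the MSS to announce and the largest segment it may hand to IP.
# Every class and method name here is hypothetical.

from dataclasses import dataclass

DEFAULT_MSS = 536  # used when no MSS option was received


@dataclass
class NetworkDriver:
    mtu: int  # Maximum Transmission Unit of the directly attached network


@dataclass
class IP:
    nd: NetworkDriver
    hdr_len: int = 20
    can_source_fragment: bool = False

    def mdds(self):
        """Maximum Datagram Data Size = MTU - IPHdrLen."""
        return self.nd.mtu - self.hdr_len


@dataclass
class TCP:
    ip: IP
    hdr_len: int = 20

    def mss_to_announce(self):
        """Value for the MSS option: MDDS - TCPHdrLen."""
        return self.ip.mdds() - self.hdr_len

    def max_handoff(self, received_mss=None):
        """Largest segment (TCP header included) TCP may hand to IP."""
        msds = DEFAULT_MSS if received_mss is None else received_mss
        limit = msds + self.hdr_len
        if self.ip.can_source_fragment:
            return limit
        return min(limit, self.ip.mdds())


if __name__ == "__main__":
    tcp = TCP(IP(NetworkDriver(mtu=1500)))
    print(tcp.mss_to_announce(), tcp.max_handoff(received_mss=5000))
```

With an MTU of 576 and no received option, the same objects give an announced MSS of 536 and a hand-off limit of 556 octets.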
10. What is the Default MSS ?
Another way of asking this question is "What transmitted value for
MSS has exactly the same effect as not transmitting the option at
all?".
In terms of the previous section:
The default assumption is that the Maximum Transmission Unit is
576 octets.
MTU = 576
The Maximum Datagram Data Size (MDDS) is the MTU minus the IP
header length.
MDDS = MTU - IPHdrLen = 576 - 20 = 556
When opening a connection TCP can send an MSS option with the
value equal to MDDS - TCPHdrLen.
MSS = MDDS - TCPHdrLen = 556 - 20 = 536
TCP should determine the Maximum Segment Data Size (MSDS) from
either the default or the received value of the MSS option.
Default MSS = 536, then MSDS = 536
TCP should determine if source fragmentation is possible and
desirable.
If so TCP may hand to IP segments (including the TCP header) up
to MSDS + TCPHdrLen (536 + 20 = 556).
If not TCP may hand to IP segments (including the TCP header)
up to the lesser of (MSDS + TCPHdrLen (536 + 20 = 556)) and
MDDS (556).
11. The Truth
The rule relating the maximum IP datagram size and the maximum TCP
segment size is:
TCP Maximum Segment Size = IP Maximum Datagram Size - 40
The rule must match the default case.
If the TCP Maximum Segment Size option is not transmitted then the
data sender is allowed to send IP datagrams of maximum size (576)
with a minimum IP header (20) and a minimum TCP header (20) and
thereby be able to stuff 536 octets of data into each TCP segment.
The definition of the MSS option can be stated:
The maximum number of data octets that may be received by the
sender of this TCP option in TCP segments with no TCP header
options transmitted in IP datagrams with no IP header options.
12. The Consequences
When TCP is used in a situation when either the IP or TCP headers are
not minimum and yet the maximum IP datagram that can be received
remains 576 octets then the TCP Maximum Segment Size option must be
used to reduce the limit on data octets allowed in a TCP segment.
For example, if the IP Security option (11 octets) were in use and
the IP maximum datagram size remained at 576 octets, then the TCP
should send the MSS with a value of 525 (536-11).
--jon.
-------
-----------[000047][next][prev][last][first]---------------------------------------------------- Date: Monday, 7 November 1983 19:52:43 EST From: Mike.Accetta@CMU-CS-IUS To: TCP-IP@SRI-NIC.ARPA Subject: Re: IP on Ethernet draft RFC
1) Finding gateways and name servers are two specific examples of a
more general resource location problem. In the absence of an Internet
standard, we are probably going to define a local resource location
protocol on top of IP which looks something like:
OpCode=<Who understands?>
(Protocol-ID, port#)
...
(Protocol-ID, port#)
with a symmetric reply
OpCode=<I understand>
(Protocol-ID, port#)
...
(Protocol-ID, port#)
where "Protocol-ID" is the IP protocol identifier and "port#" is
an (optional) port or some other Protocol-ID specific value. Then,
gateways could be located by broadcasting an
OpCode=<Who understands?>
(3, 0) {GGP}
(8, 0) {EGP}
request and name servers by broadcasting an
OpCode=<Who understands?>
(6, 53) {TCP port 53}
(17, 53) {UDP port 53}
request.
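To make the query/reply exchange concrete, here is a small Python sketch of one possible encoding. The message above gives no wire format, so everything about the layout is an assumption made for illustration: a 1-octet opcode, a 1-octet entry count, then one (1-octet Protocol-ID, 2-octet port#) pair per entry, all in network byte order. The opcode values are likewise invented.

```python
# Hypothetical encoding of the "Who understands?" / "I understand" messages.
# The layout and opcode values are assumptions, not part of the proposal.

import struct

WHO_UNDERSTANDS = 1  # assumed opcode value
I_UNDERSTAND = 2     # assumed opcode value


def encode(opcode, entries):
    """Pack an opcode and a list of (protocol_id, port) pairs."""
    msg = struct.pack("!BB", opcode, len(entries))
    for protocol_id, port in entries:
        msg += struct.pack("!BH", protocol_id, port)
    return msg


def decode(msg):
    """Unpack a message into (opcode, [(protocol_id, port), ...])."""
    opcode, count = struct.unpack_from("!BB", msg, 0)
    entries = [struct.unpack_from("!BH", msg, 2 + 3 * i) for i in range(count)]
    return opcode, entries


def reply_for(query, supported):
    """Build the symmetric reply listing the requested pairs we support."""
    opcode, wanted = decode(query)
    if opcode != WHO_UNDERSTANDS:
        return None
    matches = [entry for entry in wanted if entry in supported]
    return encode(I_UNDERSTAND, matches) if matches else None


if __name__ == "__main__":
    # The name-server query from the message: TCP port 53 and UDP port 53.
    query = encode(WHO_UNDERSTANDS, [(6, 53), (17, 53)])
    print(decode(reply_for(query, supported={(17, 53)})))
```

A host that supports none of the requested pairs simply stays silent in this sketch, which suits a broadcast query.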
2) This brings up another issue which should probably be specified in
the IP ethernet standard, namely broadcast addressing. One of the
network papers a while back (I think it was an IEN on ethernet address
resolution by, I believe, Rob Gurwitz) suggested an approach which we
are using at CMU. In that scheme, the IP network number with a host part
consisting of all ones was reserved for use as that network's IP
broadcast address (i.e. address resolution modules would always resolve
this IP address to the physical broadcast address for the underlying
hardware).
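The all-ones convention in point 2 can be sketched in Python as follows. The sketch assumes classful (class A, B, C) addressing to decide where the host part begins, uses arbitrary example addresses, and invents the function names and the shape of the resolver hook.

```python
# Sketch of the "network number + all-ones host part" broadcast address and
# of an address-resolution step that maps it to the hardware broadcast.
# Assumes classful addressing; names and addresses are illustrative only.

import ipaddress

ETHERNET_BROADCAST = "ff:ff:ff:ff:ff:ff"


def classful_host_bits(addr):
    """Number of host bits implied by the address class."""
    first_octet = int(addr) >> 24
    if first_octet < 128:
        return 24  # class A
    if first_octet < 192:
        return 16  # class B
    if first_octet < 224:
        return 8   # class C
    raise ValueError("not a class A, B, or C address")


def ip_broadcast(address):
    """Network number of `address` with the host part set to all ones."""
    addr = ipaddress.IPv4Address(address)
    host_mask = (1 << classful_host_bits(addr)) - 1
    return str(ipaddress.IPv4Address(int(addr) | host_mask))


def resolve(destination, local_address):
    """Return the hardware broadcast address for the IP broadcast address.

    Any other destination returns None here; a real module would fall
    through to its normal address resolution.
    """
    if destination == ip_broadcast(local_address):
        return ETHERNET_BROADCAST
    return None


if __name__ == "__main__":
    print(ip_broadcast("128.2.254.36"))
    print(resolve("128.2.255.255", "128.2.254.36"))
```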
3) For what it's worth, I'd suggest that the 4.2BSD trailer protocol
issue best be dealt with as a separate RFC written by someone at
Berkeley with perhaps a reference to it in the ethernet standard RFC.
The standard should specify precisely what is REQUIRED when
implementing IP on an ethernet in order to guarantee communication with
ANY other IP implementation on that ethernet. Unless the intent is to
mandate implementation of the trailer protocol encapsulation, I think
it would be ill-advised to include this directly or even as an appendix in
the IP ethernet standard. It will only cause confusion.
- Mike
-----------[000048][next][prev][last][first]---------------------------------------------------- Date: 8 Nov 1983 0511-PST (Tuesday) From: mo@LBL-CSAM (Mike O'Dell[Group-L]) To: BillW@SRI-KL Cc: tcp-ip@NIC Subject: Re: modularity vs efficiency in fragmentation.
There is a more serious issue here. Sending too large a TCP segment can actually cause the connection to fail in the following way. Assume two TCPs negotiate the segment size up to some number quite a bit larger than the max IP size along the route. A segment leaves one TCP in one IP jumbogram. Now, along the way it gets fragmented into 3 tinigrams which get sent along the same path. Now assume the destination host or gateway is short of resources, or more likely, is on a local net which has difficulty hearing back-to-back packets. It is not uncommon (actually observed) for this case to result in an IP fragment being lost, usually the last one of the bunch. IP finally gives up and discards the irreconcilable fragments, or even worse, holds on to them for a while. Back at the source host, the retransmit timer fires inside TCP and it promptly sends the segment off again, ensuring that only some, and probably fewer useful, IP fragments will be received. (Because you tend to hear the first of a burst and lose the rest.) After a while, TCP declares the connection dead since it has not been able to get any segments through. -Mike
-----------[000049][next][prev][last][first]---------------------------------------------------- Date: 8 Nov 1983 0906-PST From: WMartin at Office-3 (Will Martin) To: Feedback at OFFICE-3, SNelson at OFFICE-3 Cc: TCP-IP at SRI-NIC Subject: "Retransmitting" - Can we get rid of it?
Where in the TCP process does the display of the word "Retransmitting" come from? (I think that there is a [CR] at the end of it, too.) Can the code be changed to eliminate the display of this word to the terminal? I have just had quite a large batch of printing ruined because the word "Retransmitting" appears every ten lines or so in the middle of the document. There were no characters lost during this time when the host probably had some sort of problems. If the worthless "Retransmitting" had not been generated and appeared in the midst of the text, this printing would have been perfectly usable as is. It was printed at night, I believe, while the printer was unattended, and it probably came out at some abysmally low speed, but so what? If it wasn't for the "Retransmitting"s garbaging up the text, it could have been used. I realize that it can be useful to a human sitting at a terminal and wondering if the entire network has died, or just the host, or the TAC, or whatever. But I'd rather just put up with the non-echoing and eventual beeping when the buffer (wherever it is) fills up, than have "Retransmitting" displays endlessly repeat down my screen. The fact that these destroy printed output is enough reason alone to eliminate them, since their benefit is far outweighed by the harm that displaying the word does. I'm CC'ing this to a TCP mailing list in the hope that someone who has already fixed this problem can inform us of the proper repair. It is the most visible user-interface effect of TCP and the prime gripe most users have about the changeover. Will Martin USArmy DARCOM ALMSA
-----------[000050][next][prev][last][first]---------------------------------------------------- Date: 8 Nov 1983 0957-PST From: Craig Milo Rogers <ROGERS@USC-ISIB> To: POSTEL@USC-ISIF, tcp-ip@SRI-NIC Subject: re: TCP Maximum Segment Size Option
I have a couple of minor cautions (i.e., disagreements) with Jon's message on IP and TCP packet sizes. My complaints center on IP implementation issues, not the TCP issues.
1) Jon stated that there is no current requirement that IP keep information on a per-host basis. There are two counterexamples to this statement:
   1) One of the ICMP Redirect subtypes is a per-host redirect. Isn't an IP implementation required to include an ICMP implementation? Or, are some parts of the ICMP specification optional? (Ref. RFC 792 p. 12)
   2) The Address Resolution Protocol requires (or at least, strongly encourages) hosts to keep per-host information. The ARP is required to translate IP addresses to local network addresses on some local networks. Is this translation process considered an "IP" responsibility?
2) "A source host would not normally create datagram fragments." I have to approach this statement carefully; "normal" is a tricky word. Let's talk about an IP host connected to a network with a smaller-than-576 octet Maximum Transmission Unit. Here are two circumstances in which it would be quite "normal" for the host to send IP fragments:
   1) The host may be using a higher-level protocol path which requires packets of a particular length, i.e., UDP/TFTP.
   2) The host may have implemented the ICMP Echo/Echo-Reply protocol (it's a "required" component of an IP implementation, isn't it?). The host might receive an Echo request (as fragments), reassemble it, interpret it, create an Echo Reply, and be forced to fragment the Echo Reply in order to send it.
Craig Milo Rogers -------
-----------[000051][next][prev][last][first]---------------------------------------------------- Date: 8 Nov 1983 10:43:01 PST From: MILLS@USC-ISID To: mo@LBL-CSAM, BillW@SRI-KL Cc: tcp-ip@SRI-NIC, MILLS@USC-ISID Subject: Re: modularity vs efficiency in fragmentation.
In response to the message sent 8 Nov 1983 0511-PST (Tuesday) from mo@LBL-CSAM Mike, Your scenario is in fact common on SATNET paths, where resources are tight and packet losses relatively high. Reassembly congestion can be reduced by careful choice of the IP time-to-live field and minimum TCP retransmission timeout. We set the TTL field so that the 'gram expires of old age before the sending TCP fires off another one. This assumes, of course, that intervening gateways, as well as the destination host, correctly handle the TTL field. Dave -------
-----------[000052][next][prev][last][first]---------------------------------------------------- Date: 8 Nov 1983 11:11-PST From: the tty of Geoffrey S. Goodfellow To: WMartin@OFFICE-3 Cc: Feedback@OFFICE-3, SNelson@OFFICE-3 TCP-IP@SRI-NIC Subject: Re: "Retransmitting" - Can we get rid of it?
Will The word "Retransmitting" comes from your TAC. I would suggest that you notify DCA or BBN of your desire to see a TAC command option added which will disable the printing of the "Retransmitting" message. Geoff
-----------[000053][next][prev][last][first]---------------------------------------------------- Date: 8 Nov 1983 11:16:58 PST From: POSTEL@USC-ISIF To: TCP-IP@SRI-NIC Subject: re: "retransmitting"
Will Martin: It is a "feature" of the TAC. Take it up with the TAC people at BBN. --jon. -------
-----------[000054][next][prev][last][first]---------------------------------------------------- Date: 8 Nov 1983 1155-PST From: GOWER@USC-ISID To: WMartin@OFFICE-3 (Will Martin) Cc: Feedback@OFFICE-3, SNelson@OFFICE-3, TCP-IP@SRI-NIC, GOWER@USC-ISID Subject: Re: "Retransmitting" - Can we get rid of it?
Will, We had the same experience with our printers on our TAC. If your printer is on a TAC port, you may request the NOC to set the port 'QUIET'. This parameter must be set by the NOC. It is one of several that SHOULD be available to the TAC Liaison, but is NOT. Regards, Neil Gower Former TAC Liaison -------
-----------[000055][next][prev][last][first]---------------------------------------------------- Date: Tue, 8 Nov 83 9:36:22 EST From: Rob Gurwitz <gurwitz@bbn-vax> To: William "Chops" Westfield <BillW @ SRI-KL> Cc: tcp-ip@NIC Subject: Re: modularity vs efficiency in fragmentation.
No. Having to go through an extra fragment/reassembly step IS more costly. Think about it. Since TCP packetization happens anyway (you always have to put a header on the data and figure the checksum) and is the "normal" and hence optimized case, adding yet more handling by IP to break the packet up and then reassemble on the other end is not a gain. What advantage does it have? It certainly doesn't make better use of the wire; TCP might try to send 5000 bytes in a segment, but it will still go out as umpteen 576 (or whatever) byte packets. Why not avoid all the extra work in the first place and have TCP try to segment in such a way as to avoid gratuitous fragmentation?
-----------[000056][next][prev][last][first]---------------------------------------------------- Date: 8 Nov 1983 10:44:29 EST (Tuesday) From: Mike Brescia <brescia@BBN-UNIX> To: POSTEL@USC-ISIF Cc: brescia@BBN-UNIX, tcp-ip@nic Subject: re: TCP Maximum Segment Size Option
Jon, It came as a surprise to me that gateways know the '576 rule'. (Your point 8, 'gateway fragmentation'.) Since you also noticed that it is irrelevant to gateways, I think you followed the same reasoning we did at implementation, that is, there is no parameter in the gateway that has the value 576 (+/- epsilon). Only if a network is declared to be of size 576 (MTU) does this appear in the configuration info for that particular net. There are no (BBN) gateways on networks which have a hardware limit of 576 bytes. Some have been 'administratively' limited to 576, viz. the NTA ring (Proteon hardware). Some networks have been administratively limited to 256 bytes, viz. those at UCL, to match the 256 byte size of the SATNET interface. Mike
-----------[000057][next][prev][last][first]---------------------------------------------------- Date: 8 Nov 1983 12:06:16 EST (Tuesday) From: Jonathan Dreyer <jdreyer@BBN-UNIX> To: Charles Hornig <Hornig%SCRC-QUABBIN@MIT-MC.ARPA> Cc: TCP-IP@SRI-NIC.ARPA, Postel@USC-ISIF.ARPA, networks%SCRC-TENEX@MIT-MC.ARPA Subject: Re: IP on Ethernet draft RFC
I know that the IP world generally talks big-endian (high byte first), but I'm sure that some poor PDP-11 programmer will put a hex 800 in the Ethernet type field and get it backwards unless you say something about byte order in the RFC. In RFC 870 (Assigned numbers), the hex Ethernet type fields are written like "08,00" which makes this more obvious.
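The byte-order pitfall can be shown in a few lines of Python. The sketch packs the Ethernet type value hex 0800 both ways: high byte first, as the "08,00" notation implies, and low byte first, as a little-endian machine such as the PDP-11 would store a 16-bit word. The function names are made up for this example.

```python
# Sketch of the Ethernet type field byte-order hazard.
# Function names are illustrative assumptions.

import struct

ETHERTYPE_IP = 0x0800


def ethertype_on_wire(value):
    """Pack the type field high byte first (network byte order)."""
    return struct.pack("!H", value)


def ethertype_little_endian(value):
    """What results if a little-endian 16-bit store is sent unchanged."""
    return struct.pack("<H", value)


def read_type_field(octets):
    """Interpret two octets from the wire as a big-endian type value."""
    return struct.unpack("!H", octets)[0]


if __name__ == "__main__":
    print(ethertype_on_wire(ETHERTYPE_IP).hex())        # correct order
    print(ethertype_little_endian(ETHERTYPE_IP).hex())  # bytes swapped
    print(hex(read_type_field(ethertype_little_endian(ETHERTYPE_IP))))
```

A receiver reading the swapped bytes sees type hex 0008 rather than hex 0800, so the frame would not be recognized as IP.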
-----------[000058][next][prev][last][first]---------------------------------------------------- Date: Tue, 8 Nov 83 16:22:24 EST From: Ron Natalie <ron@brl-vgr> To: Mike O'Dell@BRL-VGR.ARPA, mo@lbl-csam Cc: BillW@sri-kl, tcp-ip@sri-nic Subject: Re: modularity vs efficiency in fragmentation.
1. For those of you who may have forgotten, gateways do not reassemble fragments. 2. The scenario Mike O'Dell outlines does describe a real problem. Our TCP/IP was setting the data size to 1024 when talking to Berkeley, who also liked to use that size. Packets were fragmented going through the ARPANET. When they reached the BRL-GATEWAY, a 1004 byte fragment was pushed through our local network and then the 44 byte second fragment was sent. Due to a programming error of the network hardware, while the remote host was processing the large 1004 byte packet the GATEWAY would get a busy notice. Believing this was a hard error, the GATEWAY discarded the 44 byte packet. TCP retransmits were also lost because the smaller fragment was always lost behind the processing of the larger one. Of course there was no excuse for this behaviour; it was just that it was difficult to detect, since it would only happen on packets that were fragmented with a large part. 3. If you wish to find out how big you can set the packet size before you can't send them without fragmenting, why not set the DF bit on a probe packet to see if it will fit through all the intervening network interfaces? -Ron
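The probing idea in point 3 might look like the following Python sketch. It does not send real packets: the path is modelled as a list of per-hop MTUs, and a probe with DF set is taken to succeed only if it fits every hop. It also assumes every probe follows the same path, which an internet that routes each datagram independently does not guarantee. The function names and the search bounds are choices made for this sketch.

```python
# Simulated sketch of finding the largest datagram that crosses a path
# unfragmented, using probes with the Don't Fragment (DF) bit set.
# The path model and all names are assumptions for illustration.

def probe_fits(size, path_mtus):
    """True if a DF-marked datagram of `size` octets crosses every hop."""
    return all(size <= mtu for mtu in path_mtus)


def largest_unfragmented(path_mtus, low=68, high=65535):
    """Binary-search the largest datagram size that needs no fragmentation.

    `low` defaults to 68 octets, the size every IP module must be able to
    forward without fragmenting; `high` is the largest possible datagram.
    Returns None if even the smallest probe does not get through.
    """
    if not probe_fits(low, path_mtus):
        return None
    while low < high:
        mid = (low + high + 1) // 2
        if probe_fits(mid, path_mtus):
            low = mid        # probe arrived: try something larger
        else:
            high = mid - 1   # probe was discarded: try something smaller
    return low


if __name__ == "__main__":
    # Ethernet, then ARPANET, then Ethernet again.
    print(largest_unfragmented([1500, 1007, 1500]))
```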
-----------[000059][next][prev][last][first]---------------------------------------------------- Date: Tue, 8 Nov 83 16:37:07 EST From: Ron Natalie <ron@brl-vgr> To: Will Martin <WMartin@office-3> Cc: Feedback@office-3, SNelson@office-3, TCP-IP@sri-nic Subject: Re: "Retransmitting" - Can we get rid of it?
It's the TAC that says "Retransmitting." I don't know if anything can be done to stop it. The word "retransmitting" really is not a part of TCP/IP except that the TAC decides to print it whenever it encounters a particular condition. -Ron
-----------[000060][next][prev][last][first]---------------------------------------------------- Date: 9-Nov-83 01:51 PST From: Robert N. Lieberman <RLL.TYM@OFFICE-2> To: WMartin at Office-3 (Will Martin) Cc: Feedback@OFFICE-3, SNelson@OFFICE-3, TCP-IP@SRI-NIC Subject: Re: "Retransmitting" - Can we get rid of it?
Will, I guess by now you got the word about the QUIET setting that BBN can implement for you. However, I am VERY curious about this. As you know we have had TCP related problems for 10 months. It has generally been assumed that the problems are net load related, i.e., they appear only when the net is sufficiently loaded between one TAC and one host (maybe the local IMP being loaded plays a role too). We based this on not being able to duplicate any of the problems at night even with extremely heavy loads on host and host to host via net loads. The problem, of course, is one of slowness with eventual disconnection. RETRANSMITTING has been reported many times as one of the messages that occurs when this problem begins to rear its head. Now, if I read your message right, you were getting this message at night, presumably when loads were low on the net (and host). If this can be repeated then I would think BBN and TYM would be VERY interested in exactly what was happening. It may not relate to the 'main' TCP problem but we don't want to leave a bit unflipped. Hopefully BBN will soon have their IMP trace package working so we can begin to see some data from the IMP (the key element missing for the last 10 months). Robert
-----------[000061][next][prev][last][first]---------------------------------------------------- Date: 9 Nov 1983 8:11:24 EST (Wednesday) From: Jack Haverty <haverty@BBN-UNIX> To: Ron Natalie <ron@brl-vgr> Cc: Mike O'Dell@BRL-VGR.ARPA, mo@lbl-csam, BillW@sri-kl, tcp-ip@sri-nic Subject: Re: modularity vs efficiency in fragmentation.
Re setting the DF bit to see if a certain packet size will fit through the internet -- in the current physical topology this might be useful, since there are few cases where alternate paths exist. In general however, since the internet is performing routing, there's no guarantee that successive packets take the same path from source to destination. Jack
-----------[000062][next][prev][last][first]---------------------------------------------------- Date: 10 November 1983 15:32 EST From: David C. Plummer <DCP @ MIT-MC> To: Mike.Accetta @ CMU-CS-IUS Cc: TCP-IP @ SRI-NIC Subject: Re: IP on Ethernet draft RFC
Received: from CMU-CS-IUS by SRI-NIC with TCP; Mon 7 Nov 83 16:54:50-PST
Date: Monday, 7 November 1983 19:52:43 EST
From: Mike.Accetta@CMU-CS-IUS
OpCode=<Who understands?>
(3, 0) {GGP}
(8, 0) {EGP}
request and name servers by broadcasting an
OpCode=<Who understands?>
(6, 53) {TCP port 53}
(17, 53) {UDP port 53}
How do you do
(Chaos, DUMP-ROUTING-TABLE)
?? Not all the world is reduced to numbers. There are some
protocols out there that try to be mnemonic.
-----------[000063][next][prev][last][first]---------------------------------------------------- Date: Thursday, 10 November 1983 17:05:05 EST From: Mike.Accetta@CMU-CS-IUS To: David C.Plummer <DCP@MIT-MC> Cc: TCP-IP@SRI-NIC Subject: Re: IP on Ethernet draft RFC
David, I've had no experience with CHAOS protocols, so I have to admit that I tend to think in terms of the numbers used by those IP protocols with which I am familiar. The examples are perhaps a bit biased. Nothing precludes the interpretation of the protocol specific portion of the query as a string if that is appropriate for the protocol. It certainly must be of variable length to be useful for more than UDP and TCP and especially for protocols we can't foresee right now. I would argue, however, that whatever the interpretation of this field, it be fixed per protocol and that it should be the "natural" identifier for that protocol. Thus GGP and EGP perhaps have no identifier, UDP and TCP use 16-bit port numbers, the CHAOS stream protocol uses names, etc. - Mike
-----------[000064][next][prev][last][first]---------------------------------------------------- Date: Tuesday, 15 November 1983, 22:12-EST From: Charles Hornig <Hornig at SCRC-QUABBIN> To: TCP-IP at nic Subject: Trailer Encapsulations
Here is a draft RFC on Trailer Encapsulations. I feel that it is most
understandable if it is made a separate document from the IP/Ethernet
RFC. Comments?
---------
Sam Leffler & Mike Karels
University of California at Berkeley
November 1983
Trailer Encapsulations
This RFC discusses the motivation for use of "trailer encapsulations" on
local-area networks and describes the implementation of such an
encapsulation on various media.
Introduction
A trailer encapsulation is a link level packet format employed by
4.2BSD UNIX (among others). A trailer encapsulation, or "trailer", may
be generated by a system under certain conditions in an effort to
minimize the number and size of memory-to-memory copy operations
performed by a receiving host when processing a data packet. Trailers
are strictly a link level packet format and are not visible (when
properly implemented) in any higher level protocol processing. This
note cites the motivation behind the trailer encapsulation and
describes the trailer encapsulation packet formats currently in use on
3 Mb/s and 10 Mb/s Ethernets, and 10 Mb/s V2LNI ring networks.
The use of a trailer encapsulation was suggested by Greg Chesson,
and the encapsulation described here was designed by Bill Joy.
Motivation
Trailers are motivated by the overhead which may be incurred during
protocol processing when one or more memory to memory copies must be
performed. Copying can be required at many levels of processing, from
moving data between the network medium and the host's memory, to
passing data between the operating system and user address spaces. An
optimal network implementation would expect to incur zero copy
operations between delivery of a data packet into host memory and
presentation of the appropriate data to the receiving process. While
many packets may not be processed without some copying operations, when
the host computer provides suitable memory management support it may
often be possible to avoid copying simply by manipulating the
appropriate virtual memory hardware.
In a page mapped virtual memory environment, two prerequisites are
usually required to achieve the goal of zero copy operations during
packet processing. Data destined for a receiving agent must be aligned
on a page boundary and must have a size which is a multiple of the
hardware page size (or filled to a page boundary). The latter
restriction assumes virtual memory protection is maintained at the page
level; different architectures may alter these prerequisites.
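As an illustration of the two prerequisites above, here is a minimal
Python sketch. The function name can_remap and the 512-byte page size
(the VAX figure used later in this note) are assumptions chosen for the
example; they are not taken from any implementation described here.

```python
PAGE_SIZE = 512  # assumed hardware page size in bytes (VAX)

def can_remap(offset, length, page_size=PAGE_SIZE):
    """Return True if `length` bytes starting `offset` bytes into a
    page-aligned receive buffer meet both prerequisites: the data starts
    on a page boundary and fills a whole number of pages."""
    return length > 0 and offset % page_size == 0 and length % page_size == 0

# Data preceded by a 40-byte header does not start on a page boundary.
print(can_remap(40, 1024))   # False
# Data placed first in the buffer, two pages long, qualifies.
print(can_remap(0, 1024))    # True
```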
Data to be transmitted across a network may easily be segmented in
the appropriate size, but unless the encapsulating protocol header
information is fixed in size, alignment to a page boundary is virtually
impossible. Protocol header information may vary in size due to
the use of multiple protocols (each with a different header), or it
may vary in size by agreement (for example, when optional information
is included in the header). To ensure page alignment the
header information which prefixes data destined for the receiver
must be reduced to a fixed size; this is normally the case at the
link level of a network. By taking all (possibly) variable length
header information and moving it to after the data segment a sending
host may "do its best" in allowing the receiving host the opportunity
to receive data on a page aligned boundary. This rearrangement of
data at the link level to force variable length header information
to "trail" the data is the substance of the trailer encapsulation.
There are several implicit assumptions in the above argument.
1. The receiving host must be willing to accept trailers. As this
is a link level encapsulation, unless a host to host negotiation
is performed (preferably at the link level to avoid violating
layering principles), only certain hosts will be able to converse,
or their communication may be significantly impaired if trailer
packets are mixed with non-trailer packets.
2. The cost of receiving data on a page aligned boundary should be
comparable to receiving data on a non-page aligned boundary. If
the overhead of ensuring proper alignment is too high, the savings
in avoiding copy operations may not be cost effective.
3. The size of the variable length header information should be
significantly less than that of the data segment being transmitted.
It is possible to move trailing information without physically
copying it, but often implementation constraints and the
characteristics of the underlying network hardware preclude
merely remapping the header(s).
4. The memory to memory copying overhead which is expected to be
performed by the receiver must be significant enough to warrant
the added complexity in both the sending and receiving host
software.
The first point is well known and the motivation for this note.
Thought has been given to negotiating the use of trailers on a
per host basis using a variant of the Address Resolution Protocol
(actually augmenting the protocol), but at present all systems
using trailers require hosts sharing a network medium to uniformly
accept trailers or never transmit them. (The latter is easily
carried out at boot time in 4.2BSD without modifying the operating
system source code.)
The second point is (to our knowledge) moot. While a host may
not be able to take advantage of the alignment and size properties
of a trailer packet, it should nonetheless never hamper it.
Regarding the third point, let us assume the trailing header
information is copied and not remapped, and consider the header
overhead in the TCP/IP protocols as a representative example. If we
assume both the TCP and IP protocol headers are part of the variable
length header information, then the smallest trailer packet (generated
by a VAX) would have 512 bytes of data and 40+ bytes of header
information (plus the trailer header described later). While the
trailing header could have IP and/or TCP options included this would
normally be rare (one would expect most TCP options, for example, to be
included in the initial connection setup exchange) and certainly much
smaller than 512 bytes. If the data segment is larger, the ratio
decreases and the expected gain due to fewer copies on the receiving
end increases. Given the relative overheads of a memory to memory copy
operation and that of a page map manipulation (including translation buffer
invalidation), the advantage is obvious.
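The ratio mentioned above can be worked through with a short Python
sketch. The function name header_to_data_ratio is hypothetical, and the
40-byte figure assumes TCP and IP headers without options, as in the
text.

```python
PAGE_SIZE = 512  # bytes per page, as in the VAX example above

def header_to_data_ratio(pages, header_len=40):
    """Ratio of trailing header bytes (copied) to data bytes (remapped)."""
    if pages < 1:
        raise ValueError("a trailer packet carries at least one page")
    return header_len / (pages * PAGE_SIZE)

# The ratio halves each time the data segment doubles in size.
for pages in (1, 2):
    print(pages, header_to_data_ratio(pages))
```

With one page of data the copied header is roughly 8% of the data size;
with two pages it is roughly 4%.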
The fourth issue, we believe, is actually a non-issue. In our implementation
the additional code required to support the trailer encapsulation amounts
to about a dozen lines of code in each link level "network interface
driver". The resulting performance improvement more than warrants this
minor investment in software.
It should be recognized that modifying the network (and normal link)
level format of a packet in the manner described forces the receiving
host to buffer the entire packet before processing. Clever
implementations may parse protocol headers as the packet arrives to
find out the actual size (or network level packet type) of an incoming
message. This allows these implementations to avoid preallocating
maximum sized buffers to incoming packets which they can recognize as
unacceptable. Implementations which parse the network level format on
the fly are violating layering principles which have been extolled in
design for some time (but often violated in implementation). The
problem of postponing link level type recognition is a valid
criticism. In the case of network hardware which supports DMA both
arguments are moot.
Trailer Encapsulation Packet Formats
In this section we describe the link level packet formats used
on the 3 Mb/s and 10 Mb/s Ethernet networks as well as the 10 Mb/s
V2LNI ring network. The formats used in each case differ only
in the format and type field values used in each of the local area
network headers.
The format of a trailer packet is shown in the following diagram.
+----+-------------------------------------------------+----+
| LH |                      data                       | TH |
+----+-------------------------------------------------+----+
     ^                      ( ^ )                      ^
LH:
The fixed-size local network header. For a 10 Mb/s
Ethernet, the 16-byte Ethernet header. The type field
in the header indicates both the packet type (trailer)
and the length of the data segment.
For the 10 Mb/s Ethernet, the types are between
1001 and 1010 hexadecimal (4097 and 4112 decimal).
The type is calculated as 1000 (hex) plus the number
of 512-byte pages of data. A maximum of 16 pages of
data may be transmitted in a single trailer packet
(8192 bytes).
data:
The "data" portion of the packet. This is normally
only data to be delivered to the receiving processes
(i.e. it contains no TCP or IP header information).
Data size is always a multiple of 512 bytes.
TH:
The "trailer". This is actually a composition of
the original protocol headers and a fixed size
trailer prefix which defines the type and size
of the trailing data. The format of a trailer
is shown below.
The carets (^) indicate the page boundaries the receiving host
is expected to use in receiving a trailer packet. The link level
receiving routine is able to locate the trailer using the size
indicated in the link level header's type field. The receiving
routine is expected to discard the link level header and trailer prefix,
and remap the trailing data segment to the front of the packet to
regenerate the original network level packet format.
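The type field arithmetic described above for the 10 Mb/s Ethernet can
be sketched in Python as follows. The function and constant names are
hypothetical; only the base value 1000 (hex), the 512-byte page, and the
16-page limit come from the text.

```python
TRAILER_BASE = 0x1000  # trailer types are 0x1000 plus the page count
PAGE_SIZE = 512        # bytes per page of data
MAX_PAGES = 16         # at most 16 pages (8192 bytes) per trailer packet

def trailer_type(data_len):
    """Link level type for a trailer packet carrying data_len bytes."""
    if data_len <= 0 or data_len % PAGE_SIZE != 0:
        raise ValueError("data must be a positive multiple of 512 bytes")
    pages = data_len // PAGE_SIZE
    if pages > MAX_PAGES:
        raise ValueError("at most 16 pages per trailer packet")
    return TRAILER_BASE + pages

def data_length(link_type):
    """Recover the data segment size, and so the trailer offset, from the type."""
    pages = link_type - TRAILER_BASE
    if not 1 <= pages <= MAX_PAGES:
        raise ValueError("not a trailer type")
    return pages * PAGE_SIZE

print(hex(trailer_type(512)))    # 0x1001
print(hex(trailer_type(8192)))   # 0x1010
print(data_length(0x1002))       # 1024
```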
Trailer Format
+----------------+----------------+------~...~----------+
|      TYPE      | HEADER LENGTH  | ORIGINAL HEADER(S)  |
+----------------+----------------+------~...~----------+
Type: 16 bits
The type field encodes the original link level type of the
transmitted packet. This is the value which would normally
be placed in the link level header if a trailer were not
generated.
Header length: 16 bits
The header length field of the trailer data segment. This
specifies the length in bytes of the following header data.
Original headers: <variable length>
The header information which logically belongs before the
data segment. This is normally the network and transport
level protocol headers.
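The following Python sketch shows one way a sender might build the
portion of a trailer packet that follows the link level header, and how
a receiver might regenerate the original packet from it. It is an
illustration under stated assumptions, not the 4.2BSD code: the names
encapsulate and decapsulate are hypothetical, the two 16-bit fields are
assumed to be in network byte order (this note does not say), and the
header length is taken to count only the header bytes after the 4-byte
prefix, following the wording above. The value 0x0800 is used as an
example of an original type (IP on the 10 Mb/s Ethernet).

```python
import struct

TRAILER_BASE = 0x1000
PAGE_SIZE = 512
MAX_PAGES = 16
PREFIX = struct.Struct("!HH")  # TYPE, HEADER LENGTH (byte order assumed)

def encapsulate(orig_type, headers, data):
    """Return (link_type, payload) with the headers moved behind the data."""
    if not data or len(data) % PAGE_SIZE != 0:
        raise ValueError("data must be a positive multiple of 512 bytes")
    pages = len(data) // PAGE_SIZE
    if pages > MAX_PAGES:
        raise ValueError("at most 16 pages per trailer packet")
    prefix = PREFIX.pack(orig_type, len(headers))
    return TRAILER_BASE + pages, data + prefix + headers

def decapsulate(link_type, payload):
    """Return (orig_type, packet) with the headers back in front of the data."""
    pages = link_type - TRAILER_BASE
    if not 1 <= pages <= MAX_PAGES:
        raise ValueError("not a trailer type")
    data_len = pages * PAGE_SIZE
    if len(payload) < data_len + PREFIX.size:
        raise ValueError("packet too short for its trailer type")
    orig_type, hdr_len = PREFIX.unpack_from(payload, data_len)
    start = data_len + PREFIX.size
    headers = payload[start:start + hdr_len]
    if len(headers) != hdr_len:
        raise ValueError("truncated trailer")
    return orig_type, headers + payload[:data_len]

link_type, payload = encapsulate(0x0800, b"H" * 40, b"D" * 512)
orig_type, packet = decapsulate(link_type, payload)
```

A real receiver would remap the data pages rather than copy them; the
byte string concatenation here only shows the resulting layout.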
Summary
A link level encapsulation which promotes alignment properties
necessary for the efficient use of virtual memory hardware facilities
has been described. This encapsulation format is in use on many
systems and is a standard facility in 4.2BSD UNIX. The encapsulation
provides an efficient mechanism by which cooperating hosts on a local
network may obtain significant performance improvements. The use of
this encapsulation technique currently requires uniform cooperation
from all hosts on a network; hopefully a per host negotiation mechanism
may be added to allow consenting hosts to utilize the encapsulation in
a non-uniform environment.
-----------[000065][next][prev][last][first]---------------------------------------------------- Date: 16 Nov 1983 9:58:17 EST (Wednesday) From: Morton D. Hoffman <mdh@BBN-UNIX> To: tcp=ip@nic Cc: TCP-IP@nic, mdh@BBN-U