Network Working Group                                          M. Lottor
Request for Comments: 1296                             SRI International
                                      Network Information Systems Center
                                                            January 1992

                      Internet Growth (1981-1991)

Status of this Memo

   This memo provides information for the Internet community.  It does
   not specify an Internet standard.  Distribution of this memo is
   unlimited.

Abstract

   This document illustrates the growth of the Internet by examination
   of entries in the Domain Name System (DNS) and pre-DNS host tables.
   DNS entries are collected by a program called ZONE, which searches
   the Internet and retrieves data from all known domains.  Pre-DNS host
   table data were retrieved from system archive tapes.  Various
   statistics are presented on the number of hosts and domains.

Table of Contents

   Introduction
   How ZONE Works
   Problems with Data Collection
   Scope of the Study
   N. Results
   N.1 Number of Internet Hosts
   N.2 Number of Domains
   N.3 Distribution of IP Addresses per Host
   N.4 Distribution of Hosts by Top-level Domain
   N.5 Distribution of Hosts by Host Name
   Future Issues
   RFC References
   Security Considerations
   Author's Address

Introduction

   This document provides statistics on the growth of the Internet by
   examining the number of Internet hosts and domains over a 10-year
   period.

   Before the Domain Name System was established, practically all hosts
   on the Internet were registered with the Network Information Center
   (SRI-NIC) and entries were placed in the Official Host Table for each
   one.  Data on the number of hosts for pre-DNS
   years comes from copies of the host table at selected times.  The DNS
   was introduced around 1984 but took almost 4 years to be fully
   implemented on the Internet.  However, by this time many hosts were
   no longer registered in the Host Table.

   In 1986, the ZONE (Zealot Of Name Edification) program was written.
   ZONE was originally intended to be used during the host-table-to-DNS
   transition period.  ZONE would "walk" the DNS tree and build a host
   table of all the information it collected.  This host table could
   then be used by sites that had not yet made the DNS transition.
   However, ZONE was never used for this purpose.  Instead, it was found
   to be useful for collecting statistics on the size of the domain
   system and the Internet.  ZONE could not collect complete data on the
   DNS until around 1988, because early versions of BIND (the popular
   Unix DNS implementation) had major problems with the zone transfer
   function of the DNS protocol.

   ZONE has been used in varying ways ever since to collect this
   information.  In the first few years, it was used to produce a
   wall-size chart of the domain tree.  However, the number of domains
   quickly outgrew the size of the wall and the charts were abandoned.
   In later years, statistics on the number of hosts and domains were
   extracted from the resulting host table, sometimes categorizing data
   based on top-level domain names or on computer system type or
   manufacturer.  The time to gather the data also grew from hours to a
   week, and the size of the host table produced soon reached 50
   megabytes.  In order to reduce the amount of data collected, ZONE is
   now run in a mode collecting only host names and IP addresses,
   ignoring protocol, host information and MX record data.  The host
   table is then groveled over by some utilities (such as sort, uniq and
   grep) to produce the statistics required.  ZONE is currently run
   every 3 months at SRI.
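
   As an illustration of the kind of tally that sort and uniq produce
   from a list of host names, the following is a minimal sketch in
   Python.  It is not the tooling used at SRI.  The function name
   label_counts and the sample names are hypothetical, and the input is
   assumed to be one fully qualified domain name per entry.

```python
from collections import Counter

def label_counts(host_names, position):
    """Tally one label of each fully qualified name.

    position=-1 selects the top-level domain; position=0 selects the
    host name (the first part of the fully qualified domain name).
    """
    counts = Counter()
    for name in host_names:
        # Normalise case and drop a trailing root dot before splitting.
        labels = name.strip().rstrip(".").lower().split(".")
        if labels != [""]:  # skip empty entries
            counts[labels[position]] += 1
    return counts

# Hypothetical sample names, not taken from the ZONE data.
sample = [
    "venus.example.edu",
    "mars.example.edu",
    "venus.example.com",
    "GW.Example.COM.",
]
by_tld = label_counts(sample, -1)
by_host = label_counts(sample, 0)
print(by_tld.most_common())
print(by_host.most_common())
```

   The same function serves both a per-top-level-domain count and a
   per-host-name count, since the two differ only in which label of the
   name is used as the key.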

How ZONE Works

   ZONE maintains a list of domains and their servers and a flag
   indicating whether information for a domain has been successfully
   loaded from one of the servers.  Because of another bug in BIND, ZONE
   must be primed with a list of all the top-level domains and their
   name servers.  It then cycles through the domain list, attempting to
   contact one of the servers for each domain not yet transferred.  When
   a server is contacted (via TCP), a Start of Authority (SOA) query is
   first sent to make sure the server is authoritative for the domain
   being requested.  If so, then a zone transfer query (AXFR) is sent to
   request all the resource records for the domain to be retrieved.
   When a name server record (NS) is received, the referenced domain and
   server are added to the list of domains to process.  When host
   records (A, CNAME, HINFO, MX) are received, they are added to an
   in-core table of host information.  The program ends when it has
   cycled through the entire list of domains without receiving any new
   information.  It then dumps the table of host information to a
   HOSTS.TXT format file.

Problems with Data Collection

   For various reasons, some Internet sites do not allow zone transfers
   of their domain servers.  ZONE also eventually gives up trying to
   transfer a domain after too many failures.  The number of domains
   that could not be zone transferred during the 1-Jan-92 ZONE run was
   around 800 out of 17,000.  Additionally, it is assumed that not all
   hosts on the Internet are registered in a domain server.  These
   problems cause the statistics gathered by ZONE to be lower than the
   actual amounts.

   Manual review of some of the data collected by ZONE also shows a lot
   of random entries in the DNS.  Misformatted entries may cause bogus
   server or host records to appear.  Often a server is found not to be
   authoritative for the domain listed.  Sometimes entire domains are
   renamed and their old entries left in place for a transition period,
   thus causing each host within that domain to be counted twice.  These
   problems cause the results of ZONE to be higher than the actual
   amounts.

   Manual scanning of the data indicates that the additional entries are
   insignificant compared to the missing entries discussed earlier.
   ZONE data can thus be viewed as the minimum number of Internet hosts,
   and not the actual figures.

   A final problem with data collection is that of expense.  Downloading
   domain information from every domain on the Internet generates a
   large amount of network traffic.  It also puts an extra CPU load on
   each domain server it must contact.
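
   To make the collection procedure and its give-up behaviour concrete,
   the following is a minimal sketch in Python of the domain walk
   described under How ZONE Works.  It is not the ZONE implementation,
   which is written in DEC-20 assembler.  The names walk, transfer,
   MAX_FAILURES and ZONES are hypothetical.  The transfer callable
   stands in for the SOA check and AXFR query made over TCP, and the
   walk stops when every domain has been transferred or abandoned, a
   simplification of cycling until no new information arrives.

```python
from collections import deque

MAX_FAILURES = 3  # assumed limit; the memo only says "too many failures"

def walk(seed, transfer):
    """Walk the domain tree starting from `seed`.

    seed:     dict mapping each top-level domain to a non-empty list of
              its name servers (the priming list).
    transfer: callable(domain, server) returning a list of
              (name, type, data) records, or None when the server is not
              authoritative or refuses the transfer.
    Returns (hosts, failed): host records keyed by name, and the set of
    domains that were given up on.
    """
    servers = {d: list(s) for d, s in seed.items()}
    failures = {d: 0 for d in servers}
    pending = deque(servers)
    hosts, failed = {}, set()
    while pending:
        domain = pending.popleft()
        # Rotate through the known servers on successive attempts.
        server = servers[domain][failures[domain] % len(servers[domain])]
        records = transfer(domain, server)
        if records is None:
            failures[domain] += 1
            if failures[domain] >= MAX_FAILURES:
                failed.add(domain)
            else:
                pending.append(domain)  # retry later
            continue
        for name, rtype, data in records:
            if rtype == "NS" and name != domain:
                # Delegation: queue a newly seen domain exactly once.
                if name not in servers:
                    servers[name] = []
                    failures[name] = 0
                    pending.append(name)
                if data not in servers[name]:
                    servers[name].append(data)
            elif rtype in ("A", "CNAME", "HINFO", "MX"):
                hosts.setdefault(name, []).append((rtype, data))
    return hosts, failed

# Hypothetical zone contents used in place of real zone transfers.
ZONES = {
    ("edu", "ns.edu-root.example"): [
        ("edu", "NS", "ns.edu-root.example"),
        ("campus.edu", "NS", "ns.campus.edu"),
    ],
    ("campus.edu", "ns.campus.edu"): [
        ("venus.campus.edu", "A", "192.0.2.1"),
        ("mars.campus.edu", "A", "192.0.2.2"),
    ],
}
seed = {"edu": ["ns.edu-root.example"], "com": ["ns.refused.example"]}
hosts, failed = walk(seed, lambda d, s: ZONES.get((d, s)))
print(sorted(hosts), failed)
```

   Injecting the transfer function keeps the sketch runnable without
   network access and makes the source of the undercount visible: a
   domain whose servers refuse transfers ends up in the failed set and
   contributes no hosts.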

   An organized effort might be considered in which only one such
   program runs on the Internet, at regularly scheduled intervals, so
   that the load from multiple data collectors does not arise.

Scope of the Study

   A problem with counting hosts and domains on the Internet is defining
   what the Internet really is.  Finding host entries in the DNS does
   not necessarily indicate that the host is reachable from the
   Internet.  Many companies have mail gateways between the Internet and
   their local nets, thus disallowing direct access.  However, some of
   these companies advertise all their hosts, and some advertise only
   the gateway.  Are these hosts on the Internet or not?  Furthermore,
   many domains in the DNS are just mail-forwarding (MX) entries for
   off-Internet (such as Usenet) sites.  Are these domains really part
   of the Internet and should they be counted in an Internet size study?

   For the purposes of this study, a host has been defined as a
   [name(s),IP-address(es)] grouping discovered from the DNS.  This
   prevents us from counting a host with multiple names or addresses
   more than once.  However, this does not consider whether the host is
   directly accessible or not.  When ZONE counts the number of domains
   it includes all domains referenced by an NS record in the DNS, thus
   including MX-only domain sites in the final results.

N. Results

   This section presents data from archive tapes of SRI-NIC from 1981 to
   1986, and statistics gathered by runs of ZONE from 1986 to 1992.

N.1 Number of Internet Hosts

   The chart below shows the number of IP hosts on the Internet.  These
   are hosts with at least one IP address assigned.  Data was collected
   by ZONE except where noted.  The following two sections are graphs of
   data in this chart.

         Date       Hosts
         -----    ---------
         08/81          213    Host table #152
         05/82          235    Host table #166
         08/83          562    Host table #300
         10/84        1,024    Host table #392
         10/85        1,961    Host table #485
         02/86        2,308    Host table #515
         11/86        5,089
         12/87       28,174
         07/88       33,000
         10/88       56,000
         01/89       80,000
         07/89      130,000
         10/89      159,000
         10/90      313,000
         01/91      376,000
         07/91      535,000
         10/91      617,000
         01/92      727,000
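
   As a worked example of reading a growth rate out of these figures,
   the sketch below estimates a doubling time between two rows of the
   chart.  It assumes steady exponential growth between the two chosen
   points, which the memo does not claim.  The function name
   doubling_time_months is hypothetical, and the rows are copied from
   the chart above.

```python
import math

# (year, month, hosts) rows copied from the chart; ZONE-era points only.
HOSTS = [
    (1988, 10, 56_000),
    (1989, 10, 159_000),
    (1990, 10, 313_000),
    (1991, 10, 617_000),
    (1992, 1, 727_000),
]

def doubling_time_months(first, last):
    """Months needed to double, assuming exponential growth between two rows."""
    (y0, m0, h0), (y1, m1, h1) = first, last
    months = (y1 - y0) * 12 + (m1 - m0)
    # h1 = h0 * 2 ** (months / T)  =>  T = months * ln 2 / ln(h1 / h0)
    return months * math.log(2) / math.log(h1 / h0)

overall = doubling_time_months(HOSTS[0], HOSTS[-1])
last_year = doubling_time_months(HOSTS[2], HOSTS[3])
print(f"10/88 to 01/92: doubling about every {overall:.1f} months")
print(f"10/90 to 10/91: doubling about every {last_year:.1f} months")
```

   Under that assumption, the host count over the 39 months from 10/88
   to 01/92 doubled a little more often than once a year.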

   [Figure: Number of Internet Hosts (linear).  Vertical axis:
   Thousands of Hosts, 0 to 800.  Horizontal axis: Date, 81 to 92.
   "*" = data point, "." = estimate.]

   This graph is a linear plot of the number of Internet hosts.

   [Figure: Number of Internet Hosts (logarithmic).  Vertical axis:
   Hosts, 100 to 1000000.  Horizontal axis: Date, 81 to 92.
   "*" = data point, "." = estimate.]

   This graph is a logarithmic plot of the number of Internet hosts.

N.2 Number of Domains

   This chart shows the number of domains existing in the Internet
   Domain Name System as collected by ZONE.

         Date      Domains
         -----    ---------
         07/88          900
         10/88        1,280
         01/89        2,600
         07/89        3,900
         10/89        4,800
         10/90        9,300
         01/91       11,200
         07/91       16,000
         10/91       18,000
         01/92       17,000

N.3 Distribution of IP Addresses per Host

   This chart shows how many hosts have how many IP addresses.  This
   data was collected on 1-Jan-92 and only the first 10 entries are
   shown.

         Addresses     Hosts
         ---------    -------
              1        715143
              2          9015
              3          1027
              4           556
              5           314
              6           213
              7           100
              8            85
              9            58
             10            71
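
   A distribution of this kind depends on first grouping names and
   addresses into hosts, as defined under Scope of the Study.  The
   following is a minimal sketch in Python of one way to do that, using
   a union-find structure over (name, address) pairs.  This is a
   swapped-in technique for illustration, not the method used by ZONE.
   The function name group_hosts and the sample records are
   hypothetical.

```python
from collections import Counter

def group_hosts(records):
    """Merge (name, address) pairs into hosts.

    Two pairs belong to the same host when they share a name or an
    address.  Returns a list of (names, addresses) set pairs, one per
    host.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Link every name to every address it appears with.
    for name, addr in records:
        a, b = find(("name", name)), find(("addr", addr))
        if a != b:
            parent[a] = b

    # Collect the names and addresses that ended up under each root.
    hosts = {}
    for kind, value in list(parent):
        names, addrs = hosts.setdefault(find((kind, value)), (set(), set()))
        (names if kind == "name" else addrs).add(value)
    return list(hosts.values())

# Hypothetical sample records, not taken from the ZONE data.
records = [
    ("a.example.edu", "192.0.2.1"),
    ("a.example.edu", "192.0.2.2"),
    ("alias.example.edu", "192.0.2.2"),
    ("b.example.edu", "192.0.2.9"),
]
hosts = group_hosts(records)
per_host = Counter(len(addrs) for _, addrs in hosts)
for n_addresses in sorted(per_host):
    print(n_addresses, per_host[n_addresses])
```

   In the sample, the alias shares an address with a.example.edu, so the
   two names and two addresses are counted as a single host with two
   addresses.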

N.4 Distribution of Hosts by Top-level Domain

   This chart shows the number of hosts per top-level domain (top 40
   only) on 1-Jan-92.  The percentage listed is the increase since
   1-Oct-91.  Large variations are probably due to problems and
   variations in the collection process; these figures are not meant to
   be authoritative, but serve as reasonable estimates.

   243020 edu   13%   13011 fr     4%   1791 dk     4%   357 be    -5%
   181361 com   12%   12770 nl    21%   1662 es    15%   334 gr    14%
    46463 gov   13%   12647 ch    10%   1506 kr     9%   308 br    26%
    31622 au    19%   11994 fi    15%   1111 nz   -16%   284 mx    -5%
    31016 de    20%   10228 no     9%   1016 tw    n/a   207 is     0%
    27492 mil   26%    8579 jp     6%    929 za    n/a   146 pl    97%
    27052 ca    22%    4109 net  -49%    784 pt    n/a   127 us    25%
    19117 org   10%    3324 at    19%    484 sg   251%    25 tn     0%
    18984 uk   139%    2719 it   197%    448 hk    78%    24 hu    71%
    18473 se    34%    2020 il    14%    374 ie    -7%     6 arpa   0%

N.5 Distribution of Hosts by Host Name

   This chart shows the distribution of hosts by their host name on
   1-Jan-92.  The host name is defined to be the first part of a fully
   qualified domain name.  Only the top 100 names are shown.

   384 venus     204 mac4      172 mac9      155 pollux    138 chaos
   356 pluto     201 hobbes    172 mac11     155 frodo     136 bart
   323 mars      201 hermes    170 mac8      153 helios    135 pc5
   288 jupiter   198 thor      169 phoenix   152 mac17     135 larry
   286 saturn    198 sirius    169 mac12     151 vega      135 cs
   285 pc1       196 gw        169 hal       151 mac18     133 odin
   282 zeus      195 calvin    168 snoopy    150 falcon    131 tiger
   262 iris      194 mac5      168 mac13     150 bach      131 sparky
   260 mercury   191 mac10     167 mac15     146 castor    131 ariel
   259 mac1      190 fred      167 mac14     145 sol       130 sneezy
   258 orion     189 titan     167 grumpy    145 dopey     128 mac
   254 mac2      189 pc3       163 gandalf   144 mac20     127 sun1
   240 newton    186 opus      162 pc4       144 mac19     127 rocky
   234 neptune   186 mac6      160 uranus    142 spock     126 pc6
   233 pc2       185 charon    159 mac16     142 euler     125 hydra
   224 gauss     185 apollo    158 sleepy    141 mickey    125 homer
   222 eagle     179 mac7      158 io        141 atlas     124 isis
   213 mac3      179 athena    157 earth     140 maxwell   123 moe
   209 merlin    177 alpha     156 europa    140 happy     123 delta
   207 cisco     172 mozart    155 rigel     140 doc       122 pc10

Future Issues

   ZONE currently runs on a DECsystem-20 and is written in assembler.
   The amount of data is quickly reaching the limits of the DEC-20
   section address space, and the hardware's ability to survive gets
   slimmer each day.  ZONE assembles all its data in core before dumping
   it to disk.  The implementation does this in order to be able to
   match host nicknames with official names before dumping complete host
   records.  Sometimes a nickname can be in a different domain than the
   official name, complicating simpler methods.

   A new version of ZONE needs to be written to run on a modern computer
   system.  A completely new architecture should be designed to handle
   the enormous amount of data collected and expected in the future.
   Data should be kept on disk so that a system crash will not wipe out
   days of collection.  Multiple zone transfers could be occurring in
   parallel to reduce the time needed for data gathering.

   A new ZONE might run continuously, cycling through the domain system
   on a cycle lasting weeks to a month, updating a local database with
   statistics collected for each domain.  In this way, current
   statistics on the size of the Internet would always be known.  The
   resulting database
   may also be useful for other network information services.

RFC References

   Libes, D., "Choosing a Name for Your Computer", RFC 1178, Integrated
   Systems Group/NIST, August 1990.  (Also FYI 5.)

   Mockapetris, P., "Domain Names - Implementation and Specification",
   RFC 1035, USC/Information Sciences Institute, November 1987.

   Mockapetris, P., "Domain Names - Concepts and Facilities", RFC 1034,
   USC/Information Sciences Institute, November 1987.

   Lazear, W., "MILNET Name Domain Transition", RFC 1031, Mitre,
   November 1987.

   Harrenstien, K., Stahl, M., and J. Feinler, "DoD Internet Host Table
   Specification", RFC 952, SRI, October 1985.

   Postel, J., "Domain Name System Implementation Schedule - Revised",
   RFC 921, USC/Information Sciences Institute, October 1984.

Security Considerations

   Security issues are not discussed in this memo.

Author's Address

   Mark K. Lottor
   SRI International
   Network Information Systems Center
   333 Ravenswood Avenue, EJ282
   Menlo Park, CA 94025

   EMail: mkl@nisc.sri.com