Fundamentals of Traffic Modeling
Internet traffic data are ferocious. Their statistical properties are complex, and the databases are very large. The protocols are complex and introduce feedback into the traffic system. Added to this is the vastness of the Internet topology. All of this challenges analysis and modeling. Most Internet traffic data can be thought of as data indexed by time: a point process, a marked point process, or a time series. The start times of TCP connection flows for HTTP on an Internet wire are a point process. If we attach to each of these start times the size of the file downloaded from the server to the client, the result is a marked point process. Byte counts of aggregate traffic summed over equally spaced intervals are a time series.
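The three views of the same traffic can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the data are synthetic, the exponential gaps and lognormal file sizes are arbitrary choices rather than a traffic model, and the function names are hypothetical. It also credits all of a file's bytes to the interval containing the connection start, which is a simplification of how bytes actually arrive on a wire.

```python
import random

def simulate_connections(n, rate, seed=0):
    """Return n (start_time, file_size_bytes) pairs in time order.

    Synthetic and illustrative only: exponential gaps and lognormal
    sizes are assumed here, not taken from measured traffic.
    """
    rng = random.Random(seed)
    t = 0.0
    records = []
    for _ in range(n):
        t += rng.expovariate(rate)                     # assumed gap distribution
        size = int(rng.lognormvariate(8.0, 1.5)) + 1   # assumed size distribution
        records.append((t, size))
    return records

def byte_count_series(records, interval):
    """Sum the marks over equally spaced intervals [k*interval, (k+1)*interval)."""
    if not records:
        return []
    n_bins = int(records[-1][0] // interval) + 1
    counts = [0] * n_bins
    for t, size in records:
        counts[int(t // interval)] += size
    return counts

records = simulate_connections(n=1000, rate=50.0)

start_times = [t for t, _ in records]               # point process
marked = records                                    # marked point process
series = byte_count_series(records, interval=1.0)   # time series
```

The point process keeps only when events happen, the marked point process keeps a value with each event, and the time series discards the individual events in favor of totals per interval.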
The aggregate HTTP start times on an Internet wire are a superposition of traffic sources. This is true in general for traffic variables on live Internet wires. For example, aggregate packet processes and aggregate byte counts are superpositions of traffic sources. It is vital to exploit superposition to uncover the characteristics of Internet traffic. In so doing, we exploit the fundamental structure of the traffic. We can operate mathematically, using the theory of superposition of point processes, marked point processes, and time series. We can operate empirically, studying the data and how they change as the number of sources changes.
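The empirical route can be sketched on synthetic data: merge the arrival times of m independent sources and track a summary statistic as m grows. The sketch below is an illustration under stated assumptions, not a model of real traffic. Each hypothetical source is a renewal process with uniform gaps, chosen only because its inter-arrival coefficient of variation is well below 1, and the function names are invented for this example.

```python
import heapq
import random
import statistics

def source_arrivals(horizon, rng):
    """Arrival times of one synthetic source on [0, horizon).

    Gaps are uniform on (0, 2): an assumption made for illustration.
    """
    t = rng.uniform(0.0, 2.0)   # random initial phase
    times = []
    while t < horizon:
        times.append(t)
        t += rng.uniform(0.0, 2.0)
    return times

def superpose(sources):
    """Merge sorted per-source arrival lists into one sorted aggregate."""
    return list(heapq.merge(*sources))

def interarrival_cv(times):
    """Coefficient of variation of the gaps between successive arrivals."""
    gaps = [b - a for a, b in zip(times, times[1:])]
    return statistics.pstdev(gaps) / statistics.mean(gaps)

rng = random.Random(1)
cv_by_m = {}
for m in (1, 5, 50):
    sources = [source_arrivals(500.0, rng) for _ in range(m)]
    cv_by_m[m] = interarrival_cv(superpose(sources))
```

Under these assumptions the coefficient of variation should move toward 1, the value for a Poisson process, as m increases. This matches the classical result that a superposition of many independent sparse point processes behaves locally like a Poisson process. Whether measured traffic behaves this way is exactly the kind of question the empirical study is meant to answer.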
The notion of how we define a source for analysis purposes needs more thought and trial with data. We can take sources to be users. However, a network is often a network of sub-networks. So we could take each source in such a case to be the traffic of one sub-network.
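The two choices of source can be compared on the same records by changing only the grouping key. The following is a minimal sketch with assumptions flagged: it treats a client IP address as a stand-in for a user, treats a fixed-length address prefix as a stand-in for a sub-network, and uses made-up records with addresses from the ranges reserved for documentation. The function name group_by_source is hypothetical.

```python
import ipaddress
from collections import defaultdict

def group_by_source(records, prefix_len=None):
    """Group connection start times by source.

    records: iterable of (client_ip, start_time) pairs.
    prefix_len=None treats each client IP as one source (a proxy for a user);
    an integer treats each enclosing network of that prefix length as one source.
    """
    groups = defaultdict(list)
    for client_ip, start_time in records:
        if prefix_len is None:
            key = client_ip
        else:
            # strict=False lets a host address stand for its enclosing network
            network = ipaddress.ip_network(f"{client_ip}/{prefix_len}", strict=False)
            key = str(network)
        groups[key].append(start_time)
    return dict(groups)

# Made-up records for illustration only.
records = [
    ("192.0.2.10", 0.4),
    ("192.0.2.77", 1.1),
    ("198.51.100.5", 1.3),
    ("192.0.2.10", 2.0),
]

by_user = group_by_source(records)                   # three sources
by_subnet = group_by_source(records, prefix_len=24)  # two sources
```

The aggregate traffic is the same in both cases. Only the decomposition into sources changes, and with it the number of sources and the behavior of each one.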
But there is another method of approaching superposition that avoids explicit identification of sources.