Week #02
Quick note: I've found it more helpful to write my blog posts as I'm doing the work and not after. Kindly take this into consideration while reading :-)
I started by just running the traceroute command on a bunch of URLs I visit often to see what I get. I noticed that after a few hops it almost always starts outputting asterisks. In class it was mentioned that asterisks mean there's a problem, some trouble reaching the next hop. I thought this would only happen if the address is actually down, but the addresses I tried were very much alive when accessed through the browser. After reading some more I learned that there could be a number of reasons for the traceroute command to result in asterisks, namely:
traceroute command so as to reduce network load.
The traceroute command sends a little query it expects to get an answer for, with a progressively increasing time-to-live (TTL) value. This number is basically how many nodes said packet can go through on its way to the destination. The first probe has a limit of just 1, so it returns almost immediately, and so on. The node limit increases with every hop until the destination is reached, at which point the destination server might respond differently or not at all, hence the asterisks(?)
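To check my understanding of the increasing hop limit, here is a tiny simulation in Python. It sends no real packets: the path, the addresses, and the `simulate_traceroute` function are all made up for illustration, and each hop simply carries a flag saying whether it answers probes.

```python
def simulate_traceroute(path, max_hops=30):
    """Simulate traceroute output over a made-up path.

    path is a list of (address, replies) tuples; the last entry is the
    destination. Nothing is sent over the network.
    """
    lines = []
    for ttl in range(1, max_hops + 1):
        # A probe expires at the ttl-th node, or stops at the destination
        # if the limit is larger than the path is long.
        index = min(ttl, len(path)) - 1
        address, replies = path[index]
        shown = address if replies else "* * *"
        lines.append(f"{ttl:2d}  {shown}")
        # Stop once the destination itself has answered.
        if replies and index == len(path) - 1:
            break
    return lines


if __name__ == "__main__":
    # Hypothetical route: the third hop never answers probes.
    demo_path = [
        ("192.168.1.1", True),
        ("10.0.0.1", True),
        ("203.0.113.5", False),
        ("198.51.100.20", True),
    ]
    for line in simulate_traceroute(demo_path):
        print(line)
```

In this toy model a silent router in the middle produces a single row of asterisks, while a silent destination produces asterisks all the way up to `max_hops`, which looks a lot like what I was seeing.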
I used the Traceroute Mapper tool to visualize some routes on a map:
It seemed like all routes always go through two nodes: one in Hicksville and one in Rhinebeck. I know Optimum (my ISP at home) has facilities in Hicksville, and I guess Rhinebeck is another central node in Optimum's network infrastructure, from which there might be a direct path to the destination? I also realized that the numbers correspond to the hops (the max number does match the number of entries in the printed route), but I'm not sure why I'm not seeing all the points in between?
I realized the first two hops are basically me (my router and inner network?), as they remained the same with every traceroute. The next few nodes are mostly the same for all routes and they lead to my ISP. Some lead to Hicksville and some to Rhinebeck (some hops included multiple IPs, and I learned it's probably load balancing: "...a technique used to distribute network traffic across multiple routers or network interfaces to ensure that no single path becomes overloaded"). Anyway, the yellow cells are the first nodes per route that go somewhere "outside", out of my ISP's network:
traceroute packets...
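I did the comparison between routes by hand, but it could probably be scripted. Below is a rough Python sketch, assuming each route is saved as a list of hop addresses with "*" for silent hops. The addresses, the site names, and the "inside" network ranges are placeholders rather than my real data, and the helper names `shared_prefix` and `first_outside_hop` are my own.

```python
import ipaddress


def shared_prefix(routes):
    """Return the leading hops that are identical across all routes."""
    prefix = []
    for hops in zip(*routes):
        if len(set(hops)) != 1:
            break
        prefix.append(hops[0])
    return prefix


def first_outside_hop(route, inside_networks):
    """Return (hop_number, address) of the first hop outside the given networks."""
    for number, hop in enumerate(route, start=1):
        if hop == "*":
            continue  # silent hop, nothing to classify
        address = ipaddress.ip_address(hop)
        if not any(address in network for network in inside_networks):
            return number, hop
    return None


if __name__ == "__main__":
    # Placeholder ranges: the home network plus a stand-in for the ISP's range.
    inside = [
        ipaddress.ip_network("192.168.0.0/16"),
        ipaddress.ip_network("198.51.100.0/24"),
    ]
    # Placeholder routes, not real traceroute output.
    routes = {
        "site-a.example": ["192.168.1.1", "198.51.100.1", "198.51.100.9", "203.0.113.7", "203.0.113.80"],
        "site-b.example": ["192.168.1.1", "198.51.100.1", "198.51.100.14", "*", "192.0.2.33"],
    }
    print("shared hops:", shared_prefix(list(routes.values())))
    for site, route in routes.items():
        print(site, "leaves the ISP at", first_outside_hop(route, inside))
```

The hard part in practice would be knowing which address ranges actually belong to the ISP; here they are simply passed in as an assumption.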
Then I was interested in getting my browsing history. First I had to get that file, so I googled and asked ChatGPT and got general instructions. I'm not using Chrome, but by following a similar path I located the history file for my browser (Arc, which is Chromium-based too).
Tried opening that file but it turned out to be quite heavy (108MB) and illegible for the most part. That file was created on January 27, 2023 so it has been collecting my history for quite a while (maybe in the future I could get that file for every browser I've used in the past 10 years and see trends and differences in my browsing habits over time?).
I then looked up ways to read the file with the SQLite library and managed to print my top 50 most visited sites, which... did not make much sense to me. Some of the entries, like Gmail, YouTube, and clients' websites, make total sense. But a lot of others, like some spreadsheets for students to sign up for office hours with me, or some address on Google Maps: there's no way I visited these URLs so many times! This might be an Arc (my browser) thing. It is possible it keeps some favorites accessible at all times, and that could explain why it might be loading some URLs all the time even if I never requested to open them.
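For reference, the query was roughly along these lines. This is a sketch rather than my exact code: it assumes the history file follows the usual Chromium layout, with a `urls` table that has `url`, `title`, and `visit_count` columns, and it runs against a small in-memory database with made-up rows so it works without a real history file.

```python
import sqlite3


def top_visited(conn, limit=50):
    """Return (url, title, visit_count) rows, most visited first."""
    query = """
        SELECT url, title, visit_count
        FROM urls
        ORDER BY visit_count DESC
        LIMIT ?
    """
    return conn.execute(query, (limit,)).fetchall()


def make_demo_db():
    """Build an in-memory stand-in for the history file (made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE urls ("
        "id INTEGER PRIMARY KEY, url TEXT, title TEXT, visit_count INTEGER)"
    )
    conn.executemany(
        "INSERT INTO urls (url, title, visit_count) VALUES (?, ?, ?)",
        [
            ("https://docs.example.com/office-hours", "Office hours sign-up", 175),
            ("https://mail.example.com/", "Mail", 412),
            ("https://maps.example.com/some-address", "Some address", 198),
            ("https://video.example.com/", "Videos", 230),
        ],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    for url, title, count in top_visited(make_demo_db(), limit=3):
        print(f"{count:5d}  {title}  {url}")
```

With a real file, `sqlite3.connect()` would take the path to a copy of the history file instead of `":memory:"`. Working on a copy seems safer, since the browser may keep the original locked while it is running.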
I was wondering what other information I could get and asked ChatGPT for help with the commands. It helped me get a list of visits per hour, from which I made a chart, because why not.
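The visits-per-hour count can be sketched like this. Again, this is an approximation of what I ran and not the exact commands: it assumes the Chromium-style `visits` table, where `visit_time` is stored as microseconds since January 1, 1601 (UTC), and it uses an in-memory database with three invented visits. Hours come out in UTC, so they would still need shifting to local time.

```python
import sqlite3
from collections import Counter
from datetime import datetime, timedelta, timezone

# Chromium-style timestamps count microseconds from this date.
CHROMIUM_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def from_chromium_time(microseconds):
    """Convert a stored visit_time value to a UTC datetime."""
    return CHROMIUM_EPOCH + timedelta(microseconds=microseconds)


def to_chromium_time(moment):
    """Convert a timezone-aware datetime to a stored visit_time value."""
    return (moment - CHROMIUM_EPOCH) // timedelta(microseconds=1)


def visits_per_hour(conn):
    """Return a list of 24 counts, one per hour of the day (UTC)."""
    counts = Counter()
    for (visit_time,) in conn.execute("SELECT visit_time FROM visits"):
        counts[from_chromium_time(visit_time).hour] += 1
    return [counts.get(hour, 0) for hour in range(24)]


def make_demo_db():
    """Build an in-memory visits table with three invented visits."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (id INTEGER PRIMARY KEY, visit_time INTEGER)")
    moments = [
        datetime(2023, 1, 27, 9, 15, tzinfo=timezone.utc),
        datetime(2023, 1, 27, 9, 40, tzinfo=timezone.utc),
        datetime(2023, 1, 27, 22, 5, tzinfo=timezone.utc),
    ]
    conn.executemany(
        "INSERT INTO visits (visit_time) VALUES (?)",
        [(to_chromium_time(m),) for m in moments],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    # Print a crude text chart, one row per hour.
    for hour, count in enumerate(visits_per_hour(make_demo_db())):
        print(f"{hour:02d}:00  {'#' * count}")
```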
I wanted to make a map of "my network" by just putting screenshots from the Traceroute Mapper on top of each other, but the scale was too large to see anything useful, so I created a Google Maps list instead. As I was looking through the places on Maps I saw that most of the locations seem like commercial buildings, and I get why the companies are not listed on the map, but one location was just this house in Jersey... And I doubt the route to Facebook goes through a hidden server farm in this place; the whole area seems to be entirely residential...
traceroute on my website's address lead to me? Or to my ISP?