Wednesday, June 5, 2013

The Inaugural Post

This post won't be shocking. It won't be amazing. You may even forget it ten minutes from now. That is, until you need to do something similar.

I was doing some research on a bunch of URLs and many of their names were not indicative of their use. Unfortunately, the web-filtering category that they belong to is a bit broad and I needed more fine-grained information on what kind of content they contained. I really needed to see the sites for myself.

I could have opened a web browser, highlighted each URL, hit Ctrl-C, Ctrl-V, and so on, but with over 100 sites to look at that wasn't going to happen. There are already going to be more manual steps than I would like in this process, and I'm WAY too lazy, er, efficient for all that.

PowerShell to the rescue.

Before we get started, it's important to note that there was no reason to be overly concerned with malicious content in this case. Don't do this if you're investigating the source of a malware outbreak, for example, unless you're actually trying to get your grubby little hands on the malware for fun or profit.

Step 1: Get the list to a manageable size. 
I already had the URLs in a text file, one per line. However, opening 100 browser tabs at the same time may bog down my system a bit, so I needed to break the list into multiple chunks. First, set some variables (yes, PowerShell can use variables from the command line), read the content of the original file, then loop through it, writing 20 URLs to each output file (urls1.txt, urls2.txt, etc.). For 100 URLs that comes to 5 files.

PS \> $lines = 0
PS \> $max = 20
PS \> $num = 1
PS \> get-content urls.txt | foreach{
>> add-content urls$num.txt "$_"
>> $lines ++
>> if($lines -eq $max){
>> $num ++
>> $lines = 0
>> }
>> }
PS \>

The >> is PowerShell's continuation prompt. It shows up when I hit Return on a line that leaves the statement unfinished. This makes the command easier to read than putting everything on one ridiculously long command line. Because the curly brace that begins the ForEach loop hasn't been closed yet, you can just keep going.
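If you'd rather not retype the loop, here is a rough sketch of the same idea wrapped in a function. The name Split-UrlList and its parameters are my own invention for illustration, not anything built in. It also assumes you want blank lines skipped, and it uses Set-Content instead of Add-Content so that running it twice overwrites the chunk files rather than appending duplicates.

```powershell
# Hypothetical helper: split a list of lines into numbered chunk files.
function Split-UrlList {
    param(
        [string]$Path = 'urls.txt',   # input file, one URL per line
        [int]$ChunkSize = 20,         # URLs per output file
        [string]$Prefix = 'urls'      # output files become urls1.txt, urls2.txt, ...
    )

    # Skip blank lines so they don't turn into empty tabs later.
    $urls = @(Get-Content -Path $Path | Where-Object { $_.Trim() -ne '' })

    for ($i = 0; $i -lt $urls.Count; $i += $ChunkSize) {
        $last = [Math]::Min($i + $ChunkSize, $urls.Count) - 1
        $chunkNumber = [int]($i / $ChunkSize) + 1
        # Set-Content overwrites, so re-running doesn't append duplicates.
        Set-Content -Path "$Prefix$chunkNumber.txt" -Value $urls[$i..$last]
    }
}
```

Calling Split-UrlList with no arguments from the folder that holds urls.txt should produce the same urls1.txt, urls2.txt, etc. as the loop above.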

Step 2: Open a bunch of browser tabs.
For this we need to create an instance of the Internet Explorer COM object, get the contents of urls1.txt, then loop through the list opening each URL in its own tab.

PS \> $ie = new-object -comobject InternetExplorer.Application
PS \> $ie.visible = $true
PS \> get-content urls1.txt | foreach{$ie.navigate2($_,0x1000)}
PS \>

The 0x1000 passed as the second argument to navigate2 is IE's navOpenInBackgroundTab flag, which tells IE to open the URL in a new background tab. If you leave it out, it will attempt to open each URL in the same tab, each one replacing the last.
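Magic numbers are easy to forget, so here is a small sketch that gives the flag a name and wraps the one-liner in a function. The two values come from IE's BrowserNavConstants enumeration. The function name Open-UrlFile and its parameters are hypothetical, and skipping blank lines is my own assumption.

```powershell
# Values from IE's BrowserNavConstants enumeration.
$navOpenInNewTab        = 0x800    # open in a new tab
$navOpenInBackgroundTab = 0x1000   # open in a new tab behind the current one

# Hypothetical helper: open every URL in a file, one tab per URL.
function Open-UrlFile {
    param(
        $Browser,                 # an InternetExplorer.Application COM object
        [string]$Path,            # text file, one URL per line
        [int]$Flags = 0x1000      # defaults to navOpenInBackgroundTab
    )
    Get-Content -Path $Path |
        Where-Object { $_.Trim() -ne '' } |
        ForEach-Object { $Browser.Navigate2($_, $Flags) }
}
```

With the $ie object from above, that would be called as: Open-UrlFile -Browser $ie -Path urls1.txt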

Step 3: Look at each tab with your stinkin' eyeballs.
Sorry. I can't help you with this one. Just don't close the browser when you're done. Close all but one tab.

Step 4: Repeat Steps 2 and 3 for each of the URL files.
Actually, because you left the Internet Explorer window open, you only need to repeat the last command of Step 2, changing the file name each time (urls2.txt, urls3.txt, and so on). Sure, you could probably spend a lot more time writing a process that watches for each tab to be closed and then automatically opens another until you've looped through the entire set, but this was supposed to be a quick and easy way to get this done.

* Note:
I did all this from the PowerShell command line, but you could make it a script if you prefer.
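For the curious, a script version of Steps 2 through 4 might look something like the sketch below. I haven't polished it. The function names Get-UrlBatchFile and Open-UrlBatches are made up for illustration, and it assumes the chunk files sit in one folder and are named urls1.txt, urls2.txt, and so on.

```powershell
# Hypothetical helper: list urlsN.txt files in numeric order (urls.txt itself is skipped).
function Get-UrlBatchFile {
    param([string]$Directory = '.')
    Get-ChildItem -Path $Directory -Filter 'urls*.txt' |
        Where-Object { $_.BaseName -match '^urls\d+$' } |
        Sort-Object { [int]($_.BaseName -replace '\D', '') }
}

# Hypothetical helper: open one batch at a time, waiting between batches.
function Open-UrlBatches {
    param(
        $Browser,                    # an InternetExplorer.Application COM object
        [string]$Directory = '.'     # folder holding the urlsN.txt files
    )
    foreach ($file in (Get-UrlBatchFile -Directory $Directory)) {
        Get-Content -Path $file.FullName |
            ForEach-Object { $Browser.Navigate2($_, 0x1000) }
        # Blocks until Enter is pressed: review the tabs, close all but one.
        Read-Host "Opened $($file.Name). Press Enter to continue" | Out-Null
    }
}
```

To use it, create $ie and set $ie.visible = $true as in Step 2, then run: Open-UrlBatches -Browser $ie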
