Tutorial 1: Making a Network Diagram with SocSciBot 4

Overview

This tutorial describes the simplest way to use SocSciBot 4 to create a network diagram of the hyperlinks between a collection of web sites.

**If you have a set of blogs or web sites to crawl, please follow these instructions but use your URLs instead of the italic URLs below.**
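
To make the goal concrete, here is a minimal sketch in ordinary Python (not part of SocSciBot) of what a site link network boils down to: each web site is a node and each hyperlink from one site to another is a directed edge. The three site names are hypothetical placeholders.

```python
# A minimal sketch of a site link network: nodes are web sites and
# directed edges are hyperlinks found from one site to another.
# The three site names below are hypothetical placeholders.
links = {
    "site-a.example": {"site-b.example", "site-c.example"},
    "site-b.example": {"site-a.example", "site-c.example"},
    "site-c.example": {"site-a.example", "site-b.example"},
}

# Print each directed edge; a drawing tool such as SocSciBot's network
# viewer turns a list of edges like this into arrows between nodes.
for source, targets in links.items():
    for target in sorted(targets):
        print(f"{source} -> {target}")
```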

Step 0: Install SocSciBot 4

  1. Go to the SocSciBot web site http://socscibot.wlv.ac.uk/ and, if you agree with the conditions of use, follow the link to download SocSciBot 4. Save SocSciBot 4 to a location where you have plenty of storage space, as it will need room to save data.

Step 1: Crawl your sites

  1. SocSciBot works in two stages: first it crawls a set of web sites, then it analyses the links between them with SocSciBot Tools. Start SocSciBot 4 by double clicking on the file called either SocSciBot4 or SocSciBot4.exe where you saved it on your computer. This should produce a dialog box similar to the one below. This dialog box only appears the first time you start SocSciBot.
  2. Confirm that the folder chosen by SocSciBot 4 to store your data is acceptable by clicking OK. Also enter your correct email address: it will be used to email the webmasters of any sites that you crawl. This is both ethical practice and may save you from getting into trouble if a webmaster is unhappy with you crawling their site - they can email you directly instead of emailing your boss or network manager. You can also enter a message to be included in the email giving the purpose of the crawl; you may wish to include the URL of a page with additional information about your project. Finally, answer any questions about the location of Microsoft Excel and Pajek - you can say No to both of these.
  3. Enter test as the name of the project at the bottom of the next dialog box, Wizard Step 1, and then click on the start new project button. All crawls are grouped together into projects. This allows you to have different named groups of crawls which are analysed separately.
  4. In the Wizard Step 2 dialog box tick Download multiple sites/URLs in one combined crawl and click the Crawl Site with SocSciBot button.
  5. You will see the main multiple crawls screen. Check the Crawl web sites to a maximum depth option.
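
SocSciBot performs the crawling for you, but if it helps to picture what crawling web sites to a maximum depth involves, the sketch below is a rough, simplified illustration written in standard-library Python - it is not SocSciBot's actual code. Starting from each home page, it follows links within the same site up to a fixed depth and records any links that point to the other sites in the list. The seed URLs and the depth limit are hypothetical.

```python
# A simplified illustration of depth-limited crawling, not SocSciBot's code.
# It fetches pages within one site up to MAX_DEPTH and records links that
# point to the other sites in the (hypothetical) seed list.
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

SEEDS = ["http://site-a.example/", "http://site-b.example/"]  # hypothetical
MAX_DEPTH = 2

class LinkParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start, other_hosts):
    """Breadth-first crawl of one site; return links found to other sites."""
    start_host = urlparse(start).netloc
    seen, frontier, outlinks = {start}, [(start, 0)], set()
    while frontier:
        url, depth = frontier.pop(0)
        try:
            html = urlopen(url, timeout=10).read().decode("utf-8", "replace")
        except OSError:
            continue  # skip pages that cannot be fetched
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)
            host = urlparse(absolute).netloc
            if host in other_hosts:
                outlinks.add((start_host, host))  # link between two sites
            elif host == start_host and absolute not in seen and depth < MAX_DEPTH:
                seen.add(absolute)
                frontier.append((absolute, depth + 1))  # crawl deeper
    return outlinks

hosts = {urlparse(s).netloc for s in SEEDS}
for seed in SEEDS:
    for edge in crawl(seed, hosts - {urlparse(seed).netloc}):
        print(edge)
```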

Step 2: Create the network diagram

  1. Start up SocSciBot Tools by double clicking on the SocSciBot4 or SocSciBot4.exe file again. This should take you straight through to Wizard Step 1. Click on test to select this project to analyse, in the same way as before.
  2. Select Analyse LINKS in Project with SocSciBot Tools from the Wizard Step 2 to start the link analysis process.
  3. You will be asked if you want to calculate the link analysis reports for the project (the three web sites crawled). Answer Yes to this question.
  4. Next you will be asked if you want to standardise home page file names in your data. This improves the results by treating different versions of a web site home page as the same page for the analysis (the sketch after this list illustrates the idea). Click Yes standardise home page file names and then wait a few seconds for the reports to be calculated.
  5. When the reports have been calculated you can view them using the tabbed sections in the lower half of the screen. To see the network, click the Show Site Network button.
  6. You should now see the network below (perhaps arranged differently). All these web sites link to each other so there are arrows between them all.
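
Step 4 above mentioned standardising home page file names; the sketch below illustrates the general idea of this kind of normalisation - several URL variants that usually refer to the same home page are mapped to a single form. The exact rules SocSciBot applies may differ, and the URLs are hypothetical.

```python
# A rough illustration of home page file name standardisation: several URL
# variants that usually refer to the same home page are mapped to one form.
# The exact rules SocSciBot applies may differ; the URLs are hypothetical.
from urllib.parse import urlparse, urlunparse

DEFAULT_PAGES = {"index.html", "index.htm", "default.htm", "default.asp"}

def standardise(url):
    parts = urlparse(url)
    path = parts.path
    # Treat "", "/" and "/index.html"-style paths as the same home page.
    if path.rstrip("/") == "" or path.rsplit("/", 1)[-1].lower() in DEFAULT_PAGES:
        path = "/"
    return urlunparse((parts.scheme, parts.netloc.lower(), path, "", "", ""))

variants = [
    "http://site-a.example",
    "http://site-a.example/",
    "http://site-a.example/index.html",
    "http://SITE-A.example/default.htm",
]
print({standardise(u) for u in variants})  # all collapse to one URL
```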

Rearranging the network: You can move the nodes around to rearrange the network, or right click on a node to get a list of properties that you can change. Please experiment with the right click menu and the other menu options to see how they work. For large networks, try the Automatic option in the Layout menu. More information about the network drawing tool is available separately.
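
If you are curious about what an automatic layout does, the sketch below uses the networkx and matplotlib libraries (an assumption - they are not part of SocSciBot) to position a small fully connected network with a force-directed spring layout, which is similar in spirit to the automatic layout options found in network drawing tools. The site names are hypothetical.

```python
# A sketch of automatic (force-directed) layout using networkx/matplotlib.
# This is not SocSciBot's drawing tool; it only illustrates the idea of
# letting an algorithm position the nodes.
# Requires: pip install networkx matplotlib
import networkx as nx
import matplotlib.pyplot as plt

# Three hypothetical sites that all link to each other, as in the tutorial.
sites = ["site-a.example", "site-b.example", "site-c.example"]
G = nx.DiGraph((a, b) for a in sites for b in sites if a != b)

pos = nx.spring_layout(G, seed=1)  # force-directed node positions
nx.draw(G, pos, with_labels=True, node_color="lightblue", arrows=True)
plt.show()
```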

THE END

You have now finished! Try the above again for your own set of web sites or continue below for more information (optional).

Extra information about the web site network

Notes

The steps of this tutorial apply equally to small and large projects. The only difference is that for a large project it may take a significant time for the site crawls and for SocSciBot Tools and Cyclist to process the data. Extra information about features specific to large projects is available separately.