Locating and Evaluating Information on the Internet


   Tom O'Haver (toh@umd.edu) and Mary O'Haver (ohaverma@verizon.net)           
http://www.wam.umd.edu/~toh http://www.wam.umd.edu/~toh/mary.html
August, 2005

Most of the material that is published on the Internet is available electronically in its entirety. There are several forms-based "search engines" that can perform keyword searches of the whole text of all of this material globally. One of the best search engines is Google (http://www.google.com/); good alternatives are Altavista (http://www.altavista.com/) and All the Web (http://www.alltheweb.com). All of these search engines return a ranked list of "hot links"; simply click on the items that you want to view. Note that, because these search engines perform whole-text searches, they can potentially return a very large number of hits, including many unimportant ones, so you should select your keywords carefully.

All search engines have a rectangular box where you type in one or more keywords that describe the topic or item you wish to find. You then click on the "search" button. After a few seconds, the search engine will display a "hit" list giving the title, Web address ("URL"), and short excerpt from the page that it found. Each hit has a blue-colored underlined "hyperlink" to the on-line source of that information. If the item sounds useful, click on the hyperlink to go to that source. To return to the list of hits, click on the Back button at the top left of the Web browser window.

It's important to realize that Web search engines can't find everything that is on the Internet. That's because some Web pages are subscriber-only and require a log-in and password, some Web servers are private and are blocked from outside access, and some information is contained in databases that can only be searched by specific search engines for that database. Search engines differ somewhat in what they cover (for example, Google searches PDF files but some other search engines do not), so be sure to try more than one engine.

Spelling counts. When typing in search terms in a search engine, watch out for spelling. Small spelling differences can make a big difference. For example, searching on "apple" yields mostly information about Apple computers, whereas searching on "apples" yields mostly information about apples that grow on trees. (Similarly, "macintosh" yields hits on the Macintosh computer and "mcintosh" yields hits on the apple variety). Search engines will dutifully search for misspelled words (you will get hits on pages that also misspelled the word the same way!). Google tries to help by placing a "Did you mean:______" at the top of the search results if it suspects that your search term is misspelled.

What if I get too many hits? A common problem in using Web search engines is that you get too many hits. Since you are probably only going to be able to look at just the first few pages of search results, you run the risk of missing something good that might appear on, say, the 12th page of hits. A related problem is that many of the hits may not really be what you wanted after all. For example, say you wanted to find science lesson plans for a first-grade class. Here are several ways you might perform that search.

Narrowing down a search on Google

Search terms | Number of hits on January 15, 2004 | What it finds
science lesson | 2,730,000 | any pages containing BOTH of the words "science" and "lesson", together or separately
science lesson plan "first grade" | 25,700 | only pages containing ALL of the words "science", "lesson", "plan" and the phrase "first grade"
science lesson plan "1st grade" | 6,930 | only pages containing ALL of the words "science", "lesson", "plan" and the phrase "1st grade"
science lesson plan "1st grade" OR "first grade" | 30,100 | only pages containing ALL of the words "science", "lesson", "plan" and either the phrase "1st grade" or "first grade"

As you can see, the more specific you are in your selection of search terms, the fewer and more relevant the hits will be. Including "first grade" really narrows down the search dramatically and focuses more on what you want. But not everyone writes "first grade" - some people might write "1st grade" - so you have to try both alternatives. Even then, there is no guarantee that all the search hits will actually be science lesson plans for a first-grade class; they will simply be Web sites that contain the words "science", "lesson", "plan" and either the phrase "1st grade" or "first grade". The reality is that no single search will return ALL the first-grade science lesson plans on the Internet, and nothing else, simply because those pages do not all contain words and phrases in common that are not used by any other pages. Remember: search engines look for words and phrases, not concepts.
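The point that search engines match words and phrases, not concepts, can be seen in a toy model. The sketch below is only an illustration under simplifying assumptions: the matches and search functions and the five page texts are invented for this example, and real search engines are far more elaborate.

```python
import re

def matches(page_text, words=(), phrases=(), any_of=()):
    """Return True if a page satisfies a simplified keyword query.

    words   -- every one of these words must occur somewhere on the page
    phrases -- every one of these exact phrases must occur
    any_of  -- at least one of these phrases must occur (the OR case)
    """
    text = page_text.lower()
    page_words = set(re.findall(r"[a-z0-9]+", text))
    if not all(w.lower() in page_words for w in words):
        return False
    if not all(p.lower() in text for p in phrases):
        return False
    if any_of and not any(p.lower() in text for p in any_of):
        return False
    return True

# Hypothetical page texts, invented for this illustration.
PAGES = {
    "A": "A science lesson plan for first grade: sorting leaves.",
    "B": "Science lesson plan, 1st grade, magnets.",
    "C": "Our first grade class has a new science teacher.",
    "D": "Hands-on activities about plants for six-year-olds.",
    "E": "My plan: skip the science lesson, because first grade was more fun.",
}

def search(**query):
    """Return the names of all pages that match the query, in sorted order."""
    return sorted(name for name, text in PAGES.items() if matches(text, **query))

print(search(words=["science", "lesson"]))
# ['A', 'B', 'E']
print(search(words=["science", "lesson", "plan"], phrases=["first grade"]))
# ['A', 'E']
print(search(words=["science", "lesson", "plan"], any_of=["1st grade", "first grade"]))
# ['A', 'B', 'E']
```

Page D is a perfectly good first-grade science activity, but it is never found because it uses none of the search words. Page E is found every time, even though it is not a lesson plan at all.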

Other search engines may behave slightly differently and may return different hits; so it's a good idea to try more than one search engine if you are having trouble finding what you want.

Another way of reducing the number of unwanted hits is to use a minus sign (-) in front of a search term to exclude any pages that have that word. For example, if you are looking for information on apples (the fruit), but you want to avoid pages on Apple computers, searching on apples -computer will find pages that have the word "apples" but do not have the word "computer".
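If you build search strings in a program rather than typing them, the quoting, OR, and minus-sign conventions described above can be assembled mechanically. This is a minimal sketch: build_query and search_url are hypothetical helper names, and the http://www.google.com/search?q=... address form is an assumption about how the search page accepts its terms, not something stated in this document.

```python
from urllib.parse import urlencode

def build_query(words=(), phrases=(), any_of=(), exclude=()):
    """Assemble a search string from plain words, quoted phrases,
    OR-alternatives, and words to exclude with a minus sign."""
    parts = list(words)
    parts += ['"%s"' % p for p in phrases]
    if any_of:
        parts.append(" OR ".join('"%s"' % p for p in any_of))
    parts += ["-" + w for w in exclude]
    return " ".join(parts)

def search_url(query, base="http://www.google.com/search"):
    """Encode the search string as a URL (the base address is an assumption)."""
    return base + "?" + urlencode({"q": query})

q1 = build_query(words=["apples"], exclude=["computer"])
print(q1)              # apples -computer
print(search_url(q1))  # http://www.google.com/search?q=apples+-computer

q2 = build_query(words=["science", "lesson", "plan"],
                 any_of=["1st grade", "first grade"])
print(q2)              # science lesson plan "1st grade" OR "first grade"
```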

When choosing search terms, try to use language that is common in the field you are working in. For example, if you are looking for worksheets on various subjects for your students to work on, it's useful to know that many teacher Web sites call them "printables". Searching on "worksheets" gives hits to all sorts of worksheets; searching on "printables" will yield fewer hits, but almost all to school-type material.

If you are too specific, you may miss some useful things because they didn't include the exact words that you used in your search terms. For example, if you are looking for a lesson plan on fractal mathematics for middle school students, you'd miss a good one (at http://math.rice.edu/~lanius/frac/) if you include "lesson plan" in your search terms, because this page doesn't use the term "lesson plan". Searching on fractal "middle school" will find it easily (it's the first one listed). (Searching on "fractal" alone would yield too many hits on research articles and other specialized material.)

Objective: find lesson plans on fractal math for middle school students on Google.

Search terms | Number of pages found on January 15, 2004
fractal | 1,190,000
fractal "lesson plan" | 2,380
fractal "middle school" | 7,830

The moral: try to think what words and phrases a teacher is likely to use on the kind of pages that you are looking for.

What if I get too few hits? You may have misspelled or mistyped one of the search terms. Check carefully and try alternative spellings. Don't use too restrictive a search term. For example, if you are interested in information about the clay storyteller dolls of the type made by the Pueblo Indians, and you search on the phrase "pueblo clay storyteller dolls", you will get only 1 hit (to this document!), because no one uses this exact phrase. But if you try to search on storyteller alone, you'll get many thousands of hits, most to other meanings of the word "storyteller". Rather, try searching on pueblo clay storyteller dolls or the phrases "storyteller dolls" or "storyteller figures". These searches will give more relevant hits. Remember: searches are fast and free, so try several searches using different combinations and alternatives of search terms, rather than settling for the first thing you come up with.

Objective: find information about Pueblo clay storyteller dolls on Google.

Search terms | Number of pages found on January 15, 2004
"pueblo clay storyteller dolls" | 1
storyteller | 719,000
pueblo clay storyteller dolls | 7,590
"storyteller dolls" | 1,130
"storyteller figures" | 224

If I get lots of hits, which ones are displayed at the top of the list? What determines the order of listing? If you perform a search and get thousands of hits, most search engines will show only the first 10 hits on the first page, with the other hits on subsequent pages. You are much more likely to look at hits on the first page or on the first few pages. So the "ranking criteria" are actually just as important in practice as the search terms themselves, especially when the total number of hits is very large.

Search engines are now pretty sophisticated in their ranking criteria, so that even if you do get thousands of hits, you may find that the first page or two of hits really do represent the "best" material. For example, Google gives a higher rank (displays nearer the top of the list) to (a) long pages that have lots of full sentences; (b) pages with lots of links to pages that have related content; (c) pages that have many other high-ranked pages linking to them; and (d) pages that have the search terms in the title, on the first page, or in the "anchor" text associated with hyperlinks to that page from other pages. Does that mean that the "best" hits for your purposes will be at the top of the list? That is the hope of the search engine designers, but obviously the search engine cannot read your mind and does not know your real needs. Sometimes it works pretty well and sometimes not so well. Sometimes you just have to be clever with your search terms.
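To get a feel for how ranking criteria change the order of a hit list, here is a deliberately crude sketch. It is not Google's method: the toy_score function, its weights (5, 1, and 2), and the three pages are all invented for illustration, and it only imitates two of the ideas above (search terms in the title count for more, and links from other pages help).

```python
import re

def toy_score(page, terms):
    """Score a page with invented weights: title matches count 5,
    body matches count 1, and each inbound link counts 2."""
    title_words = re.findall(r"[a-z0-9]+", page["title"].lower())
    body_words = re.findall(r"[a-z0-9]+", page["body"].lower())
    score = 0
    for term in terms:
        t = term.lower()
        score += 5 * title_words.count(t)
        score += 1 * body_words.count(t)
    score += 2 * page["inbound_links"]
    return score

# Hypothetical pages, invented for this illustration.
pages = [
    {"name": "P1", "title": "Fractal lesson for middle school",
     "body": "A fractal is a shape that repeats.", "inbound_links": 3},
    {"name": "P2", "title": "My holiday photos",
     "body": "This fern looks like a fractal.", "inbound_links": 0},
    {"name": "P3", "title": "Fractal geometry",
     "body": "Fractal fractal fractal.", "inbound_links": 1},
]

ranked = sorted(pages, key=lambda p: toy_score(p, ["fractal"]), reverse=True)
for p in ranked:
    print(p["name"], toy_score(p, ["fractal"]))
# P1 12
# P3 10
# P2 1
```

All three pages are "hits" for the word fractal, but the order in which they appear depends entirely on the scoring rule, which is why different search engines can list the same pages in different orders.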

It's important to realize that different search engines use somewhat different search and ranking criteria, so that you won't always get the same search results when using the same search terms on different search engines.

Sponsored links. Search engines are free to use. So how do they survive economically? Most are supported by advertisers. Some search engines sell placement in their search results; that is, a company can pay to have their Web sites come up higher in the search results for certain search terms. In most cases (e.g., Altavista and All the Web), these sponsored links are labeled. But Google does even better - its sponsored links are not even included in the search results, but are listed on the right in a sidebar.

Search for the content, not the title. If you're having trouble finding something because you keep getting irrelevant hits that simply mention your search terms, but are not the actual thing you want, try searching for words or phrases that are likely to be found within the desired document, rather than simply searching on its name or subject. Remember: Web search engines perform "whole text" searches - that is, they search the entire contents of Web pages and documents, not just their titles.

Searching for images, sounds, and video. The Google image search is a fun way to search for images. From the Google main page, click on the Image tab, type in keywords, and click Search. The hit results are illustrated with thumbnail previews (little postage-stamp sized images) of each image it finds. Click on a thumbnail to go to the page that has that image. Also try the Altavista image search and AlltheWeb image search.

If you're looking for sounds, try the Altavista audio search and AlltheWeb audio search. Or try a regular Google search with wav OR mp3 added to your search terms (two popular audio file formats that are often used for sound files on the Web).

If you're looking for video clips, try the Altavista video search and AlltheWeb video search. Or try a regular Google search, adding mpg OR avi OR mov to your search terms (three popular video file formats that are often used for video files on the Web).

Capitalize on other people's efforts. Here's a tactic that can save you time and can often locate material that you might not find on your own: When you find a page on a topic you're interested in, see if it has links to other pages on that topic. Very often the author of a page will have already spent a good bit of time locating and sorting through other sources of information, possibly including some sources that you might not have found by yourself.

Using the Web with students. There are several search engines that are specially developed for students. These include Yahooligans (for students age 7-12), KidsClick! (which gives the reading level of each site listed), AOL@SCHOOL, Ask Jeeves for Kids, and Education World. Many general-purpose search engines have a "filter" that eliminates hits on adult content; Google's SafeSearch filter can be set to three levels of strictness: click on Preferences, then scroll to SafeSearch filtering. Dogpile also has an adult filter that can be turned off and on. Altavista has a family filter with password protection.

Categorized indices of World Wide Web sites. An alternative to keyword-based searching is to use the categorized indices of Web sites that many groups have constructed. There are many such lists of school curriculum-related materials, such as Blue Web'n (http://www.kn.pacbell.com/wired/bluewebn/), Sites for Teachers (http://www.sitesforteachers.com/), Cool Teaching Lessons and Units (http://www.coollessons.org/), and Internet Sites Which Support Instruction. The Maryland Collaborative for Teacher Preparation has collected an extensive index of annotated science and math education sites, available at http://www.towson.edu/csme/mctp/Technology/MCTP_WWW_Bookmarks.html. Many search engines also have a categorized index of selected sites, for example: Yahoo (http://www.yahoo.com/), Altavista (http://www.altavista.com/dir/default), and Google (http://directory.google.com/). These indices contain sites that have been hand selected and are often annotated or reviewed by a knowledgeable person or committee. Compared to the results of a general-purpose search engine, the indices are less comprehensive and up-to-date, but they have the advantage of not having so many poor-quality or irrelevant sites.

Some other useful information sources | Where to find it
New York Times Newsroom Homepage | http://tech.nytimes.com/top/news/technology/cybertimesnavigator/index.html/
Bibliographic citations of books (Library of Congress) | http://lcweb.loc.gov/z3950/gateway.html
Amazon.com's "Search Inside the Book" | http://www.amazon.com/books
Bibliographic citations and summaries from 26,473 publications | http://www.ingenta.com/
U.S. patents and trademarks | http://www.uspto.gov/
Categorized index of World Wide Web sites | http://www.yahoo.com/
Online encyclopedia articles | http://www.britannica.com/ or
Articles in the Washington Post newspaper | http://www.washingtonpost.com/

Evaluating the information that you find. So you have found a site or document that seems to be useful? How reliable is that information likely to be? Just because it's on the 'net does not mean it's correct or reliable. Unfortunately, some of the information on the Web is inaccurate, biased, out-of-date, shallow, or inappropriate for academic use. On the other hand, there are also vast amounts of invaluable information, books, data collections, encyclopedias, and libraries, some of it available in no other form. Lida Larsen, an information specialist at the University of Maryland, recommends that the following factors be kept in mind when using Web resources: scope, authority and bias, accuracy, timeliness, permanence, value-added features, and presentation.

It certainly helps to know who wrote or created that information and what institution they represent. If no one is willing to put their name to the document, it arouses suspicion. On the other hand, if an official government agency or accredited education institution publishes information and clearly labels it as an official publication, then it is likely to be more reliable. But every organization has its own goals and agendas, and that may create a bias (which may be overt or subtle) in its point of view. Obviously a commercial site that has something to sell can be expected to promote its products or services over those of its competitors. An independent third-party evaluator may be fairer. Religious and political organizations usually have strongly held points of view. Most documents do not state plainly their real goals and hidden agendas - it's up to you to figure that out.

Ideally, every document should carry an author, affiliation, and a date. For example, the document you are now reading is clearly labeled with the author's name and affiliation, and the date. There are also links to the author's personal home page and to his email address.

But many pages are not so well labeled. For example, go to http://curry.edschool.virginia.edu/curry/class/Museums/Teacher_Guide/Science/home.html. I came across this page in running a search for "science lesson plans". You can see why the search engine hit on this page, but you can't tell anything about who created it, or when, or who they work for. However, in this case there is a link, labeled "HOME", that takes you back to the "home" page for this site, and this home page does tell you which organization developed this site and when. (You could also probably guess from the URL that this came from the School of Education at the University of Virginia.)

But suppose you find a page that has no link back to a home page. For example, look at http://www.col-ed.org/cur/sci/sci01.txt . You can see that this is a lesson plan for a pinhole camera, written by Patricia Willett. But there is no link on this page to a home page that might take you to other lesson plans like this (if there are more) or that might tell you something about the organization sponsoring this information. What can you do in that case? Here is a trick that will often locate the home page for any page that you find on the Internet: "back-up" through the URL (by deleting from the end of the URL to the next / forward slash) until connection occurs, then look at the page for clues. This technique is also useful when you can't tell who is responsible for a page. For example:

  1. Go up to the address field at the top of the browser window, where the http:// address of the page is displayed.
  2. Edit the address there by selecting and deleting the last part of the address, the part after the last slash (sci01.txt). That is, drag the mouse over sci01.txt and press the delete key. This leaves http://www.col-ed.org/cur/sci/ in the location field.
  3. Press the RETURN key. This activates the truncated address, which in this particular case is a blank page (this sometimes happens - don't give up).
  4. Keep on deleting the next segment of the address. When you get to http://www.col-ed.org/cur/, you'll see a page that has links to various lesson plans and which has a link to the home page at the bottom. Click on the link to the home page; from the home page you can see that this site comes from the Columbia Education Center in Portland, OR.
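
The same "backing up" steps can be written out as a short program that lists every shorter address you could try, in order. This is only a sketch: parent_urls is a hypothetical helper name, and the program does not visit the addresses or check whether they work; you would still paste each one into the browser and look at the page for clues.

```python
from urllib.parse import urlsplit, urlunsplit

def parent_urls(url):
    """Return the addresses obtained by repeatedly deleting the last
    segment of the path, ending with the bare site address."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    results = []
    while segments:
        segments.pop()                      # delete the last segment
        path = "/" + "/".join(segments)
        if segments:
            path += "/"                     # keep the trailing slash
        results.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return results

candidates = parent_urls("http://www.col-ed.org/cur/sci/sci01.txt")
for candidate in candidates:
    print(candidate)
# http://www.col-ed.org/cur/sci/
# http://www.col-ed.org/cur/
# http://www.col-ed.org/
```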

What to do with the links that you find? How can you save the best links that you find for later use on other computers? If you simply use the "Save bookmark" or "Add to favorites" commands, the link will be saved only on the computer you are using. But what if you want to "take the links with you"?

Here's a simple way to save links. Stick in a floppy disk and open it up (My Computer --> 3 1/2 floppy (A:)). Every time you find a good site, simply click on the little icon to the left of the Address (Location) bar at the top of the Web browser's screen and drag it over to the floppy disk window. Presto - it creates an Internet shortcut with the address and the title of that Web page. Later, you can transfer that diskette to any other Internet-connected computer and simply double-click on the shortcut icon to launch a Web browser and go to that page.
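An Internet shortcut of this kind is just a small text file. The sketch below writes one by hand, assuming the common Windows layout (a file ending in .url that contains an [InternetShortcut] line followed by a URL= line); write_shortcut is a hypothetical helper name, and a temporary folder stands in for the diskette.

```python
import os
import tempfile

def write_shortcut(folder, title, url):
    """Write a Windows-style Internet shortcut named after the page title."""
    # Replace characters that are not safe in a file name with underscores.
    safe = "".join(c if c.isalnum() or c in " -_" else "_" for c in title).strip()
    path = os.path.join(folder, safe + ".url")
    with open(path, "w", encoding="utf-8") as f:
        f.write("[InternetShortcut]\n")
        f.write("URL=" + url + "\n")
    return path

folder = tempfile.mkdtemp()   # stand-in for the diskette folder
path = write_shortcut(folder, "Fractals: a lesson",
                      "http://math.rice.edu/~lanius/frac/")
with open(path, encoding="utf-8") as f:
    content = f.read()
print(os.path.basename(path))  # Fractals_ a lesson.url
print(content)
```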

Here's another way: open up a blank document using Notepad (Windows) or SimpleText (Macintosh). Now, every time you find a good site, select (drag over) the address in the Address (Location) bar, Copy it, and Paste it into the text document. When you are finished, save the document onto a floppy diskette. You can open up that document on any other computer, Copy an address, and Paste it into the Web browser's Address (Location) bar at the top of the screen and press Enter.
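If you collect addresses with a program instead of by copy-and-paste, the same plain-text list can be kept with a few lines of code. This is a minimal sketch: save_link, load_links, and the file name links.txt are invented for illustration, and a temporary folder stands in for the diskette.

```python
import os
import tempfile

def save_link(path, url):
    """Append one address to the text file, one address per line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(url.strip() + "\n")

def load_links(path):
    """Read the addresses back, skipping any blank lines."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

links_file = os.path.join(tempfile.mkdtemp(), "links.txt")
save_link(links_file, "http://math.rice.edu/~lanius/frac/")
save_link(links_file, "http://www.col-ed.org/cur/")
links = load_links(links_file)
print(links)
# ['http://math.rice.edu/~lanius/frac/', 'http://www.col-ed.org/cur/']
```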

What's the best way to distribute sets of links to students? There are several Web sites that allow any teacher to set up a Web page containing links that students can access from any Internet-connected computer at school or at home. You don't have to photocopy handouts and your students won't have to type in those tedious and cryptic URLs. Go to TeacherWeb (http://teacherweb.com) or SchoolNotes (http://www.schoolnotes.com/).

Some Useful Resources

Basic Internet Searching Online Tutorial

Advanced Internet Searching Online Tutorial

Choose the Best Search for Your Information Needs

Search Tools for Students

The Five-Step Search Strategy

Teaching Searching Strategies

Search Engine Showdown: The Users' Guide to Web Searching

Yahooligans! Teachers' Guide

WWW Search Strategies

Citing Electronic Resources