How do I find bad links in my group web page?
Christopher Brooks, 24 Sep 2001. Last updated: 3 Sep 2003
The search engine runs every night and generates a list of bad links at
http://www.gigascale.org/gsrc/private/9.html, which only GSRC members can view.
You can also use the wget command, but you will need to set it up to use
the cookie file from Mozilla:
- Install wget.
- Log in to the website using Mozilla, and then exit Mozilla.
- Find your cookie file. Mine was at
  c:/Documents and Settings/cxh/Application Data/Mozilla/Profiles/default/lwhpscha.slt/cookies.txt
- Copy the cookies.txt file to a place with a shorter name.
- Run wget:
  wget -r --load-cookies cookies.txt -np http://www.gigascale.org/yourGroup
  This will produce a directory called www.gigascale.org that contains the
  contents of yourGroup.
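- Look for "Not Found" in the output. wget typically reports a missing page
  as "ERROR 404: Not Found".
- If a file was not found, search the downloaded files for references to it.
  For example, if foo.htm was not found, run
  find . -name "*.htm" -exec grep foo.htm {} +
  Passing the files to grep through find, rather than through a list in a
  temporary file, also works when a path contains spaces.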
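The last two steps can be scripted. The following is only a sketch: the function names list_missing_urls and find_referrers and the log file name wget.log are made up for illustration, and the log parsing assumes that wget starts each request with a line of the form "--<timestamp>--  <URL>" and reports failures on a later line containing "ERROR 404". Check this against the output of your own wget version before relying on it.

```shell
#!/bin/sh
# Sketch only: function names and wget.log are illustrative assumptions.

# Print the URL of every request that wget logged as a 404.
# Remembers the URL from the most recent "--<timestamp>--  <URL>" line and
# prints it when a later line contains "ERROR 404" (assumed log format).
list_missing_urls() {
    awk '/^--/ { url = $NF } /ERROR 404/ { print url }' "$1"
}

# List the .htm files under directory $1 that mention the file name $2.
# grep -F treats the name as a literal string (so "." is not a wildcard),
# and -l prints only the names of the files that contain it.
find_referrers() {
    find "$1" -name '*.htm' -exec grep -l -F -- "$2" {} +
}

# Example use, after logging in with Mozilla and copying cookies.txt:
#   wget -r -np --load-cookies cookies.txt -o wget.log http://www.gigascale.org/yourGroup
#   list_missing_urls wget.log
#   find_referrers www.gigascale.org foo.htm
```

The -o option writes wget's messages to a log file instead of the terminal, which makes the "Not Found" entries easier to search afterwards.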