Thank you, Andrew. This addresses the "privacy" issue to some extent (along with
dynamic IPs?). As a footnote, I understand that some web pages glean data about
visitors more 'actively' than the 'archived data' in this description of the
present records of LANL and similar servers (e.g., I've had a web page give me
back my email address in 'real time' despite my dynamic IP). In that case, such a
'solution' could provide more tools, along with strong policies, to enable both
privacy and security.
But the central question to me was:  Is there potential value-added to authors
from enhanced "services"?  E.g., an option for downloaders to be identified to
the authors, and/or to be included in subsequent discussion or follow-up messages
from the authors? (This could even be 'hidden', as a participant may select in a
mailing list, subject to posting to the broadcast list.) Or an enhanced
capability for a paper to be linked in "cites to" and "cited by" lists, which
would stimulate self-archiving.
If these and other value-added characteristics are "enhanced" in self-archive
lists, they may provide additional impetus for authors to prefer to self-archive,
as advocated by Harnad and others.
Regards, Jim Muckerheide
=======================
Andrew Odlyzko wrote:
> A quick response to the messages from Jim Muckerheide and Fytton Rowland:
>
> All servers that I am aware of do maintain a record of download addresses.
> This does present serious privacy issues, and as a result there are very
> few servers that make their logs widely available.
>
> To answer another part of the question, server logs would be of very limited
> use in producing "discussion lists" and the like.  The reason is that these
> logs are not as informative as one would like for such purposes (which is
> a relief to many privacy advocates and a hindrance to direct marketers and
> the like).  What server logs do is record the IP address of the machine
> that requested a page, and this address looks like 135.207.225.12.  One
> can then use "reverse DNS lookup" to try to find out what machine that is.
> Here is where the serious problems start.  Quite a few such lookups fail,
> and no information is generated about the IP address.  (One can then try
> to do other things, such as examine registries of autonomous systems, etc.,
> but even that is of limited use, and let's skip it.)  When the lookup
> succeeds, you get information that varies in its utility.  Some of the
> addresses will be of the form
>
>    john-smith-pc_at_physics.harvard.edu
>
> which suggests the request came from John Smith's PC in the Harvard Physics
> Dept.  (But even that is not certain, since this PC may have been passed on
> to a student of John Smith.)  Others, such as
>
>    156.cambridge-06-07rs.ma.dial-access.att.net
>
> will tell you the request came from a dial-in customer of the AT&T WorldNet
> ISP business, and that the modem bank is located in Cambridge, Mass.
> It won't tell you who was using that PC, though.  (For that you would need
> to access the WorldNet logs, which are carefully guarded for privacy reasons.)
> The next time you see that address, a different person might be using it.
> Next, many requests come from addresses that look like
>
>    proxy1.questnet.net.au
>
> which are proxies that hide any number of users behind them.  None of these
> entries produce valid email addresses.
>
> One of the complications in studying server logs is that you can never be
> certain you have seen all accesses to a page.  For example, if many people
> are going through proxy1.questnet.net.au to access your pages, this proxy
> will almost certainly cache (store a local copy) at least some of those
> pages, and then deliver them to requesters without leaving any trace
> on your server.
>
> All these technical difficulties make it hard to evaluate usage in a
> meaningful way.
>
> Andrew Odlyzko
>
> ************************************************************************
> Andrew Odlyzko                                      amo_at_research.att.com
> AT&T Labs - Research                                voice:  973-360-8410
> http://www.research.att.com/~amo                    fax:    973-360-8178
> ************************************************************************
Received on Wed Feb 10 1999 - 19:17:43 GMT
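
The reverse DNS step described in the quoted message can be sketched roughly as
follows. This is a minimal illustration, not anything the servers mentioned are
known to run: the function names and the keyword rules for spotting proxies and
dial-up pools are assumptions made for the example, and real host names follow
no such fixed convention.

```python
import socket
from typing import Optional


def reverse_lookup(ip: str) -> Optional[str]:
    """Return the host name for an IP address, or None if the lookup fails."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def classify_hostname(hostname: Optional[str]) -> str:
    """Roughly bucket a host name; the keyword rules are illustrative guesses."""
    if hostname is None:
        # Many lookups fail and yield no information at all.
        return "unresolved"
    labels = hostname.lower().split(".")
    if any(label.startswith("proxy") for label in labels):
        # A proxy may hide any number of users behind one address.
        return "proxy"
    if any("dial" in label for label in labels):
        # A dial-in pool address may belong to a different person each time.
        return "dial-up pool"
    # Even a descriptive machine name identifies a machine, not a person.
    return "named host"
```

For a log entry one might call classify_hostname(reverse_lookup(ip)). None of the
categories yields an email address or a reliable identity, which is the point
made in the quoted message.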