PICList Thread
'[OT] Google rankings / was / Ham mailing list reco'
2007\05\08@084148 by Jim Franklin

I disagree with that, do a gurgle for  

Horsebox Conversion

(on the .co.uk site at least - never checked the .com)

and my site comes top, (Jim and Beth's horsebox conversion) and has done so
for a couple of years.
I have never paid to list it, and never submitted it to any engines. Simply
gets there by magic or whatever. (actually, i believe it is because the site
links to a number of other places and gets spidered, but magic sounds
better).

Jim


On Tue, 8 May 2007 05:58:40 +0000 (UTC), Peter P. wrote
{Quote hidden}

> --

2007\05\08@105047 by Harold Hallikainen

I don't think Google (like some others) accepts payment for search
ranking. They do auction off ad positions, but these paid ads are clearly
marked as "sponsored links." I think search position is largely based on
the number of relevant links to a page (not from link farms). For example,
my page on FCC rules comes in number 4 out of 1,810,000 (see
http://www.google.com/search?q=fcc+rules ).

Harold


{Quote hidden}

>> --

2007\05\08@121729 by Peter P.

Harold Hallikainen <harold <at> hallikainen.org> writes:

> I don't think Google (like some others) accepts payment for search
> ranking. They do auction off ad positions, but these paid ads are clearly
> marked as "sponsored links." I think search position is largely based on
...

> > I disagree with that, do a gurgle for
> >
> > Horsebox Conversion
...

I think that there are a lot of exceptions that confirm the rule. The rule is
that getting a site with 'usual' content into a reasonable position is nearly
impossible without 'magic' involving Ben Franklin effigies changing hands. There
is nothing sinister about it (but I'd still want to know why those ieee.org
'sell' pages get ranked so high without having content accessible to usual
browsers w/o paying).

This is not Google's work, it is the work of optimization done by the other
100-150 punters who are ahead. Whatever algorithm Google uses for ranking, as
long as it is not perfectly random, punters will keep trying things until their
pages percolate upwards in the rankings. That's the point. And this tweaking is
what can put one in 'Google hell' (it's really called that) if it goes too far.
So there is a non-linear invisible trapdoor as one advances in site tweaks. The
people who know how far to go cost money (apart from those who go too far and
put their client's sites in Google Hell for a while - I just read about that
recently). So, unless one is well-linked to (from sites already indexed - i.e.
not from pages not yet indexed), it is very hard to 'percolate' up. The fact
that money changes hands is undoubtable. What is not clear is where the dollar
stops. Since Google does not directly sell ranking, the dollar seems to stop at
the optimizers. But it does get out of the customer's pocket first. So, one can
say that it costs money to get ranked unless one has some really desirable
commodity.

Peter P.


2007\05\08@133036 by wouter van ooijen

> I think that there are a lot of exceptions that confirm the
> rule. The rule is that getting a site with 'usual' content
> into a reasonable position is nearly impossible without
> 'magic' involving Ben Franklin effigies changing hands.

A 'nearly' claim is difficult to debunk, but my site (http://www.voti.nl) seems
to end up quite satisfactory in google searches. I did not pay anyone
for that.

IMHO there is only one real way to end up high on google: (relevant)
content.

> So, one can say that it costs money to get ranked
> unless one has some really desirable commodity.

Correct, and rightly so. That commodity is content.

Wouter van Ooijen

-- -------------------------------------------
Van Ooijen Technische Informatica: http://www.voti.nl
consultancy, development, PICmicro products
docent Hogeschool van Utrecht: http://www.voti.nl/hvu



2007\05\08@144159 by Gerhard Fiedler

wouter van ooijen wrote:

> IMHO there is only one real way to end up high on google: (relevant)
> content.

There's still the question how ieee.org et al get their high ranks, how
they get in the listing at all. I don't see any (public) content at their
links. I don't know what they do, but fact seems to be that I don't get to
parse the content like Google does. I'm not sure why Google would list
content that almost nobody can access without paying. This looks very much
like advertising to me -- and no company advertises for some other company
for free.

At the very least, they need an option to exclude content that's not
public. But I guess they have their reasons why they don't have this
option...

Gerhard

2007\05\08@153542 by Peter P.

wouter van ooijen <wouter <at> voti.nl> writes:

> > I think that there are a lot of exceptions that confirm the
> > rule. The rule is that getting a site with 'usual' content
> > into a reasonable position is nearly impossible without
> > 'magic' involving Ben Franklin effigies changing hands.
>
> A 'nearly' claim is difficult to debunk, but my site (http://www.voti.nl) seems
> to end up quite satisfactory in google searches. I did not pay anyone
> for that.

Nearly means that from about 20 people I interviewed/queried about this over
some time about 3 or 4 said they had no trouble being ranked w/o effort. My own
homepages appeared 'by themselves' over time (I did not bother to register
them), but a commercial site took more than two months to 'move' (and it was the
kind with paid adwords and all that).

> IMHO there is only one real way to end up high on google: (relevant)
> content.

Relevant to what ? To a tech search for parts ? Do you compare the content of
your site with that of Mouser, Newark, National, etc ? Surely you can see that
those sites are more relevant, and for content (as in tech articles), there are a
lot more out there. Having a unique name helps, however.

> > So, one can say that it costs money to get ranked
> > unless one has some really desirable commodity.
>
> Correct, and rightly so. That commodity is content.

Everyone has at least as much content as you have. Do a query on usual terms and
see what comes up in the word rankings (for each word in the search). Having a
lot of links to your page from other places on the web should help. F.ex. I
don't know how Google handles signatures on archived discussion lists which
invariably contain URLs in .sigs . With a regular poster there should be tens of
thousands of individual archive pages, each different, each hosted on Google
proper, and each containing a link to the business URL. Does it matter ? I don't
know. Example: Piclist archive on Google Groups (read only):

 http://groups.google.com/group/piclist_archive?lnk=sg&hl=en

Searching for voti.nl in the search window on that page brings in 218 results!
But the Google score for voti.nl is over 16,000 pages. Do the 218 extras count ?
I have no way to know that.

Anyway this is not a guessing game, there is a system behind it and it's a
system that makes money. So finding excuses for that is imho superfluous. It's
just the way it is.

Google ranks pages by linked-to. IOW, *other* sites actually rank your pages, by
linking to them. And one of the biggest 'optimization' scams that is hard to
beat is to ask people to link to your page and you link to them in return. Users
don't count. Google has no way to know people clicked on a certain result link
in a Google answer page. There is no feedback mechanism built into the links
proper (but some statistical correlation may exist wrt. those who click on the
cache options of links).

So ranking is not by users but by *peers* (and, strangely, users who post on
newsgroups or blogs *become* peers, since they generate 'content'). Other sites,
not visitors or search keys, determine your ranking, by linking to you. That
could also explain why ieee.org etc ranks high: because everyone and their
sisters link to the publications (presumably from other papers which *are*
searchable - usually papers have a reference list at the bottom and if that uses
HTML links it would explain the 'popularity' of those ieee.org $$$pay pages -
but not how Google got to index the content).

I was involved with deploying (and modifying a little) non-commercial search
engines which use 'democratic' searches (i.e. term occurrence count indicates
relevance), namely htdig. Over about 500MB of technical documents, PDFs and
source code, htdig could find what one was looking for ... eventually. It could
be on the 20th result page. Looking for a word that could appear several times
in a technical description, while the real definition was elsewhere was a
particularly bad idea there. So it is not that simple. 'Content' does not work
the way you seem to think it does, there is a lot of 'something else' involved.
Google says as much about it, but this discussion is about the something turning
slowly but inexorably ultra-commercial (with some exceptions). And not all of it
seems to be Google's doing (i.e. page rank 'optimizers' pitch in hard here).
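
A minimal sketch of the 'democratic' term-count ranking described above. This is not htdig's actual code; the function name and the two-document corpus are made up for illustration. It only shows why a page that repeats a word can outrank the page that defines it.

```python
import re
from collections import Counter

def term_count_rank(documents, query):
    """Rank document names by how often the query terms occur in them."""
    terms = [t.lower() for t in query.split()]
    scores = {}
    for name, text in documents.items():
        counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
        scores[name] = sum(counts[t] for t in terms)
    # Highest raw count first; ties are broken by name for a stable order.
    return sorted(scores, key=lambda n: (-scores[n], n))

# Hypothetical mini-corpus: the defining page mentions the term once,
# while a page that merely uses the term repeats it three times.
docs = {
    "definition.txt": "A watchdog is a timer that resets the chip.",
    "app_note.txt": "Clear the watchdog. If the watchdog expires, the watchdog resets.",
}
ranking = term_count_rank(docs, "watchdog")
print(ranking)
```

With raw counts alone, the application note sorts ahead of the definition, which is the failure mode described for technical documents.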

Peter P.


2007\05\08@162343 by wouter van ooijen

> Relevant to what ?

of course: relevant as judged by google. which (by rumours) is based on
links to your site. such links are generated by other sites (=people)
who think that your content is relevant. so in the end: relevant as
judged by public vote.

> Do you
> compare the content of your site with that of Mouser, Newark,
> National, etc ?

Don't tell them, but most of them go to extremes to make their sites
google-unfriendly. so they end low or do not appear at all.

> Having a unique name helps, however.

providing relevant search terms is of course important. AFAIK google
ranks on search terms and body text first, and 'google rank' (= roughly
the number of external links to your site) kicks in only to break ties.
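
The ordering claimed here (text match first, link count only as a tie-breaker) can be sketched as a two-part sort key. This illustrates the AFAIK claim above, not how Google is known to work; the page records and the `rank` helper are hypothetical.

```python
def rank(pages, query_terms):
    """Order pages by text match first; inbound links only break ties."""
    def key(page):
        words = page["text"].lower().split()
        text_score = sum(words.count(t) for t in query_terms)
        # Negate both values so that larger scores sort first.
        return (-text_score, -page["inbound_links"])
    return [p["url"] for p in sorted(pages, key=key)]

# Hypothetical pages: a and b match the query equally well, c matches better.
pages = [
    {"url": "a.example", "text": "pic programmer kit", "inbound_links": 10},
    {"url": "b.example", "text": "pic programmer faq", "inbound_links": 16000},
    {"url": "c.example", "text": "pic pic programmer programmer", "inbound_links": 1},
]
order = rank(pages, ["pic", "programmer"])
print(order)
```

Under this assumption the page with a single inbound link still comes first because its text matches best, and the 16,000 links only decide between the two equal matches.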

>
> > > So, one can say that it costs money to get ranked
> > > unless one has some really desirable commodity.
> >
> > Correct, and rightly so. That commodity is content.
>
> Everyone has at least as much content as you have.

*relevant* content. Personally I don't know that everyone guy, but I
doubt he has as much relevant PIC info on his site as I have.

Wouter van Ooijen

-- -------------------------------------------
Van Ooijen Technische Informatica: http://www.voti.nl
consultancy, development, PICmicro products
docent Hogeschool van Utrecht: http://www.voti.nl/hvu



2007\05\08@162839 by Paul Hutchinson

> -----Original Message-----
> From: spam_OUTpiclist-bouncesTakeThisOuTspammit.edu On Behalf Of Peter P.
> Sent: Tuesday, May 08, 2007 12:12 PM
>
> I think that there are a lot of exceptions that confirm the rule. The rule
> is that getting a site with 'usual' content into a reasonable position is
> nearly impossible without 'magic' involving Ben Franklin effigies changing
> hands.
<snip>

Would you please provide a search term(s) that produces a Google result that
illustrates your point.

Paul

>
> Peter P.

2007\05\08@164957 by Peter P.

wouter van ooijen <wouter <at> voti.nl> writes:

>
> > Relevant to what ?
>
> of course: relevant as judged by google. which (by rumours) is based on
> links to your site. such links are generated by other sites (=people)
> who think that your content is relevant. so in the end: relevant as
> judged by public vote.

Links generated by 'other sites' != 'links generated by visitors'. The visitors
who 'generate links' are not ordinary visitors. They are 'peers' (people who
create content - not buyers or visitors).

> > > > So, one can say that it costs money to get ranked
> > > > unless one has some really desirable commodity.
> > >
> > > Correct, and rightly so. That commodity is content.
> >
> > Everyone has at least as much content as you have.
>
> *relevant* content. Personaly I don't know that everyone guy, but I
> doubt he has as much relevant PIC info on his site as I have.

The everyone guy is like the [Any key] key. Except that there are a LOT of them.
I think that you overestimate yourself a little. F.ex. for this key:

 http://www.google.com/search?q=pic+programmer+microprocessor&hl=en&start=60

You appear on page 7, and with a FAQ, not with a product. For PIC programmer
only you appear on page 3.

Anyway do not underestimate the 16,000 pages that link to you imho.

Peter P.


2007\05\09@021031 by wouter van ooijen

> Links generated by 'other sites' != 'links generated by
> visitors'.

Of course, visitors don't generate links. Links are made by other people
who think your site is worth linking to.

> > > Everyone has at least as much content as you have.
> >
> > *relevant* content. Personaly I don't know that everyone guy, but I
> > doubt he has as much relevant PIC info on his site as I have.
>
> The everyone guy is like the [Any key] key. Except that there
> are a LOT of them. I think that you overestimate yourself a
> little. F.ex. for this key:
>
> www.google.com/search?q=pic+programmer+microprocessor&hl=en&start=60
>
> You appear on page 7, and with a FAQ, not with a product. For PIC
> programmer only you appear on page 3.

Which would be only wrong if there was a good reason that I would be
higher. On the search terms you specify, do you think I should be
higher? I don't.

> Anyway do not underestimate the 16,000 pages that link to you imho.

Why do you think I underestimate those?

Wouter van Ooijen

-- -------------------------------------------
Van Ooijen Technische Informatica: http://www.voti.nl
consultancy, development, PICmicro products
docent Hogeschool van Utrecht: http://www.voti.nl/hvu



2007\05\09@023756 by Pete

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, May 09, 2007 at 08:10:25AM +0200, wouter van ooijen wrote:
> > Links generated by 'other sites' != 'links generated by
> > visitors'.
>
> Of course, vistors don't generate links. Links are made by other people
> who think yout site is worth linking to.

Also note that many wikis, Wikipedia included, and discussion boards
use what is known as "nofollow" (rel="nofollow") on links. Basically any
external links in user-submitted content get that attribute added, which
instructs search engines to ignore that link for search purposes.

This was added after spammers started filling message boards and other
stuff full of links to their crummy online gambling sites...
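
A small sketch of how a crawler could tell the two kinds of link apart, using Python's standard html.parser. The LinkSorter class and the sample HTML are made up for illustration; how any real search engine treats such links internally is not something this shows.

```python
from html.parser import HTMLParser

class LinkSorter(HTMLParser):
    """Split the links in an HTML fragment into followed and nofollow."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollow = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href is None:
            return
        # rel holds space-separated tokens, e.g. rel="external nofollow".
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollow.append(href)
        else:
            self.followed.append(href)

# Hypothetical page: one link placed by the site owner, one from a user comment.
html = (
    '<p><a href="http://www.voti.nl">editor link</a> '
    '<a rel="nofollow" href="http://spam.example/">user comment link</a></p>'
)
sorter = LinkSorter()
sorter.feed(html)
print(sorter.followed, sorter.nofollow)
```

Only the links in `followed` would be candidates for counting toward the target's ranking under this scheme.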

- --
http://petertodd.ca
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGQWwy3bMhDbI9xWQRAgfLAKCqfl3QAVWJU9s3qJBh7+aAYXFgqgCeMOpD
1TBk9zpheic6Pj4TFo7yCYs=
=Xqwg
-----END PGP SIGNATURE-----

2007\05\09@052019 by peter green



{Quote hidden}

but many don't.

one way to get rating for your site that i've used before is to go to every forum you've ever posted on (easier if you use the same name on all of them) and put your website in your profile.



2007\05\09@054023 by Peter P.

Paul Hutchinson <paullhutchinson <at> yahoo.com> writes:

> Would you please provide a search term(s) that produces a Google result that
> illustrates your point.

Sorry for the late answer. I cannot give a direct query, but here is an
alternate way:

- enter some search terms in Google
- take the websites for the first two pages of results and search each
- note the order of the websites (which should reflect ranking), and the link
count for the search on those websites (which does not)

Of course this is far from perfect, but it busts the myth about 'relevant
results sorted by ranking determined by link count'.

Peter P.

2007\05\09@061038 by Peter P.

wouter van ooijen <wouter <at> voti.nl> writes:
> > Anyway do not underestimate the 16,000 pages that link to you imho.
>
> Why do you think I underestimate those?

In the context of your ranking on the 3rd or 7th page with such simple search
keys, it is likely that the 16,000 links count a lot more than the search keys
proper (which should appear on a few millions of pages at least). So the links
may matter more than the search keys for certain weights thereof in the
algorithm (just guessing here).

Peter P.


2007\05\09@063650 by Jake Anderson

Peter P. wrote:
{Quote hidden}

Don't forget that google also rates the links from sites too.
I.e. a link from Wikipedia is worth more than a link from hawtXXX
gambling and warex.com

2007\05\09@064602 by Tony Smith

> Paul Hutchinson <paullhutchinson <at> yahoo.com> writes:
>
> > Would you please provide a search term(s) that produces a Google
> > result that illustrates your point.
>
> Sorry for the late answer. I cannot give a direct query, but
> here is an alternate way:
>
> - enter some search terms in Google
> - take the websites for the first two pages of results and search each
> - note the order of the websites (which should reflect
> ranking), and the link count for the search on those websites
> (which does not)
>
> Of course this is far from perfect, but it busts the myth
> about 'relevant results sorted by ranking determined by link count'.
>
> Peter P.


What myth?

Google haven't done that for yonks.  Your ranking is determined by links,
where the links come from, if you link back, content, age of content, site
updates, lack of 'tricks', easy navigation, do your ads get clicked etc.
Says as much here
<http://www.google.com/support/webmasters/bin/answer.py?answer=35769>.

No doubt you've been reading this -
<http://googlewebmastercentral.blogspot.com/> after adding this -
<http://www.google.com/support/webmasters/> to your bookmarks.  

An answer other than "of course they'd say that" would be nice.

If I want to 'blink a led', I'm glad Wouter forked over the cash, otherwise
I'd never find it in the 1.5 million pages that show up.  Of course, 'flash
a led' is a bit tougher, either Wouter refused to pay in order to get ahead
of 56 million others, or just maybe he never says 'flash a pic', and no-one
else calls it the 'flash a led' in their links.  Tough call.

This is starting to sound like every other conspiracy nutter thread - "I
can't prove it but I know it's true".

What colour helicopters do Google use?  Stripes?  No use looking for them in
GoogleMaps of course, they'd be photoshopped out.

Tony

2007\05\09@091822 by Jinx

> What colour helicopters do Google use?  Stripes?  No use looking
> for them in GoogleMaps of course, they'd be photoshopped out

I heard that. My niece's boyfriend's cousin works for a cleaner who
does an IT firm and she said....

2007\05\09@095206 by Tony Smith

> > What colour helicopters do Google use?  Stripes?  No use
> looking for
> > them in GoogleMaps of course, they'd be photoshopped out
>
> I heard that. My niece's boyfriend's cousin works for a
> cleaner who does an IT firm and she said....


Yeah, cleaners get all the dirt...

Tony

2007\05\09@112502 by William Chops Westfield

>> Would you please provide a search term(s) that produces
>> a Google result that illustrates your point.
>>
It's been particularly annoying that you USED to be able to
enter a part number of a relatively obscure chip and have the
manufacturer's spec sheet show up high in the rankings.  Now,
you wind up with a whole bunch of data sheet subscription
services that want money before they'll feed up the datasheet,
and a bunch of sales sites, a good portion of which don't actually
HAVE the datasheet or the part you requested anyway.  For instance, try
"tmp47p443"...

I don't think this is google being bought, I think it's just
sites that have learned how to manipulate the search engines.

BillW

2007\05\09@120033 by Peter P.

Jake Anderson <jake <at> vapourforge.com> writes:

> Don't forget that google also rates the links from sites too.
> IE a link from wikipedia is worth more than a link from hawtXXX
> gambeling and warex.com

Yes, it's a 'brothers' scheme. A lot of people know you and you know me so a lot
of people will know me too. A so-called network of trust. If a site trusts you
and puts a link in to your page(s) (positive or negative) then you are better
'known'.

Yet it is a better scheme than a 'democratic' search that goes by relevant word
count alone (like htdig, glimpse etc).

Peter P.


2007\05\09@120813 by Peter P.

William "Chops" Westfield <westfw <at> mac.com> writes:

> I don't think this is google being bought, I think it's just
> sites that have learned how to manipulate the search engines.

It is not 'sites' it is a specialized skill that requires full time work. And
pay proportional to that. With the exception of gifted and lucky admins who get
the job done the vast majority of the 'high rankers' were 'optimized'.
Unfortunately a lot of 'optimizers' are hoaxes and there is no easy way to tell
them apart. I just read recently that a firm hired an optimizer and that
resulted in the entire site being 'optimized' into Google Hell for some time,
losing millions of hits over weeks and months.

Peter P.


2007\05\09@122804 by Tony Smith

> >> Would you please provide a search term(s) that produces a Google
> >> result that illustrates your point.
> >>
> It's been particularly annoying that you USED to be able to
> enter a part number of a relatively obscure chip and have the
> manufacture's spec sheet show up high in the rankings.  Now,
> you wind up with a whole bunch of data sheet subscription
> services that want money before they'll feed up the
> datasheet, and a bunch of sales sites,  good portion of which
> don't actually HAVE the datasheet or the part you requested
> anyway.  For instance, try "tmp47p443"...
>
> I don't think this is google being bought, I think it's just
> sites that have learned how to manipulate the search engines.
>
> BillW


Some sites serve up different content depending on who's visiting.

A lot of it is no big deal, media sites lock out non-US visitors, some
switch languages based on IP and then there are changes made depending on
your browser.

Naturally, it's not hard to spot the Google spider, and give it access to an
abridged PDF (ya don't want the lot ending up in the cache!).  AKA 'doorway'
pages.  Google takes a dim view of this, and banishes those it catches.  I'm
not sure how it catches them though, a second spider disguised as a browser?
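
A rough sketch of why a second crawler that identifies itself as a browser would catch this, assuming the site switches content purely on the User-Agent header. Both functions are hypothetical and the User-Agent strings are only examples; how Google actually detects it is not known from this thread.

```python
def choose_content(user_agent):
    """Return which variant a cloaking site would serve for this User-Agent."""
    if "googlebot" in user_agent.lower():
        return "abridged preview for the crawler"
    return "login page for everyone else"

def looks_cloaked(fetch):
    """Compare what a crawler UA and a browser UA receive from the same fetcher."""
    as_crawler = fetch("Googlebot/2.1 (+http://www.google.com/bot.html)")
    as_browser = fetch("Mozilla/5.0 (X11; Linux x86_64)")
    return as_crawler != as_browser

# The cloaking site is flagged; a site that serves everyone the same page is not.
print(looks_cloaked(choose_content))
print(looks_cloaked(lambda user_agent: "same page for everyone"))
```

A site that switches on IP address rather than User-Agent would need a different check, so this covers only the simplest case.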

Flash sites are worse, they never show up.  Then again, they don't deserve
to.

Tony

2007\05\09@125942 by Tony Smith

> William "Chops" Westfield <westfw <at> mac.com> writes:
>
> > I don't think this is google being bought, I think it's just sites
> > that have learned how to manipulate the search engines.
>
> It is not 'sites' it is a specialized skill that requires
> full time work. And pay proportional to that. With the
> exception of gifted and lucky admins who get the job done the
> vast majority of the 'high rankers' were 'optimized'.
> Unfortunately a lot of 'optimizers' are hoaxes and there is
> no easy way to tell them apart. I just read recently that a
> firm hired an optimizer and that resulted in the entire site
> being 'optimized' into Google Hell for some time, losing
> millions of hits over weeks and months.
>
> Peter P.


That would be the diamond guy (Forbes.com).  Tough luck, buddy, you gambled
and lost.

Sorry, but the people complaining are the very ones who would like to pay
Google money (after all, they pay the SEO scammers).  The fact the SEOs and
not Google get the cash kinda knocks a hole in your argument.

Google make it clear how to get to the top.  Lots of content, lots of text,
name your pages properly (i.e. not Page1, Page2 etc), update regularly,
attract links, don't swap links, don't do stupid tricks (like the diamond
guy) and so on.

Wouter's blink-a-led page is a perfect example of how to do it right.  It
gets updated, simple layout, the content, title & page name are all
'blink-a-led', lots of links to it, no 1 point high white text, no 'Playboy'
in the meta-tags and so on.  That's Google's 'big secret' - don't be an
arsehole.

Complaining that companies like eBay or Amazon get 2-day old pages highly
ranked while yours is at the bottom misses the point.  People WANT ebay,
Amazon etc to show up on top.  A month old eBay is almost worthless.  They
are highly popular sites, update extremely often and - get this - people
click those links in their search results.  If you want to be ranked high,
then get popular.  To get popular, get a reason.

Think of it this way, if I type "US President" into Google, I'll get pages
about George Bush, which is what I expect.  After the 2008 elections, I want
to see results about the new President.  This is despite nearly every "US
President" link on the 'net pointing to George Bush.  Google will rapidly
promote links to the new person, based on web sites it considers
authorities.  These will be 'high turnover' sites, primarily news based.
Once they stop talking about Bush, those old links will drop in the
rankings, and THAT'S WHAT I WANT.  A straight 'popularity' contest, (most
links wins) means George Bush will top the ranking for years, DO NOT WANT.

Wouter's site shows up quite a lot in Google searches, and that's because he
has gained a good reputation, i.e. he's an authority.  

While Wouter is the 'blink-a-led' guy, he's not the 'flash-a-led' guy.  If
Google ever figures out that flash & blink are the same thing, then he'll
probably rank high on that search too.  It wouldn't hurt if he changed
'blink' to 'flash' a couple of times.  Since that page is a high ranking
one, it should rank fairly high for all its terms.

It's worth a try, it would be interesting to see what happens.

Anyhoo, sorry folks, it means you need to pull your finger out, sit down,
shut up and get to work.

Tony

2007\05\09@135805 by Pete

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thu, May 10, 2007 at 02:27:39AM +1000, Tony Smith wrote:
{Quote hidden}

It really must be.

Search for any of my pages, and a decent number will have weird crap in
the google results from my "put up the webservers logs" background.

For awhile I had some code that would detect the google spider and
simply disable that stuff. But I noticed that every new page I put up
would work... then about a week or two later that log crap would show up
again.

Google definitely has second spiders.

> Flash sites are worse, they never show up.  Then again, they don't deserve
> to.

I agree 100%...

Fortunately lots of artists in competition with me have flash sites...
I can think of a few very well known ones who don't even show up.

- --
http://petertodd.ca
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGQgvE3bMhDbI9xWQRAteZAKCiFf5hCofT6gBN43d9UgrvcxMzdQCfVoYT
jmqCgmjrCf9s/Zh2dkPgVe0=
=ZBLg
-----END PGP SIGNATURE-----

2007\05\09@165128 by wouter van ooijen

> Flash sites are worse, they never show up.  Then again, they
> don't deserve to.

Don't tell anyone! I would lose a significant part of my sales when the
big Dutch electronics sellers would make a google-accessible website.
(no need to make it google-optimal, just accessible would be enough).

Wouter van Ooijen

-- -------------------------------------------
Van Ooijen Technische Informatica: http://www.voti.nl
consultancy, development, PICmicro products
docent Hogeschool van Utrecht: http://www.voti.nl/hvu



2007\05\09@165128 by wouter van ooijen

> I don't think this is google being bought, I think it's just
> sites that have learned how to manipulate the search engines.

and google engineers that have not yet paid attention to this. I think
such sites will be punished when the googlemasters find out and find the
time to do something about it. there are some interesting stories on the
web of 'google boost services' that indeed boosted your website, up to
the moment google had analysed their techniques.

Wouter van Ooijen

-- -------------------------------------------
Van Ooijen Technische Informatica: http://www.voti.nl
consultancy, development, PICmicro products
docent Hogeschool van Utrecht: http://www.voti.nl/hvu



2007\05\09@165131 by wouter van ooijen

> and the link count for the search on those websites
> (which does not)

which link count?

> Of course this is far from perfect, but it busts the myth
> about 'relevant results sorted by ranking determined by link count'.

I have never heard of this myth. AFAIK it is some form of *weighted*
counting of the links. There are articles on the web that explain this
to some depth.
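
One published form of such weighted counting is the PageRank idea, where a link is worth more when it comes from a page that is itself well linked. The sketch below is a plain power iteration over a made-up link graph; the damping value, the page names, and the function name are assumptions for illustration, and nothing here claims to match what Google actually runs.

```python
def weighted_link_scores(links, damping=0.85, iterations=50):
    """Power-iteration sketch of a PageRank-style score.

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    score = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its own score evenly among the pages it links to.
                share = damping * score[page] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                for t in pages:
                    new[t] += damping * score[page] / n
        score = new
    return score

# Hypothetical web: "a" and "b" each have exactly one inbound link,
# but a's link comes from "hub", which three other pages link to.
web = {
    "x": ["hub"], "y": ["hub"], "z": ["hub"],
    "hub": ["a"],
    "obscure": ["b"],
}
scores = weighted_link_scores(web)
print(scores["a"] > scores["b"])
```

Both "a" and "b" have a raw link count of one, yet "a" scores higher, which is the difference between weighted counting and a plain link count.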

Wouter van Ooijen

-- -------------------------------------------
Van Ooijen Technische Informatica: http://www.voti.nl
consultancy, development, PICmicro products
docent Hogeschool van Utrecht: http://www.voti.nl/hvu



2007\05\09@165131 by wouter van ooijen

> If I want to 'blink a led', I'm glad Wouter forked over the
> cash, otherwise I'd never find it in the 1.5 million pages
> that show up.

I am the first (I never saw that one:)! That must have been a huge
payment, I want that money back!

Wouter van Ooijen

-- -------------------------------------------
Van Ooijen Technische Informatica: http://www.voti.nl
consultancy, development, PICmicro products
docent Hogeschool van Utrecht: http://www.voti.nl/hvu



2007\05\09@172610 by Jinx


> Yeah, cleaners get all the dirt...

;-)

2007\05\09@181919 by Nate Duehr


> Harold Hallikainen <harold <at> hallikainen.org> writes:

> is nothing sinister about it (but I'd still want to know why those
> ieee.org
> 'sell' pages get ranked so high without having content accessible to usual
> browsers w/o paying).

IEEE probably allows the GoogleBot into the PDF's... somehow.  As someone
pointed out, masquerading as the GoogleBot doesn't work...

Doing that is against Google's policy, and someone COULD report them and
get all of IEEE pulled, if they wanted to...

There is a "report a website breaking our terms" in the webmaster tools at
Google, if someone's got the cojones to report 'em as "bad guys".

(Personally I don't care enough.)

--
Nate Duehr, WY0X

2007\05\09@185152 by Gerhard Fiedler

Nate Duehr wrote:

>> Harold Hallikainen <harold <at> hallikainen.org> writes:
>
>> is nothing sinister about it (but I'd still want to know why those
>> ieee.org 'sell' pages get ranked so high without having content
>> accessible to usual browsers w/o paying).
>
> IEEE probably allows the GoogleBot into the PDF's... somehow.  As
> someone pointed out, masquerading as the GoogleBot doesn't work...

There's nothing very special about a GoogleBot that IEEE could know and
nobody else. And if just onebody else knew about it, all would have free
access... so I don't think this is a go.

And -- as mentioned here before, Google /knows/ that the IEEE pages it
indexes are not public. I still find it quite odd to find non-public
content indexed and probably never will get used to this. That goes for
subscription-only news sites, too.

I don't think this is because they are so popular. I'm sure the ones who
have access to IEEE or would buy their papers based on a Google search are
a dwindling minority of Google users. All others just get annoyed by this.
There must be another reason... and as always, money leads the suspicions.

Gerhard

2007\05\09@190917 by Nate Duehr

As a side-comment, you guys *do* know about the Google webmaster tools,
right?

Plenty of info on this site to explain how to move a page up in the
rankings, etc...

https://www.google.com/webmasters/tools/docs/en/about.html

--
Nate Duehr, EraseMEnatespam_OUTspamTakeThisOuTnatetech.com

2007\05\10@112344 by Tony Smith

> As a side-comment, you guys *do* know about the Google
> webmaster tools, right?
>
> Plenty of info on this site to explain how to move a page up
> in the rankings, etc...
>
> www.google.com/webmasters/tools/docs/en/about.html
>
> --
> Nate Duehr, natespamspam_OUTnatetech.com


Where's the fun in that?  It's far easier to think up a conspiracy theory -
"I'm being oppressed by the man!" than to actually do any work.

It's not as if it's hard work anyway, what Google recommends is what most
people recommend as good site design.  Fancy that.  I ran a site years back
that made it to number one in its niche, I didn't do anything special.

Tony

2007\05\10@113509 by Tony Smith


Why don't you use robots.txt like you're supposed to?

That's exactly the sort of thing that gets you kicked out of Google.
Serving up a different result to the Google spider than what a browser would
see means you're trying to rig the system.  Browsers see spam, spider sees
keywords.  Tsk, naughty!

Anyway, I doubt the spider runs Javascript, so it may not have even noticed
unless you were doing it server-side.

Tony

2007\05\10@113921 by Tony Smith


I remember ages ago there was talk that a lot of the web would never get
indexed, being subscriber-only info, like IEEE, scientific journals, etc.
The proposed solution was if you wanted your site indexed, you gave Google a
subscription, and the deal was it didn't cache the pages.

Doesn't Google News work like that?

It's probably something simple like the site sees you aren't logged in or
don't have a cookie set (as would happen with a spider), and displays a
preview of the PDF.  It's that preview that shows up in the search.  I know
sites were switching content for the spider, but that can get you banned.

I know it won't happen, but has anyone asked the IEEE webmaster -
<http://ieee.org/web/services/general/webmaster.html>?  I presume there's a
member or two here.  $10 says money doesn't change hands.

Tony

2007\05\10@114901 by wouter van ooijen

> I ran a site years back that made it to number one in
> its niche, I didn't do anything special.

Nothing, besides offering content that other people felt worth linking
to, not hiding your content in flash or php pages, not using funny
tricks...

Those *&*&^(^()$# people at google actually try to emulate the popular
vote! Blame on them, why can't they think like any normal commercial
outfit?

:)

Wouter van Ooijen

-- -------------------------------------------
Van Ooijen Technische Informatica: http://www.voti.nl
consultancy, development, PICmicro products
docent Hogeschool van Utrecht: http://www.voti.nl/hvu



2007\05\10@130157 by Paul Hutchinson

> -----Original Message-----
> From: @spam@piclist-bouncesKILLspamspammit.edu On Behalf Of Tony Smith
> Sent: Thursday, May 10, 2007 11:39 AM
>
> I remember ages ago there was talk that a lot of the web
> would never get indexed, being subscriber-only info, like
> IEEE, scientific journals, etc. The proposed solution was
> if you wanted your site indexed, you gave Google a
> subscription, and the deal was it didn't cache the pages.

The last time this issue came up here I wanted to mention something I
remember reading about it quite a while back. However, I couldn't locate
what I had read about Google and standards bodies, so I didn't mention it. I
think it was from around 2001 and in one of the trade rags, EET I think.

Anyway, what I remember reading was that Google was trying to get standards
bodies to allow them to index their documents for search results while
protecting the standards bodies' income source. While I can't find the old
article, it seems to match up with the information at Google Scholar.
<http://scholar.google.com/intl/en/scholar/publishers.html>

Google Scholar and Google books take the attitude that it is better to let
you know that the information is out there even if you can't access the
complete information for free. Personally it doesn't bother me to see a
standards body restricted access document listed first in the results. If I
really need to see the information I'll head to a library or have my
employer buy it.

Paul  


2007\05\10@133020 by Paul Hutchinson

> -----Original Message-----
> From: KILLspampiclist-bouncesKILLspamspammit.edu On Behalf Of William Chops Westfield
> Sent: Wednesday, May 09, 2007 11:17 AM
>
> >> Would you please provide a search term(s) that produces
> >> a Google result that illustrates your point.
> >>
> It's been particularly annoying that you USED to be able to
> enter a part number of a relatively obscure chip and have the
> manufacturer's spec sheet show up high in the rankings.  Now,
> you wind up with a whole bunch of data sheet subscription
> services that want money before they'll feed up the datasheet,
> and a bunch of sales sites, a good portion of which don't actually
> HAVE the datasheet or the part you requested anyway.  For instance, try
> "tmp47p443"...

Great, something to try, thank you. I seem to remember the last time I tried
this, a year or more ago, it was frustrating. Google on tmp47p443 and there it
is, only data-sheet archives and aggregators. First hit is DigChip.com, a
members-only site, but it does show it to be a Toshiba part. Go to Toshiba's
web site and search: not found, which explains why the manufacturer's site
didn't come up in the Google search. Second result is clearly an obsolete
parts sales site so I skip it. The 3rd result is datasheets.org.uk; I try it
and success, the tmp47p443 data sheet without any signup.

That was too easy, I do seem to recall that it was harder finding data
sheets for obsolete parts. I check the date on the datasheet and see it's
from 2000 so I try an older part. The MC146823 was obsolete more than 10
years ago but, there's a copy of the data book pages at the first result,
datasheet4u.com. I try some more:
28c16 - 3rd result alldatasheet.com
MC146818 - 1st result datasheetcatalog.com
ad7533 - 1st result original manufacturer Analog Devices

Well, either my memory of how hard it was to find datasheets for obsolete
parts is wrong, or Google and the free datasheet sites have improved things.

In any case it's worth noting that these sites have many old datasheets for
free.
datasheets.org.uk
datasheet4u.com
alldatasheet.com
datasheetcatalog.com
datasheetarchive.com

Paul

>
> I don't think this is google being bought, I think it's just
> sites that have learned how to manipulate the search engines.
>
> BillW

2007\05\10@153424 by Peter Todd

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Fri, May 11, 2007 at 01:18:12AM +1000, Tony Smith wrote:
{Quote hidden}

No, the situation is more complex than that.

See, basically I'm presenting a page that google can't understand very
well. That's because each page has, for purely aesthetic reasons, a whole
pile of server logs. Google thinks that text is important, instead of
decoration, and shows it on search results.

In a sense it is. I get a *lot* of people coming to my webpage, like 50
or so per day, from google searches that match the zillions of worms and
stuff trying to break into my system. Heck, I sometimes get calls from
very confused sysadmins who somehow think that because the worm's IP
shows up on my server logs too I'm trying to hack into their system.

> That's exactly the sort of thing that gets you kicked out of Google.
> Serving up a different result to the Google spider than what a browser would
> see means you're trying to rig the system.  Browsers see spam, spider sees
> keywords.  Tsk, naughty!

Only in this case I'm actually trying to *prevent* google from seeing
keywords.

> Anyway, I doubt the spider runs Javascript, so it may not have even noticed
> unless you were doing it server-side.

Which is exactly it... I do do it server-side, with a very simple php
line that's literally:

<?= print(`head -n 100 /var/log/apache2/petertodd.ca-access.log`); ?>

But doing it with javascript is really a very good idea...

- --
http://petertodd.ca
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD4DBQFGQ3O9pEFN739thowRAmuGAJ9kvOUMZ7xqSSuOzOyQYzfan2F+YQCWJLCd
z4UxXeL668yY/DRI7haQYQ==
=pK7w
-----END PGP SIGNATURE-----

2007\05\10@164530 by Nate Duehr


> For awhile I had some code that would detect the google spider and
> simply disable that stuff. But I noticed that every new page I put up
> would work... then about a week or two later that log crap would show up
> again.

Easier just to go to their webmaster site and put in a request to them to
not index your site, if that was the goal.

They also honor (unlike some unscrupulous crappy search engines) the
robots.txt file, just fine.  In fact I threw up a robots.txt file to stop
them and others from indexing it while it was on low-bandwidth, and their
webmaster tools still show it blocked now that robots.txt was removed a
month ago... don't care, but just pointing out how "effectively" they
follow that file...

--
Nate Duehr, RemoveMEnateTakeThisOuTspamnatetech.com
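
[Editor's sketch] The robots.txt behaviour described above can be checked locally before a crawler ever visits. This is a minimal sketch using Python's standard urllib.robotparser; the robots.txt content, the example.com URLs and the helper name is_allowed are assumptions made up for illustration, not anything taken from the sites discussed in this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com and both paths are placeholders.
blocked = is_allowed(ROBOTS_TXT, "Googlebot", "http://example.com/private/notes.html")
allowed = is_allowed(ROBOTS_TXT, "Googlebot", "http://example.com/index.html")
print(blocked, allowed)
```

This only shows what a well-behaved crawler should conclude from the file; it says nothing about how long a search engine keeps treating a site as blocked after the file is removed.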

2007\05\10@173254 by Harold Hallikainen


They seem to ignore crawl-delay, which can really eat up bandwidth. It'd be
nice if they only requested a page a minute instead of one every second or
so...

Harold

--
FCC Rules Updated Daily at http://www.hallikainen.com - Advertising
opportunities available!
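
[Editor's sketch] Crawl-delay is a non-standard robots.txt extension, so whether a given crawler honors it is up to that crawler. For anyone writing their own spider, a minimal sketch of reading the value with Python's standard urllib.robotparser might look like this; the robots.txt content, the "ExampleBot" name, the polite_delay helper and the 1-second fallback are all assumptions chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking for one request per minute.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 60
Disallow:
"""


def polite_delay(robots_txt: str, user_agent: str, default: float = 1.0) -> float:
    """Seconds to wait between requests, per robots_txt, else default."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # crawl_delay() returns None when no Crawl-delay applies to this agent.
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default


delay_seconds = polite_delay(ROBOTS_TXT, "ExampleBot")
print(delay_seconds)
```

A crawler loop would then sleep for delay_seconds between page requests. Nothing here forces a third-party spider to do the same.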

2007\05\10@183359 by Peter P.

Ookay, here we go again: blog for $$:

www.theglobeandmail.com/servlet/story/RTGAM.20070509.wgtblogvertise0510/BNStory/GlobeTQ/home

Google says to that: "Look, if you want to buy links for traffic, fine. Just
don't make it so they affect search engines".

Peter P.


2007\05\10@194458 by Gerhard Fiedler

Paul Hutchinson wrote:

> Google Scholar and Google books take the attitude that it is better to
> let you know that the information is out there even if you can't access
> the complete information for free. Personally it doesn't bother me to
> see a standards body restricted access document listed first in the
> results. If I really need to see the information I'll head to a library
> or have my employer buy it.

I agree with those results on Google Scholar and Google Books; after all,
that's what they are for. But it really bugs me when the first page of hits
on a search item in plain Google turn out to be irrelevant or paid sources
-- which are irrelevant for me, and for a large majority of users. There
was a time where this was not so.

Whether money changes hands or not, indexing sources where the subscribers
are only a tiny minority among the general Google users is more a nuisance
than a service. It is not much more than advertising for those paid-for
services, without much usefulness otherwise.

I really would like an option to blank those results out. They do me no
good. I found Google was better when they weren't there.

Gerhard

2007\05\10@213039 by William Chops Westfield


On May 10, 2007, at 4:44 PM, Gerhard Fiedler wrote:

> But it really bugs me when the first page of hits on a search
> item in plain Google turn out to be irrelevant or paid sources
> which are irrelevant for me, and for a large majority of users.

We're still waiting for an example search.  My example of
"electronic part names" turned out to usually have "real"
results within the first page (though there were also the
sites you complain about...)

BillW

2007\05\10@213039 by William Chops Westfield

On May 10, 2007, at 10:30 AM, Paul Hutchinson wrote:

> That was too easy

It's not currently AWFUL, and it would be better if I kept
mental track of which of the archives were "members only"
or not.  It doesn't help that there are so many very similar
site names.  I think I've had the worst luck with some "generic"
transistor parts...


> I do seem to recall that it was harder finding data
> sheets for obsolete parts.

Yes, there's a warm place in my heart for manufacturers that
maintain an archive of (at least) datasheets for their obsolete
parts.  When I first bought my TMP micros for (supposedly)
4-bit microcontroller experiments, I was able to download quite
a bit of info from Toshiba, which lulled me into a false sense
of security...

BillW

2007\05\10@221325 by Gerhard Fiedler

William Chops Westfield wrote:

> We're still waiting for an example search.  

When you need them they don't come :)  It's usually something more esoteric
than part numbers that brings up those articles. I'll try to remember this
discussion and post some searches when I get those results the next time :)

Gerhard

2007\05\12@101634 by Tony Smith

> > Why don't you use robots.txt like you're supposed to?
>
> No, the situation is more complex than that.
>
> Only in this case I'm actually trying to *prevent* google
> from seeing keywords.
>
> > Anyway, I doubt the spider runs Javascript, so it may not have even
> > noticed unless you were doing it server-side.
>
> Which is exactly it... I do do it server-side, with a very
> simple php line that's literally:
>
> <?= print(`head -n 100 /var/log/apache2/petertodd.ca-access.log`); ?>
>
> But doing it with javascript is really a very good idea...


What's so complex about it?  Add a line to robots.txt excluding
/var/log/apache2 and you're done.

I realised a while back (but not soon enough) that if I was doing something
'complex' then I was probably doing it wrong.  Of course I had my reasons
for doing so...

BTW, if you want to keep your site in Google, I'd avoid playing with the
spider if I was you.

Tony
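
[Editor's sketch] One caveat on the robots.txt suggestion: Disallow rules are matched against the URL path a crawler requests, not against filesystem paths on the server. A log that is read server-side and embedded in a page is therefore not hidden by a rule naming the log directory. The sketch below illustrates this with Python's standard urllib.robotparser; example.com and index.php are placeholders, not the actual site layout.

```python
from urllib.robotparser import RobotFileParser

# Rule naming the filesystem directory mentioned in the post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /var/log/apache2
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The page that embeds the log is requested under its own URL path,
# so the rule does not apply to it (index.php is a made-up page name).
page_fetchable = parser.can_fetch("Googlebot", "http://example.com/index.php")

# The rule only matters if the log were itself served at this URL path.
log_url_fetchable = parser.can_fetch(
    "Googlebot", "http://example.com/var/log/apache2/access.log"
)
print(page_fetchable, log_url_fetchable)
```

Keeping the log text out of the index would instead mean excluding the URLs of the pages that display it, or not emitting the log into those pages.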

2007\05\12@102143 by Tony Smith


Fixing subscriptions is harder than it looks.

You have the ones mentioned which are paid only, but Google magically
manages to read.

Then there are situations when the content is free, but requires free
registration, such as YahooGroups and some newspapers.  You have
http://www.newscientist.com which provides page 1 of some articles, but you need to
subscribe to see the rest.  Newspapers like http://www.smh.com.au show articles for
10 days, content older than this requires payment.

Removing subscriber links will remove a lot of stuff.  Ok, YahooGroups stuff
usually doesn't show up, but it would be nice if it did.

Like Paul, I'd prefer to see that the information exists, but perhaps Google
can add a "Subscriber" or "$ubscriber" code to the result.

Tony

2007\05\12@102211 by Tony Smith

> Ookay, here we go again: blog for $$:
>
> www.theglobeandmail.com/servlet/story/RTGAM.20070509.wgtblogvertise0510/BNStory/GlobeTQ/home
>
> Google says to that: "Look, if you want to buy links for
> traffic, fine. Just don't make it so they affect search engines".
>
> Peter P.


Ok, but that's not "Give Google money to be number #1, OMG it's a
conspiracy!!!", that's just advertorials.  Big deal.  Move along, nothing to
see.

Toss in normal advertising, astroturfing and viral marketing for more fun &
games.

Newspapers and magazines have done this for years.  Whether they say as much
varies, for one Australian reviewer the joke was the review depended on the
bottle of wine you sent him.

Theoretically, the situation is self-correcting.  If a popular site starts
adding a lot of paid reviews, it loses its reputation (people stop linking)
and it slides down the scale.  Google shouldn't do anything special with
these sites.

Look at something like http://www.tomshardware.com, which I haven't bothered to
look at for years.  I stopped when they'd break a story over 50 pages to
increase the ad hits.  I'd never link to it, (presumably neither would
others), and I can't recall the last time it showed up in a hardware search.
Funny that.

Besides, you can give Google money now to show up on the first page, so why
do this?

Tony

PS: (anyone bothered to ask IEEE how Google indexes their PDFs yet?)

2007\05\12@171949 by Gerhard Fiedler

Tony Smith wrote:

> PS: (anyone bothered to ask IEEE how Google indexes their PDFs yet?)

I don't really care about how they /index/ them (unless it's a method I can
hijack :) -- I would like to know how I can make Google not /show/ me those
results I don't have access to.

Gerhard

2007\05\12@172505 by Gerhard Fiedler

Tony Smith wrote:

> Like Paul, I'd prefer to see that the information exists, but perhaps Google
> can add a "Subscriber" or "$ubscriber" code to the result.

I think it boils down to if Google can cache it, it's ok. For the rest
there should be an option to blank it out.

BTW, IMO it doesn't really help to know that "the information exists",
because in order to verify what kind of information that is that exists,
you need to have access to the content. If you don't have it, you still don't
know much.

Gerhard

2007\05\13@002451 by Tony Smith

> > PS: (anyone bothered to ask IEEE how Google indexes their PDFs yet?)
>
> I don't really care about how they /index/ them (unless it's
> a method I can hijack :) -- I would like to know how I can
> make Google not /show/ me those results I don't have access to.
>
> Gerhard


Well, if Google can't index them (as they're not public), then they won't
show up, thus solving your problem.

Since they do show up, (OMG conspiracy!!!), it's a problem.  So, either
money changed hands, IEEE webmaster is playing with spiders, or something
else is happening.  Ask the IEEE and put us all at ease.  Of course the IEEE
will lie, so maybe it's not worth it.

Still waiting for a search that shows $$$ is changing hands.  Funny how the
evidence always seems to vanish just as someone asks for it...  Looks like
Google's helicopters (painted with stripes matching the logo, naturally)
have been busy.  

Tony
