English Amiga Board


Old 22 February 2024, 00:39   #21
Photon
Moderator
 
 
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,642
Search engines are overrated. Use a search box (that isn't itself a search engine). Too many vested interests, as opposed to "search the database for these terms".

"Search" engines don't want you to find things that aren't products. They want to say "ta-da! found it!", put themselves in as the middleman, claim to the seller that they let you find the product (via ad words), and so sell ad words to the seller.

It makes no sense (to suits) that a product should be free and useful. If it's free, it should be as useless to the user as they can make it, as long as it's useful to THEM.

Google got really crappy a long time ago. Try DuckDuckGo, it can't be worse. Maybe it's time for WebCrawlers again.
Old 15 March 2024, 05:09   #22
modrobert
old bearded fool
 
 
Join Date: Jan 2010
Location: Bangkok
Age: 56
Posts: 779
After testing Google search both logged in and, via a proxy from another country, not logged in, it seems the EAB forum is not indexed anymore.

I tried "amiga site:eab.abime.net" and got two results from attachments, but nothing from the forum posts.

When not restricting the search with "site:", I get plenty of results from other Amiga forum sites, but nothing from EAB.

Is this intentional (e.g. blocking google crawl bots, or robots.txt)?

I checked the robots.txt file for eab.abime.net, looks like this:

Code:
User-agent: GPTBot
User-agent: CCBot
User-agent: ChatGPT-User
Disallow: /
User-agent: *
Allow: /
Sitemap: https://eab.abime.net/sitemap_index.xml.gz
Perhaps the crawl bots get confused by "Sitemap:"?
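
One way to test the "Sitemap:" theory is to feed the robots.txt above to a standards-following parser. This is a minimal sketch using Python's standard `urllib.robotparser` (`site_maps()` needs Python 3.8 or later). The thread URL is a hypothetical example, and the result only shows how this particular parser reads the file, not how Googlebot itself behaves.

```python
from urllib.robotparser import RobotFileParser

# robots.txt as posted above, copied verbatim
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: CCBot
User-agent: ChatGPT-User
Disallow: /
User-agent: *
Allow: /
Sitemap: https://eab.abime.net/sitemap_index.xml.gz
"""

def check_access(robots_txt, url, agents):
    """Return ({agent: allowed?}, sitemap list) for one URL under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    allowed = {agent: parser.can_fetch(agent, url) for agent in agents}
    return allowed, parser.site_maps()

# Hypothetical thread URL, for illustration only
THREAD_URL = "https://eab.abime.net/showthread.php?t=1"
access, sitemaps = check_access(ROBOTS_TXT, THREAD_URL, ["Googlebot", "GPTBot", "CCBot"])
print(access)
print(sitemaps)
```

With this parser, Googlebot falls under the `*` group and is allowed, while the three named bots are refused. The `Sitemap:` line is read as a separate directive and does not change either result, so the file as posted does not look like the culprit.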

Last edited by modrobert; 15 March 2024 at 05:18.
Old 15 March 2024, 12:04   #23
pixie
Registered User
 
 
Join Date: May 2020
Location: Figueira da Foz
Posts: 397
I had a simple script that let me search EAB easily, but it doesn't work with Google anymore. So now that's the only use I have for Edge and Bing (that's because even if I change the search to bing.com it goes through Google).
Old 15 March 2024, 14:37   #24
daxb
Registered User
 
Join Date: Oct 2009
Location: Germany
Posts: 3,306
IMO there is no reason for using Google search because there are better alternatives. Just some examples: https://www.privacytools.io/private-search
Old 15 March 2024, 18:24   #25
pixie
Registered User
 
 
Join Date: May 2020
Location: Figueira da Foz
Posts: 397
Quote:
Originally Posted by daxb View Post
IMO there is no reason for using Google search because there are better alternatives. Just some examples: https://www.privacytools.io/private-search
It's a matter of working vs. not working at all. Before, if you set site:abime.net it worked like a charm; now it has ceased to work. I won't go into whether there's a better search or not, since I don't know enough...
Old 18 March 2024, 19:12   #26
Photon
Moderator
 
 
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,642
If there's an indexing issue, you should see it in your Google control panel. Possibly you could try a recrawl.

Probably some of the other search engines get some of their search results from Google, so now we can find out which ones do. Brave and DuckDuckGo don't, it seems. (Or they might, but are serving stored results.)

It certainly seems to be a problem specific to Google, so the WebMaster could check his inbox for some "requires action" email maybe? Or perhaps an HTML validation error that the Google search engine specifically doesn't like. But the most likely reason is always blocking the crawl path to searchable content (i.e. threads) in .htaccess.
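
A blocked crawl path would normally show up as an error status when a thread is requested, while "do not index" signals usually arrive as an `X-Robots-Tag` response header or a robots `<meta>` tag. The sketch below checks a response for those three things. It works on values you have already fetched; the function name and the sample responses are made up for illustration and are not real EAB output.

```python
from html.parser import HTMLParser

class _RobotsMetaFinder(HTMLParser):
    """Collect the content of <meta name="robots"> and <meta name="googlebot"> tags."""
    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k.lower(): (v or "") for k, v in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            self.directives.append(attr.get("content", "").lower())

def indexing_blockers(status, headers, html):
    """List reasons this response might be kept out of a search index."""
    reasons = []
    if status >= 400:
        reasons.append(f"HTTP status {status}")
    for name, value in headers.items():
        if name.lower() == "x-robots-tag" and "noindex" in value.lower():
            reasons.append(f"X-Robots-Tag: {value}")
    finder = _RobotsMetaFinder()
    finder.feed(html)
    for content in finder.directives:
        if "noindex" in content:
            reasons.append(f"meta robots: {content}")
    return reasons

# Made-up responses for illustration only
blocked = indexing_blockers(403, {"X-Robots-Tag": "noindex"}, "<html></html>")
clean = indexing_blockers(
    200,
    {"Content-Type": "text/html"},
    '<html><head><meta name="robots" content="index,follow"></head></html>',
)
print(blocked)
print(clean)
```

An empty list only means none of these three signals is present; it says nothing about why a search engine chooses not to index a page it was able to crawl.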

Lastly, we could try to play the "I'm a company, I'm more important to commercial search engines!" card. Register a Page on Facebook with EAB as a company and put the web address. Commercial search engines should be all over that.
Old 18 March 2024, 23:08   #27
Bren McGuire
Registered User
 
 
Join Date: Nov 2019
Location: Croydon
Posts: 587
Mate, what do you mean? I don't get it. Where do I go to do a generic search? If I understand you correctly, you're talking about going to a specific site and using its search box, but that's not always a possibility, and it makes no sense: the idea of a search engine is to look at many sources for the same info to try to give you the best (or the most) answers.

With that said, what can everyone recommend? I don't really like DuckDuckGo.

Quote:
Originally Posted by Photon View Post
Search engines are overrated. Use a search box (that isn't itself a search engine).
Old 19 March 2024, 01:15   #28
Photon
Moderator
 
 
Join Date: Nov 2004
Location: Eksjö / Sweden
Posts: 5,642
Quote:
Originally Posted by Bren McGuire View Post
Mate, what do you mean? I don't get it. Where do I go to do a generic search? If I understand you correctly, you're talking about going to a specific site and using its search box, but that's not always a possibility, and it makes no sense: the idea of a search engine is to look at many sources for the same info to try to give you the best (or the most) answers.

With that said, what can everyone recommend? I don't really like DuckDuckGo.
I mean that some so-called (but commercial) search engines such as those from Microsoft, Google, and Apple can at any time change the priority of results from "articles online containing what I searched for at the top" to "oh, but first: there was a product/service name or something close to it in one of your search words, and we want to position ourselves between the user and seller so we can approach sellers and say that our service is worth something to them, and find ways to sell search result priority and ad slots to them as a product".

Something like "we herd ppl search for x, and we herd u sell x, so here's a proposal".

I also mean that once you know a site that has lots of useful information, like this one, Stack Overflow, maybe even Reddit or YouTube, you can go to that site and use its search box to narrow your results. They might still not be prioritized by actual relevance. No search engine does that today. "One does not simply search the database for the words." meme.

Except a few sites. They still do that. Then you find what you're actually looking for, instead of this, that, the other thing, and how about this. :/

Even worse is the "you have to log in first, so we know what you like from the data we harvested, so that we can doctor the results and you can never find anything we don't think you like, and so can never find anything new, even if you typed in the exact words for what you want to find".

As for the problem to solve at the moment, crystal ball says EAB webmaster once submitted a directory of EAB to Google, and now Google doesn't support that anymore and the directory must be updated, or the Google account associated with the upload was removed or something along those lines.

I've recently found DuckDuckGo to give me better and more useful results than Google. You also get to click or copy the actual link, instead of a referral link with tracking info that's useless to share. Sometimes the referring takes a few seconds due to poor service by Google.

Give alternatives a try - at least when Google gives you sh#t. Google is far from the "best and only". Maybe 8 years ago, but not now.
Old 19 March 2024, 15:36   #29
RCK
Administrator
 
 
Join Date: Feb 2001
Location: Paris / France
Age: 45
Posts: 3,096
Quote:
Originally Posted by modrobert View Post
After testing Google search both logged in and, via a proxy from another country, not logged in, it seems the EAB forum is not indexed anymore.

I tried "amiga site:eab.abime.net" and got two results from attachments, but nothing from the forum posts.

When not restricting the search with "site:", I get plenty of results from other Amiga forum sites, but nothing from EAB.

Is this intentional (e.g. blocking google crawl bots, or robots.txt)?

I checked the robots.txt file for eab.abime.net, looks like this:

Code:
User-agent: GPTBot
User-agent: CCBot
User-agent: ChatGPT-User
Disallow: /
User-agent: *
Allow: /
Sitemap: https://eab.abime.net/sitemap_index.xml.gz
Perhaps the crawl bots get confused by "Sitemap:"?
This is not intentional.
I tried everything I can to get reindexed (including generating the sitemap, which is correctly found by Googlebot).

They just decided to drop everything from EAB and I don't know why. The only info I have is "explored, but not indexed" for 236k URLs (threads).
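
To cross-check a figure like the 236k URLs against what the sitemap actually lists, the gzipped sitemap index can be unpacked and its `<loc>` entries counted. This is a minimal sketch that assumes the file follows the standard sitemaps.org schema. The sample index and its child file names are made up for illustration, and fetching the real file is left out.

```python
import gzip
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemap_locations(gz_bytes):
    """Return the <loc> URLs listed in a gzipped sitemap or sitemap index."""
    root = ET.fromstring(gzip.decompress(gz_bytes))
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]

# Made-up sitemap index; the child file names are hypothetical
sample = gzip.compress(b"""<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://eab.abime.net/sitemap_1.xml.gz</loc></sitemap>
  <sitemap><loc>https://eab.abime.net/sitemap_2.xml.gz</loc></sitemap>
</sitemapindex>""")
locations = sitemap_locations(sample)
print(len(locations), locations)
```

Running the same function over each child sitemap and summing the lengths would give the number of URLs the site is actually offering to crawlers.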

Last edited by RCK; 29 March 2024 at 11:23.
Old 19 March 2024, 16:35   #30
TCD
HOL/FTP busy bee

 
 
Join Date: Sep 2006
Location: Germany
Age: 46
Posts: 31,813
I found this:
Quote:
Connectivity or Server issues

Server error impacts page indexation in 2 ways:

Page discovered but not indexed
Page explored but not indexed

They both indicate that Google knows that the page exists but doesn't index it. What could be the cause?

This often happens because Google's bot thinks a Crawl will overload your server. So, it backs off and reschedules the crawl.
https://seomatic.ai/blog/programmati...aster-indexing

Hopefully it will resolve itself with the new server hardware.
Old 20 March 2024, 15:59   #31
malko
Ex nihilo nihil
 
 
Join Date: Oct 2017
Location: CH
Posts: 4,955
Quote:
Originally Posted by malko View Post
For daily use, my choice went for startpage.com (think it's older than duckduckgo), which like duckduckgo has an interesting privacy browser plugin as well.
Quote:
Originally Posted by daxb View Post
IMO there is no reason for using Google search because there are better alternatives. Just some examples: https://www.privacytools.io/private-search
Based on the provided description I now know that startpage is 10 years older than duckduckgo
Old 21 March 2024, 08:21   #32
modrobert
old bearded fool
 
 
Join Date: Jan 2010
Location: Bangkok
Age: 56
Posts: 779
Quote:
Originally Posted by RCK View Post
This is not intentional.
I tried everything I can to get reindexed (including generating the sitemap, which is correctly found by Googlebot).

They just decided to drop everything from EAB and I don't know why. The only info I have is "explored, but not indexed" for 236k URLs (threads).
OK, so that sucks. Normally a lot of traffic comes from Google search (when it works); it can be something like 30%, which helps with ad revenue and brings more users to the site.

Quote:
Originally Posted by TCD View Post
I found this:

https://seomatic.ai/blog/programmati...aster-indexing

Hopefully it will resolve itself with the new server hardware.
After reading the "TL;DR" part of the page you linked there, I found this:
Quote:
Exceeding your site's crawl budget (the number of pages Google bot is allowed to crawl daily) by publishing too many pages at once could lead to unindexed pages.
Perhaps I misunderstand, but if this means Google has a max limit of indexed pages from one site, maybe EAB forum is rejected for that reason?
Old 21 March 2024, 08:33   #33
TCD
HOL/FTP busy bee

 
 
Join Date: Sep 2006
Location: Germany
Age: 46
Posts: 31,813
Quote:
Originally Posted by modrobert View Post
Perhaps I misunderstand, but if this means Google has a max limit of indexed pages from one site, maybe EAB forum is rejected for that reason?
I understand this as: Google would only index pages up to the max limit and the rest would be left unindexed. But it seems like EAB isn't indexed at all. I would say that if EAB is still unindexed after the server upgrade, it would be a good idea to 'retire' some of the older threads to get under the limit.
Old 29 March 2024, 11:24   #34
RCK
Administrator
 
 
Join Date: Feb 2001
Location: Paris / France
Age: 45
Posts: 3,096
Quote:
Originally Posted by TCD View Post
I found this:

https://seomatic.ai/blog/programmati...aster-indexing

Hopefully it will resolve itself with the new server hardware.
I don't think so, because most of the time EAB's response times are pretty good.
Old 29 March 2024, 11:29   #35
RCK
Administrator
 
 
Join Date: Feb 2001
Location: Paris / France
Age: 45
Posts: 3,096
Google decided to drop EAB from their index without giving any reason.

- Maybe because the threads don't use canonical URLs --> I can try to change the URL scheme with the new server.
- Maybe because the site is not responsive --> We would need to move to XenForo and lose real Amiga support.
- Maybe because the world is now mobile first --> We would need to move to XenForo and lose real Amiga support.
- Maybe because we are just old, and only fresh content is interesting to them --> I can't do anything about this.
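
The first two guesses in the list above can be checked directly in a page's HTML: a canonical URL is declared with `<link rel="canonical">`, and responsive pages normally carry a viewport `<meta>` tag. A minimal sketch that reports both, assuming you already have the page source as a string. The class and function names, the sample markup, and the thread URL are all made up for illustration.

```python
from html.parser import HTMLParser

class HeadSignals(HTMLParser):
    """Record the canonical URL and viewport meta tag of a page, if present."""
    def __init__(self):
        super().__init__()
        self.canonical = None
        self.viewport = None

    def handle_starttag(self, tag, attrs):
        attr = {k: (v or "") for k, v in attrs}
        if tag == "link" and "canonical" in attr.get("rel", "").lower().split():
            self.canonical = attr.get("href")
        elif tag == "meta" and attr.get("name", "").lower() == "viewport":
            self.viewport = attr.get("content")

def head_signals(html):
    """Return (canonical URL or None, viewport content or None)."""
    parser = HeadSignals()
    parser.feed(html)
    return parser.canonical, parser.viewport

# Made-up page head; the thread URL is hypothetical
page = """<html><head>
<link rel="canonical" href="https://eab.abime.net/showthread.php?t=1">
<meta name="viewport" content="width=device-width, initial-scale=1">
</head><body></body></html>"""
print(head_signals(page))
print(head_signals("<html><head><title>No hints</title></head></html>"))
```

Running this over a saved thread page would show whether either tag is missing, which at least narrows down which of the guesses is worth acting on first.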
Attached Thumbnails: Google_Index_Eab_2024.png (17.6 KB)
Old 29 March 2024, 11:38   #36
RCK
Administrator
 
 
Join Date: Feb 2001
Location: Paris / France
Age: 45
Posts: 3,096
If someone works at Google and can ask why EAB is no longer indexed, I would be happy to know why.
Old 29 March 2024, 15:13   #37
SpeedGeek
Moderator
 
 
Join Date: Dec 2010
Location: Wisconsin USA
Age: 60
Posts: 842
AFAIK, as long as the Google search Bot is not blocked, then it's the Google algorithm which determines the level of indexing.

Could it be that blocking most of the other bots reduced the heavy traffic on EAB, making the Google algorithm think the site is inactive?
Old 16 April 2024, 14:30   #38
Dunny
Registered User
 
 
Join Date: Aug 2006
Location: Scunthorpe/United Kingdom
Posts: 2,048
A while ago, Google announced that they were dropping the caching system they used to employ that you could use when a website went down - it's possible that they just deleted their EAB cache (along with their caches of the rest of the web) and only periodically index new data when they feel like it.
Old 16 April 2024, 23:54   #39
nocash
Registered User
 
Join Date: Feb 2016
Location: Homeless
Posts: 64
When searching for 'eab amiga', google does find the forum main page.
But it does seem to think that the sole content of the page is a link to that Hall of Light thing, without showing any further content or description.

Interestingly, google also offers something like "people also search for amiga forums", but, again, it is apparently not aware that eab is a forum itself.

I would very much assume that it's caused by a bug in the html code. An html online validator might help, like this https://validator.w3.org/check?uri=h...Inline&group=0
that thing finds about 200 "errors", but most of them are probably unimportant. I've tried 3-4 different browsers, and none of them has problems about displaying the page body (and search engines are probably even more lax about perfect formatting).

There are other webpages "Powered by vBulletin® Version 3.8.11", and google can display correct descriptions for those pages. So the problem must be some eab-specific customization.

I would try to remove the Hall of Light thing, or try to remove the google-specific scripts.
Old 17 April 2024, 01:14   #40
cloverskull
Registered User
 
Join Date: Sep 2018
Location: California
Posts: 341
As an experiment, would it be possible to remove robots.txt for an hour or so and manually trigger Google to reindex EAB? That would be diagnostic in the sense that we'd understand if robots.txt is the issue.
 


Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2024, vBulletin Solutions Inc.