CiviCRM Community Forums (archive)


This forum was archived on 25 November 2017. Learn more.




Author Topic: stop search engine indexing of personal information  (Read 3171 times)

karunadave

  • I post occasionally
  • **
  • Posts: 50
  • Karma: 0
    • Karuna Dev
  • CiviCRM version: 4.4.4
  • CMS version: Drupal 6.30 or Drupal 7.26 Drush 6.2.0
  • MySQL version: 5.5.35-cll - MySQL Community Server (GPL)
  • PHP version: 5.3.21
stop search engine indexing of personal information
November 09, 2008, 12:31:00 pm
Hi

I have had some members of my community complain about finding their personal information in Google because of our CiviCRM site.  I don't much like my personal information exposed in such a way either.

To address this, would you consider adding a note about robots.txt to the installation instructions in the online documentation?

Here is a sample from a Drupal install.

# Paths (clean URLs)
Disallow: /user/
Disallow: /civicrm/
# Paths (no clean URLs)
Disallow: /?q=user/
Disallow: /?q=civicrm/


Code:
# $Id: robots.txt,v 1.7.2.1 2007/03/23 18:57:07 drumm Exp $
#
# robots.txt
#
# This file is to prevent the crawling and indexing of certain parts
# of your site by web crawlers and spiders run by sites like Yahoo!
# and Google. By telling these "robots" where not to go on your site,
# you save bandwidth and server resources.
#
# This file will be ignored unless it is at the root of your host:
# Used:    http://example.com/robots.txt
# Ignored: http://example.com/site/robots.txt
#
# For more information about the robots.txt standard, see:
# http://www.robotstxt.org/wc/robots.html
#
# For syntax checking, see:
# http://www.sxw.org.uk/computing/robots/check.html

User-agent: *
Crawl-delay: 10
# Directories
Disallow: /database/
Disallow: /includes/
Disallow: /misc/
Disallow: /modules/
Disallow: /sites/
Disallow: /themes/
Disallow: /scripts/
Disallow: /updates/
Disallow: /profiles/
# Files
Disallow: /xmlrpc.php
Disallow: /cron.php
Disallow: /update.php
Disallow: /install.php
Disallow: /INSTALL.txt
Disallow: /INSTALL.mysql.txt
Disallow: /INSTALL.pgsql.txt
Disallow: /CHANGELOG.txt
Disallow: /MAINTAINERS.txt
Disallow: /LICENSE.txt
Disallow: /UPGRADE.txt
# Paths (clean URLs)
Disallow: /admin/
Disallow: /aggregator/
Disallow: /comment/reply/
Disallow: /contact/
Disallow: /logout/
Disallow: /node/add/
Disallow: /search/
Disallow: /user/register/
Disallow: /user/password/
Disallow: /user/login/
Disallow: /user/
Disallow: /civicrm/
# Paths (no clean URLs)
Disallow: /?q=admin/
Disallow: /?q=aggregator/
Disallow: /?q=comment/reply/
Disallow: /?q=contact/
Disallow: /?q=logout/
Disallow: /?q=node/add/
Disallow: /?q=search/
Disallow: /?q=user/password/
Disallow: /?q=user/register/
Disallow: /?q=user/login/
Disallow: /?q=user/
Disallow: /?q=civicrm/
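
A quick way to confirm that rules like these block the intended paths is to run them through a robots.txt parser. The following is a minimal sketch using Python's standard-library urllib.robotparser; the helper name is_crawlable, the example.com host, and the shortened rule excerpt are assumptions for illustration, not part of the original post. It only shows how a well-behaved crawler would interpret the rules; it does not remove pages that are already indexed.

```python
from urllib.robotparser import RobotFileParser

# Excerpt of the rules posted above; a real check would load the full file.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /user/
Disallow: /civicrm/
"""


def is_crawlable(path, robots_txt=ROBOTS_TXT, agent="*"):
    """Return True if `agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # example.com is a placeholder host; only the path matters for matching.
    return parser.can_fetch(agent, "http://example.com" + path)


if __name__ == "__main__":
    for path in ("/civicrm/profile", "/user/1", "/node/1"):
        print(path, "crawlable" if is_crawlable(path) else "blocked")
```

Swapping the excerpt for the contents of the site's real robots.txt gives the same check against the live rules.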


Donald Lobo

  • Administrator
  • I’m (like) Lobo ;)
  • *****
  • Posts: 15963
  • Karma: 470
    • CiviCRM site
  • CiviCRM version: 4.2+
  • CMS version: Drupal 7, Joomla 2.5+
  • MySQL version: 5.5.x
  • PHP version: 5.4.x
Re: stop search engine indexing of personal information
November 09, 2008, 03:27:11 pm

The online documentation is community editable. You might want to add this information as an entry in the online FAQ and link to this page from the installation docs.

lobo
A new CiviCRM Q&A resource needs YOUR help to get started. Visit our StackExchange proposed site, sign up and vote on 5 questions

loriguidos

  • I’m new here
  • *
  • Posts: 4
  • Karma: 0
  • CiviCRM version: 3.4.7
  • CMS version: Drupal
  • MySQL version: unknown
  • PHP version: unknown
Re: stop search engine indexing of personal information
June 22, 2012, 06:36:04 pm
Hi,

I am a newbie and am trying to learn how to keep my users' content away from all search engines.

Unfortunately, it was not clear to me that enabling the profile was going to expose all my users'
personal information to Google's crawl.  I have changed the profile settings back, but the
search engines have already published all my users' information...

I have been told by Attracta, and have seen in other posts online, that robots.txt is not the place to limit
the Google crawl, and that this information needs to be added to the pages themselves in Drupal and CiviCRM.

Here is the text that I have been told to add:
<meta name="robots" content="noindex, nofollow, noarchive"/>

Problem is I don't know how to add this to my CiviCRM pages, which is where this
content had been published from.

Can someone please point me to a page that will walk me through this process?

My users are irate with me and I am ticked off that there wasn't a big warning screen
that comes up to explain that Google is going to expose all my users' personal information
to the world...
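
The tag quoted above only helps if a crawler can actually fetch the page and find the tag in its HTML. As a minimal sketch (not a way to add the tag in Drupal or CiviCRM), the following Python uses the standard-library html.parser to report which robots directives a page's HTML carries. The names RobotsMetaParser and robots_directives are assumptions for illustration, and the HTML is expected to be supplied as a string, for example from a saved copy of the page.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also reach this method.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives.update(
                part.strip().lower()
                for part in content.split(",")
                if part.strip()
            )


def robots_directives(html):
    """Return the set of robots directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    sample = (
        "<html><head>"
        '<meta name="robots" content="noindex, nofollow, noarchive"/>'
        "</head><body></body></html>"
    )
    print(sorted(robots_directives(sample)))
```

If the result does not include noindex for a page that should stay out of search results, the tag is not being emitted on that page.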


Donald Lobo

  • Administrator
  • I’m (like) Lobo ;)
  • *****
  • Posts: 15963
  • Karma: 470
    • CiviCRM site
  • CiviCRM version: 4.2+
  • CMS version: Drupal 7, Joomla 2.5+
  • MySQL version: 5.5.x
  • PHP version: 5.4.x
Re: stop search engine indexing of personal information
June 22, 2012, 07:23:31 pm

The CiviCRM book has a fair amount of information on profiles and permissions: http://book.civicrm.org/

You might want to ensure that your profiles are not exposed to the anonymous user, and hence are not crawlable by Google and other search engines.

Alternatively, you can check the syntax of the robots.txt file and see how to exclude all URLs with civicrm in them.

lobo

loriguidos

  • I’m new here
  • *
  • Posts: 4
  • Karma: 0
  • CiviCRM version: 3.4.7
  • CMS version: Drupal
  • MySQL version: unknown
  • PHP version: unknown
Re: stop search engine indexing of personal information
June 29, 2012, 08:26:23 am
Donald and everyone else,

I fixed the profile.

It is no longer revealing the information.

But the Google crawler still holds some of this content, since it
"peeled off" some of the content from the previously exposed
profile.

The results of that crawl are still sitting in Google's search results.

I am trying to clear google's cache or search engine results.

I have changed my robots.txt file to exclude the URLs, but this
only helps with future crawls, not past ones.

Can anyone help me figure out how to add this tag
<meta name="robots" content="noindex, nofollow, noarchive"/>
to the index file of my Drupal / CiviCRM site?

According to a number of sources, once I add this tag, I
can use Google Webmaster Tools to re-crawl the site and this should
clear the older content.

All I see on Drupal is an index.php file not an index.htm file...

Thanks,

Lori

sagraphics

  • I post occasionally
  • **
  • Posts: 40
  • Karma: -1
  • CiviCRM version: 4.0.8
  • CMS version: Joomla 1.7
  • MySQL version: 5.1.57
  • PHP version: 5.2.17
Re: stop search engine indexing of personal information
September 17, 2012, 10:19:24 am
I have been having the same issue with indexing.  I need to allow Google to crawl organisation listings but not individual ones. Any ideas?

CiviTeacher.com

  • I live on this forum
  • *****
  • Posts: 1282
  • Karma: 118
    • CiviTeacher
  • CiviCRM version: 3.4 - 4.5
  • CMS version: Drupal 6&7, Wordpress
  • MySQL version: 5.1 - 5.5
  • PHP version: 5.2 - 5.4
Re: stop search engine indexing of personal information
September 19, 2012, 05:09:42 am
Careful with the profile visibility settings.

If you want to ask Google to remove certain indexed pages, you can do so.  Google reserves the right to ignore your request.



http://support.google.com/webmasters/bin/answer.py?hl=en&answer=164734



This forum was archived on 2017-11-26.