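This post was last edited by hardrock on 2014-12-15 21:17
The robots.txt file must be placed in the site's root directory. The most basic check is to visit your domain followed directly by robots.txt; if the file loads, it is in the right place.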
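The check described above can be sketched in Python. This is only a rough illustration: the helper names `robots_url` and `robots_reachable` are made up for this example, and the sketch assumes plain HTTP when no scheme is given.

```python
from urllib.error import URLError
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_url(site: str) -> str:
    """Build the root-level robots.txt URL for a domain or any page URL on it."""
    if "://" not in site:
        # Assumption for this sketch: default to http when no scheme is given.
        site = "http://" + site
    parts = urlsplit(site)
    # Keep scheme and host, drop any path/query, and point at /robots.txt.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def robots_reachable(site: str, timeout: float = 10.0) -> bool:
    """Return True if the robots.txt URL answers with HTTP 200 (needs network access)."""
    try:
        with urlopen(robots_url(site), timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, OSError):
        # DNS failure, timeout, 404 and similar all count as "not reachable".
        return False
```

Calling `robots_reachable("A.com")` does the same thing as opening A.com/robots.txt in a browser; a `False` result usually means the file is not in the directory the domain actually serves from.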
I found this code:
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-content/cache/
Disallow: /wp-content/languages/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
Disallow: /wp-content/upgrade/
Disallow: /wp-includes/
Disallow: /comments/
Disallow: /category/
Disallow: /tag/
Disallow: /page/
Disallow: /feed/
Disallow: /author/
Disallow: /trackback/
Disallow: /2010/
Disallow: /2011/
Disallow: /2012/
Disallow: /2013/
Disallow: /*/feed/
Disallow: /*/trackback/
Disallow: /*?
Disallow: /*/*?
Disallow: /*/*/*?
Disallow: /*.php$
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$

# Google Image
User-agent: Googlebot-Image
Disallow:
Allow: /

# Google AdSense
User-agent: Mediapartners-Google*
Disallow:
Allow: /

# digg mirror
User-agent: duggmirror
Disallow: /

# Alexa archiver
User-agent: ia_archiver
Disallow: /

Sitemap: http://www.yuhua.org/sitemap.xml
Sitemap: http://www.yuhua.org/sitemap_baidu.xml
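The problem is that this code was written for a Chinese site targeting Baidu, while I run an English site that needs to work with Google. How should the code above be changed to suit an English site? I don't know code, so I'm asking someone to take a look.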
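My main question is about lines 31-47 of the code (from "# Google Image" through the ia_archiver block). Since this is an English site, shouldn't those lines be set to allow? Is blocking crawling only something a Chinese site would do?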
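One way to see what those blocks do is to feed a shortened copy of the file to Python's standard `urllib.robotparser` and ask which crawler may fetch which URL. This is only a sketch under a few assumptions: `example.com` and the test URLs are placeholders, the rule set is trimmed, and the wildcard lines (`/*?`, `/*.php$`) and the `Mediapartners-Google*` group are left out because, as far as I know, the standard-library parser only does plain prefix and name matching and may not treat them the way Google does.

```python
from urllib.robotparser import RobotFileParser

# Trimmed copy of the rules quoted above (wildcard lines omitted on purpose).
RULES = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /feed/

# Google Image
User-agent: Googlebot-Image
Disallow:
Allow: /

# digg mirror
User-agent: duggmirror
Disallow: /

# Alexa archiver
User-agent: ia_archiver
Disallow: /
"""

rp = RobotFileParser()
rp.parse(RULES.splitlines())  # parse from memory, no network access needed

# Placeholder URLs on a hypothetical site.
CHECKS = [
    ("Googlebot", "http://example.com/hello-world/"),
    ("Googlebot", "http://example.com/wp-admin/"),
    ("Googlebot-Image", "http://example.com/wp-admin/"),
    ("duggmirror", "http://example.com/hello-world/"),
    ("ia_archiver", "http://example.com/"),
]

for agent, url in CHECKS:
    verdict = "allowed" if rp.can_fetch(agent, url) else "blocked"
    print(f"{agent:16} {url:36} {verdict}")
```

In this sketch each group applies to the crawler it names, regardless of the site's language: Googlebot has no group of its own and falls back to the `User-agent: *` rules, Googlebot-Image is allowed everywhere, and duggmirror and ia_archiver are blocked everywhere. Whether to keep the last two blocks is a choice about those particular crawlers, not about Chinese versus English sites.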
Official WordPress robots.txt writing rules: http://www.guance.com/387.html
http://www.advertcn.com/forum.ph ... 75&highlight=robots
http://www.advertcn.com/forum.ph ... 81&highlight=robots
https://support.google.com/webmasters/answer/35769?hl=zh-Hans
https://developers.google.com/we ... ndex/docs/faq?csw=1
https://support.google.com/webmasters/answer/6062608?rd=1
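Location of robots.txt and the .htaccess file: http://www.thegrouplet.com/thread-115989-1-1.html
Building a site on the primary domain, A.com: create directory A, copy the WordPress files to /public_html/A/, and put robots.txt in /public_html/. The file is then reachable at A.com/robots.txt, and .htaccess lives at public_html/.htaccess.
Addon domains: bind the domain B.com to directory B (cPanel creates directory B automatically), copy the WordPress files to /public_html/B/, and put robots.txt in /public_html/B/. The file is then reachable at B.com/robots.txt, and .htaccess lives at public_html/B/.htaccess.
The .htaccess file must always be in the same directory as index.php.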
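The layout above boils down to "robots.txt and .htaccess sit in the directory each domain serves from". A small sketch of that mapping, assuming the document roots are exactly the ones described (the `DOCUMENT_ROOTS` table and the `site_files` helper are invented for illustration, not something cPanel provides):

```python
from pathlib import PurePosixPath

# Assumed document roots, taken from the two setups described above.
DOCUMENT_ROOTS = {
    "A.com": "/public_html",    # primary domain serves from public_html itself
    "B.com": "/public_html/B",  # addon domain serves from its own directory
}


def site_files(domain: str) -> dict:
    """Return where robots.txt and .htaccess should live for a domain."""
    root = PurePosixPath(DOCUMENT_ROOTS[domain])
    return {
        "robots.txt": str(root / "robots.txt"),
        ".htaccess": str(root / ".htaccess"),
    }


for name in DOCUMENT_ROOTS:
    print(name, site_files(name))
```

For A.com the WordPress files themselves sit one level down in /public_html/A/, but the two files crawlers and the web server look for stay in the directory the domain points to.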