Add additional mod_rewrite code to detect direct client requests for dynamic URLs and externally redirect those requests to the equivalent new static URLs. A 301 (Moved Permanently) redirect is used to tell search engines to drop your old dynamic URLs and use the new static ones, and also to redirect visitors who may come back to your site using outdated dynamic-URL bookmarks.
Considering the above for a moment, one quickly realizes that both the dynamic and static URL formats must contain all the information needed to reconstruct the other format. In addition, careful selection of the 'design' of the static URLs can save a lot of trouble later, and also save a lot of CPU cycles which might otherwise be wasted with an inefficient implementation.
An earnest warning
It is not my purpose here to explain all about regular expressions and mod_rewrite; the Apache mod_rewrite documentation and many other tutorials are readily available on-line to anyone who searches for them (see also the references cited in the Apache Forum Charter and the tutorials in the Apache forum section of the WebmasterWorld Library). Trying to use mod_rewrite without studying that documentation thoroughly is an invitation to disaster. Keep in mind that mod_rewrite affects your server configuration, and that one single typo or logic error can make your site inaccessible or quickly ruin your search engine rankings. If you depend on your site's revenue for your livelihood, intense study is indicated.
That said, here's an example which should be useful for study, and might serve as a base from which you can customize your own solution.
Working example
Old dynamic URL format: /index.php?product=widget&color=blue&size=small&texture=fuzzy&maker=widgetco
New static URL format: /product/widget/blue/small/fuzzy/widgetco
Mod_rewrite code for use in a .htaccess file:
# Enable mod_rewrite, start rewrite engine
Options +FollowSymLinks
RewriteEngine on
#
# Internally rewrite search-engine-friendly static URLs to dynamic filepath and query string
RewriteRule ^product/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/?$ /index.php?product=$1&color=$2&size=$3&texture=$4&maker=$5 [L]
#
# Externally redirect client requests for old dynamic URLs to equivalent new static URLs
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\?product=([^&]+)&color=([^&]+)&size=([^&]+)&texture=([^&]+)&maker=([^\ ]+)\ HTTP/
RewriteRule ^index\.php$ http://example.com/product/%1/%2/%3/%4/%5? [R=301,L]
Note that the keyword "product" always appears in both the static and dynamic forms. This is intended to make it simple for mod_rewrite to detect requests where the above rules need to be applied. Other methods, such as testing for file-exists, are also possible, but less efficient and more prone to errors compared to this approach.
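For comparison, the file-exists approach mentioned above would look something like the sketch below. The RewriteCond variables are standard mod_rewrite, but the rule itself is only an illustration; note that it forces a filesystem check on every single request, which is why keying on a fixed prefix like "product" is cheaper:

```apache
# Sketch of the file-exists alternative: rewrite any request that does not
# correspond to a real file or directory. Checks the filesystem on EVERY
# request, so it is slower and can misfire if a real file happens to exist
# at a path you meant to rewrite.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/?$ /index.php?product=$1&color=$2&size=$3&texture=$4&maker=$5 [L]
```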
Differences between .htaccess code and httpd.conf or conf.d code
If you wish to use this code in a <directory> container in the httpd.conf or conf.d server configuration files, you will need to add a leading slash to the patterns in both RewriteRules, i.e. change "RewriteRule ^index\.php$" to "RewriteRule ^/index\.php$". Also remember that you will need to restart your server before changes in these server config files take effect.
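Applied literally to the working example, the adjusted rules would read as follows (a sketch only; in recent Apache versions the exact pattern matching in server-config versus per-directory context differs in detail, so verify against the mod_rewrite documentation for your version):

```apache
# Same two rules, adjusted per the note above for use in httpd.conf/conf.d:
# each RewriteRule pattern gains a leading slash.
RewriteRule ^/product/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/?$ /index.php?product=$1&color=$2&size=$3&texture=$4&maker=$5 [L]

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\?product=([^&]+)&color=([^&]+)&size=([^&]+)&texture=([^&]+)&maker=([^\ ]+)\ HTTP/
RewriteRule ^/index\.php$ http://example.com/product/%1/%2/%3/%4/%5? [R=301,L]
```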
How this works
Since the spider is now collecting pages containing the new static links, and all requests for old dynamic URLs are permanently redirected to the new static URLs, the new URLs will replace the old ones in search results over time.
Location, location, location
In order for the code above to work, it must be placed in the .htaccess file in the same directory as the /index.php file, or it must be placed in a <directory> container in httpd.conf or conf.d that refers to that directory. Alternatively, the code can be modified for placement in any Web-accessible directory above the /index.php directory by changing the URL-paths used in the regular-expressions patterns for RewriteCond and RewriteRule.
Regular-expressions patterns
Just one comment on the regular expressions subpatterns used in the code above. I have avoided using the very easy, very popular, and very inefficient construct "(.*)/(.*)" in the code. That's because multiple ".*" subpatterns in a regular-expressions pattern are highly ambiguous and highly inefficient.
The reason for this is twofold: first, ".*" means "match any number of any characters"; and second, ".*" is 'greedy,' meaning it will match as many characters as possible. So what happens with a pattern like "(.*)/(.*)" is that multiple matching attempts must be made before the requested URL can match the pattern or be rejected, with the number of attempts equal to (the number of characters between "/" and the end of the requested URL, plus two) multiplied by (the number of "(.*)" subpatterns, minus one). It is easy to make a multiple-"(.*)" pattern that requires dozens or even hundreds of passes to match or reject a particular requested URL.
Let's take a short example. Bearing in mind that back-reference $1 contains the characters matched into the first parenthesized sub-pattern, while $2 contains those matched into the second sub-pattern:

Requested URL: http://example.com/abc/def
Local URL-path: abc/def
Rule pattern: ^(.*)/(.*)$

Pass# | $1 value | $2 value | Result
  1   | abc/def  |    -     | no match
  2   | abc/de   |    f     | no match
  3   | abc/d    |    ef    | no match
  4   | abc/     |   def    | no match
  5   | abc      |   def    | Match
I'll hazard a guess that many many sites are driven to unnecessary server upgrades every year by this one error alone.
Instead, I used the unambiguous constructs "([^/]+)", "([^&]+)", and "([^\ ]+)". Roughly translated, these mean "match one or more characters not equal to a slash," "match one or more characters not equal to an ampersand," and "match one or more characters not equal to a space," respectively. The effect is that each of those subpatterns will 'consume' one or more characters from the requested URL, up to the next occurrence of the excluded character, thereby allowing the regex parser to match the requested URL to the pattern in one single left-to-right pass.
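The ambiguity point is easy to see directly. Here is a small Python sketch (Python's regex engine, not Apache's, but greediness and negated character classes behave the same way) contrasting the two styles on a URL-path with an extra segment:

```python
import re

path = "product/widget/blue"

# Greedy: the first ".*" grabs as much as it can, so the split point is
# ambiguous and is only found by backtracking from the right.
greedy = re.match(r"^(.*)/(.*)$", path)
print(greedy.groups())    # ('product/widget', 'blue')

# Negated class: each subpattern stops at the first "/", so there is exactly
# one way to match -- and a three-segment path simply cannot fit two
# subpatterns, so the match is rejected immediately.
anchored = re.match(r"^([^/]+)/([^/]+)$", path)
print(anchored)           # None

# With exactly two segments, both patterns agree on the result:
print(re.match(r"^([^/]+)/([^/]+)$", "abc/def").groups())   # ('abc', 'def')
```

The greedy pattern still "works" on well-formed input; the cost is the hidden backtracking (and the surprising split when extra slashes appear), which the negated-class form avoids entirely.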
Common problems
A common problem encountered when implementing static-to-dynamic URL rewrites is that relative links to images, included CSS files, and external JavaScripts on your pages will become broken. The key is to remember that it is the client (e.g. the browser) that resolves relative links. For example, if you are rewriting the URL /product/widget/blue/fuzzy/widgetco to your script, the browser will see a page called "widgetco", and see a relative link on that page as being relative to the 'virtual' directory /product/widget/blue/fuzzy/. The two easiest solutions are to use server-relative or absolute (canonical) links, or to add additional code to rewrite image, CSS, and external JS URLs to the correct location. An example would be to use the server-relative link <img src="/logo.gif"> to replace the page-relative link <img src="logo.gif">.
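As an illustration of the second solution, a rule along these lines could send asset requests that the browser resolved against a 'virtual' product directory back to the document root. This is a sketch only: the extension list and the target path are assumptions to adapt to your site, and the rule must be placed above the product RewriteRule so it is applied first:

```apache
# Sketch: map requests for assets under the 'virtual' /product/... paths
# back to the real files in the document root. Adjust extensions and the
# substitution path to match your site's layout.
RewriteRule ^product/.+/([^/]+\.(?:gif|jpe?g|png|css|js))$ /$1 [L]
```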
Avoiding testing problems
For both .htaccess and server config file code, remember to flush your browser cache before testing any changes; otherwise, your browser will likely serve any previously-requested pages from its cache instead of fetching them from your server. Obviously, in that case, no code on your server can have any effect on the transaction.
Read first, then write and test
I hope this post is helpful. If you still have problems after studying the mod_rewrite documentation and regular expressions tutorials, and writing and testing your own code, feel free to post relevant entries from your server error log and ask specific questions in the Apache Server forum. Please take a few minutes to read the WebmasterWorld Terms of Service and the Apache Forum Charter before posting (Thanks!).
Jim